Intern: Sanjana Agarwal (Indiana University Bloomington)
The Equilo internship revolved around the project “Reimagine the future! How can we measure the influence of relevant themes on the world?” Equilo is an app that provides a detailed snapshot of gender equality and social inclusion (GESI) in low- and middle-income countries (LMIC) globally. The data tells a story about the specific identity-based constraints and opportunities that individuals in a given country face, using quantitative parity and empowerment scores contextualized with country- and theme-specific qualitative information. It focuses on solution-based analysis, which is something I deeply value. As a data analyst intern, my job was to examine themes such as climate change, unpaid care work, poverty, and power and agency, and to assess how closely each is influenced by gender-based violence. Using statistical modelling techniques, I produced multivariate models, with predictive power ranging from good to weak, connecting gender-based violence with these themes. This knowledge highlights key areas that could be prioritized to advance gender equality.
Organization: Ameren Innovation Center
Intern: Ashley Alfred (University of Texas at Arlington)
Ameren is a Fortune 500 energy company that strives to promote energy efficiency and ensure safety in the lives and properties of its customers. This summer I worked on two projects dedicated to doing just that: the Missouri Energy Efficiency (MOEE) project and the Gas Leak Detection Algorithm (GLeNDA) project. The idea of the MOEE project is to market energy efficiency programs to a more responsive target audience and thereby reduce marketing spend. The goal for this summer was to create a layout detailing the process of finding the target market so that the process could be automated. The GLeNDA project uses gas usage data to determine whether a customer has a slow or fast gas leak. The hope is to identify potentially dangerous gas meters before a customer notices a problem and to send technicians out to assess and repair them. The goal this summer was to produce algorithms that detect spikes in gas usage and determine the leak status of a meter. I worked with a team on each project, and we produced impactful results. For MOEE, we successfully created a detailed layout that was instrumental in building an automated process. For GLeNDA, we successfully created three algorithms that together have high accuracy and precision in determining whether a meter is leaking.
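The specific spike-detection algorithms are not described here, but the core idea can be illustrated with a minimal rolling-statistics sketch. The function name, window size, and threshold below are illustrative assumptions, not Ameren's implementation:

```python
from statistics import mean, stdev

def detect_spikes(usage, window=7, threshold=3.0):
    """Flag readings that deviate sharply from their trailing window.

    A reading is a spike if it exceeds the trailing mean by more than
    `threshold` standard deviations. `usage` is a list of daily gas
    readings; the first `window` readings are never flagged.
    """
    spikes = []
    for i in range(window, len(usage)):
        recent = usage[i - window:i]
        mu, sigma = mean(recent), stdev(recent)
        if sigma > 0 and usage[i] > mu + threshold * sigma:
            spikes.append(i)
    return spikes

# A steady meter with one sudden jump at index 10.
readings = [5.0, 5.2, 4.9, 5.1, 5.0, 5.3, 4.8, 5.1, 5.0, 5.2, 12.0, 5.1]
print(detect_spikes(readings))  # → [10]
```

A production leak detector would of course need to handle seasonality, missing reads, and slow leaks (gradual drift rather than spikes), which a single rolling z-score cannot capture.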
Organization: The Bee Corp
Intern: Vinicius Ambrosi (Indiana University)
The yield of almond orchards depends significantly on how effectively they are pollinated. One important measure in that regard is the strength of the beehives used during pollination season. Thus, it is essential for the almond grower and the beekeeper renting out the hives to measure how many bees there are in a given beehive. Usually, this grading is done by manual inspection, which can be time-consuming and invasive, so The Bee Corp developed an app that uses infrared pictures to grade hives. My project focused on improving the part of the app responsible for identifying beehives in an image. I used machine learning techniques to identify pixels corresponding to hives, and I also suggested and evaluated possible solutions to improve segmentation in low-light images.
Organization: Procter & Gamble Smart Lab
Intern: Ahmet Özkan Demir (University of Illinois Chicago)
In the consumer research conducted by the Smart Lab, one source of data is images of boxes containing several full and empty rolls of paper towels. The goal of this project was to build a model that counts full and empty rolls in an image and to integrate it into the system that analyzes all the images coming from the study. As a result of the project, we built an object detector with two classes (full and empty rolls) that gives the pixel-wise locations of the detected objects, called masks, together with their class scores. We also inspected the prediction accuracy of the model in different cases and gave guidance on how the images should be taken to increase the model's accuracy.
Intern: Kelvin Guilbault (Indiana University)
Within my role on the Advanced Analytics and Artificial Intelligence (AAI) team, I was responsible for two projects. My main project involved creating a tool to select an optimal subset of customers for an engine field test in order to maximize the visibility of a variety of "new, unique, and difficult" engine faults. Given an input data set of customers operating commercial trucking fleets with at least 50 vehicles with Cummins equipment, the probability that a given engine fault would occur during a field test was estimated from past warranty claims and telematics data, giving each fleet a unique signature of probabilities across the various types of engine faults. Using this input data, I developed a linear optimization algorithm to select a subset of n customer fleets such that a field test provides as broad a coverage of the various engine faults as possible. For my secondary project, I worked with a team whose goal was to forecast Cummins’s future market share among various companies operating commercial trucking fleets, in order to generate leads for the Cummins sales division. As part of this effort, I was responsible for cleaning a data set of vehicle registrations and identifying a set of features to use as predictors of next year’s market share.
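The actual tool was a linear optimization over warranty and telematics data; as a hedged illustration of the underlying selection problem, the sketch below uses a simple greedy heuristic instead of a linear program. The fleet names and probabilities are invented:

```python
def select_fleets(fault_probs, n):
    """Greedily pick n fleets to maximize overall fault coverage.

    fault_probs maps fleet name -> list of probabilities that each
    fault type would surface during the field test. The coverage of a
    fault is 1 - prod(1 - p) over the chosen fleets; each greedy step
    adds the fleet that most increases the summed coverage.
    """
    chosen = []
    n_faults = len(next(iter(fault_probs.values())))
    miss = [1.0] * n_faults  # probability each fault is still unseen

    for _ in range(n):
        best, best_gain = None, -1.0
        for fleet, probs in fault_probs.items():
            if fleet in chosen:
                continue
            # Expected reduction in "missed fault" probability.
            gain = sum(m * p for m, p in zip(miss, probs))
            if gain > best_gain:
                best, best_gain = fleet, gain
        chosen.append(best)
        miss = [m * (1 - p) for m, p in zip(miss, fault_probs[best])]
    return chosen

fleets = {
    "A": [0.9, 0.0, 0.0],  # covers only fault 1, but strongly
    "B": [0.8, 0.1, 0.1],
    "C": [0.0, 0.7, 0.7],  # covers faults 2 and 3
}
print(select_fleets(fleets, 2))  # → ['C', 'A']
```

The greedy step first picks C (largest total gain across faults 2 and 3), then A to cover the remaining fault 1; an exact linear or integer program, as in the project itself, can certify optimality where a greedy heuristic cannot.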
Intern: Margaret Hoeller (University of Illinois Chicago)
The path from hypothetical drug to FDA-approved therapy is a long road full of costly tests and trials. To shorten this process, modelling and simulation are used to predict the effects of theoretical drugs. For typical drugs, which react only with their target to produce a therapeutic effect, the models used to simulate the mechanism of action are relatively straightforward and well understood. However, a new class of drugs known as "degraders" acts by binding to both a target protein and a ligase, which induces degradation of the target. The underlying ternary complex formation and degradation kinetics increase the complexity of mechanistic models. To better understand the relationship between the individual system- and drug-dependent parameters and the efficacy of a degrader, we created five simulation tools and investigated qualitative and quantitative relationships in these mechanistic model systems.
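The five simulation tools themselves are not reproduced here. As a rough illustration of the kind of kinetics involved, the sketch below forward-Euler-integrates a drastically simplified one-state model in which ternary-complex-mediated degradation is lumped into a single first-order rate; all names and rate constants are invented for illustration:

```python
def simulate_degrader(k_syn=1.0, k_deg=0.1, k_ternary=0.5,
                      t_end=50.0, dt=0.01):
    """Forward-Euler simulation of a toy target-degradation model.

        dT/dt = k_syn - (k_deg + k_ternary) * T

    k_syn is the target synthesis rate, k_deg its baseline turnover,
    and k_ternary lumps degrader-induced (ternary-complex-mediated)
    degradation into one first-order rate -- a gross simplification
    of the full binding kinetics described above.
    """
    target = k_syn / k_deg  # start at the drug-free steady state
    steps = int(t_end / dt)
    for _ in range(steps):
        target += dt * (k_syn - (k_deg + k_ternary) * target)
    return target

# With the degrader on, the steady state drops from k_syn/k_deg = 10
# toward k_syn/(k_deg + k_ternary) = 1/0.6 ≈ 1.67.
print(round(simulate_degrader(), 2))  # → 1.67
```

A realistic degrader model would track the drug, target, ligase, binary, and ternary complexes as separate species with cooperativity in the ternary binding step, which is exactly what makes the mechanistic models mentioned above harder to analyze.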
Intern: Phuong (Sophie) Le (University of Illinois Urbana-Champaign)
Over the last few years, deep convolutional neural networks have achieved success in many medical imaging tasks. In this project, we explore the possibility of using the state-of-the-art U-Net model with transfer learning in a segmentation task that aims to identify multiple types of defects on skin tissues. In addition, we propose a morphology approach, which employs tools from topology and image pre-processing to detect regions of interest on these skin tissues. This method can be incorporated into our framework to speed up inference as well as improve the model’s performance in cases where we lack annotations of defects. We show that the combination of U-Net, transfer learning, and techniques such as customized data augmentation can attain good segmentation results on the Residual Epi defect even under limited training data constraints. Finally, we demonstrate the effectiveness of our morphology approach in finding regions of the tissues containing harder-to-detect defects such as stretch marks, holes, or scars.
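The morphology pipeline itself is not reproduced here. As a minimal stand-in for the kind of primitive such a region-of-interest step builds on, the sketch below labels 4-connected foreground regions in a binary mask; the mask and function name are illustrative, not the project's code:

```python
def find_regions(mask):
    """Label 4-connected foreground regions in a binary grid.

    Returns a list of pixel-coordinate sets, one per connected
    region -- the kind of candidate regions a morphology step could
    hand to the segmentation model for closer inspection.
    """
    rows, cols = len(mask), len(mask[0])
    seen, regions = set(), []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and (r, c) not in seen:
                # Flood fill from this seed pixel.
                stack, region = [(r, c)], set()
                seen.add((r, c))
                while stack:
                    y, x = stack.pop()
                    region.add((y, x))
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                regions.append(region)
    return regions

# Two separate candidate "defects" in a 4x5 binary mask.
mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
print(sorted(len(reg) for reg in find_regions(mask)))  # → [3, 3]
```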
Organization: Schnucks Markets
Intern: Jiaqi Li (Washington University in St. Louis)
Understanding customers better is a high-level goal of most businesses, especially retail companies. My project at Schnucks aimed to design more tailored campaigns for customers with different personal backgrounds, such as income, household size, and cooking habits, attracting them to join the campaigns and redeem more coupons, and therefore to increase the company's profit. My job consisted of three main parts. The first and most critical was to understand both the data set the company already used and a new third-party data set that would become the focus. Some basic statistical analysis was done in this part to get an intuitive sense of the potentially effective variables in the new data. After this data understanding, I modified the random forest model used in the AI Engine and improved its prediction accuracy by adding the new columns. Finally, the results from the machine learning model were interpreted from a business perspective, and the whole project was delivered so that better campaigns could be released accordingly.
Organization: Schnucks Markets
Intern: Cezareo Rodriguez (Washington University in St. Louis)
Forecasting is a classic problem in the grocery industry in which a company seeks to model the demand for the products it sells. This is typically done at the product or store level to ensure that a product or set of products is stocked properly to meet customer demand. Modern machine learning frameworks allow us to model with many more features than classical methods and may allow for personalized customer demand models. For my summer internship project with INMAS and Schnucks, I was tasked with developing a customer-level forecasting machine learning framework. Given a customer’s purchase history and other features, my goal was to predict when the customer would return to Schnucks and which items would be purchased during that visit. The framework developed this summer is foundational work that the company can further develop into a complete deployable machine learning architecture.
Organization: Schneider Transportation
Intern: Souktik Roy (University of Illinois Urbana-Champaign)
The bulk division of Schneider National requested a new pricing system. During this internship we began developing one, writing code that uses machine learning techniques to predict the success of bids at given prices.
Organization: Ameren Innovation Center
Intern: Sarah Simpson (University of Illinois Urbana-Champaign)
Ameren holds the role of providing reliable energy to its many customers, with which comes the responsibility of minimizing the risk of equipment malfunction or other incidents that could put customers in danger. This summer I worked on two projects associated with this responsibility. The goal of the Consequence of Failure (COF) project was to assign each area values quantifying the potential financial cost and risk of injury if a leak were to occur there. This information is valuable for prioritizing maintenance checks to prevent the leaks that would be most costly. The task of the Gas Leak Detection Algorithm (GLeNDA) project was to create an algorithm that detects changes in gas meter reading data indicative of leaks and sends that information to technicians, who can then resolve the issue. For each of these projects, I worked with a team of other interns to apply machine learning algorithms and statistical methods to find a solution to the problem at hand.
Organization: Innovations Foresight
Intern: Tyler Williams (Washington University in St. Louis)
Neural networks were developed as a way to simulate the processing and adaptive nature of the brain. These networks constitute just a portion of the machine learning methods that have applications in data analysis and prediction. Several machine learning software libraries (TensorFlow, PyTorch, etc.) exist to create and train such networks. Innovations Foresight already leverages many of these machine learning techniques in their telescopes and space products to process large databases. The aim of our project is to use TensorFlow to build upon and enhance these techniques to improve efficiency of their products.