Department of Statistics Event Calendar

Undergraduate Research Experience in Statistics (URES) Symposium

Event Type
Conference/Workshop
Sponsor
Department of Statistics
Location
3269 & 4269 Beckman Institute
Date
May 2, 2023   11:00 am - 2:00 pm  

11:00 AM - 12:00 PM - 3269-3rd Floor Tower Room & 4269-4th Floor Tower Room

12:00 PM - 1:00 PM - Lunch in the Beckman Cafe

1:00 PM - 2:00 PM - 3269-3rd Floor Tower Room

11:00 - 11:10

Welcome

Beckman Institute, Room 3269

11:15 - 12:00

Parallel Sessions

 

Chair: Professor Lelys Bravo De Guenni

Beckman Institute, Room 3269

11:15 - 11:30

Probabilistic changes of extreme rainfall projections under different climate change scenarios

Junan Jiang, Alice Cao

11:30 - 11:45

Quantifying sources of uncertainty in future precipitation projections under climate change for Champaign County

Hye Rim Ahn

11:45 - 12:00

Assessing the impact of environmental factors on malaria infections in a Brazilian location

Mingrui Xu

Chair: Professor Jingbo Liu

Beckman Institute, Room 4269

11:15 - 11:30

Exploring college students' understanding of randomness

Zean Li, Wenqi Zeng

11:30 - 11:45

A comparative study of differential abundance tests in microbiome compositional data analysis

Heqi Yin

11:45 - 12:00

Nowcasting lightning strikes through convolutional neural networks

Austin Shwatal

1:00 - 2:00

Parallel Sessions

 

Chair: Professor Dan Eck

Beckman Institute, Room 3269

 

1:00 - 1:15

Simulating the 2023 MLB season 

Jack Banks, Michael Escobedo

 

1:15 - 1:30

Sector-specific stock price forecasting: A comparative analysis of time series models for S&P 500 industries

Mingli Xu, Harsh Patel

 

1:30 - 1:45

Nowcasting lightning strikes through generalized linear mixed model and binary hurdle model

Spencer Bauer

 


Probabilistic changes of extreme rainfall projections under different climate change scenarios

Junan Jiang, Alice Cao

Mentor: Professor Lelys Bravo De Guenni

Given the urgency of the effects of climate change, the goal of our project is to analyze the possible impacts of greenhouse gases on extreme rainfall in Urbana-Champaign, Illinois under two different greenhouse gas scenarios, with historical rainfall conditions used for comparison. To assess future precipitation projections, we used data from 33 Global Climate Models (GCMs) for the years 2070 to 2099 under two greenhouse gas scenarios (Representative Concentration Pathways), RCP4.5 and RCP8.5, as well as model data from a historical reference period from 1950 to 2005. The data were analyzed by fitting a Generalized Extreme Value (GEV) distribution to annual monthly maxima, and model parameters were estimated by Maximum Likelihood (ML). The resulting estimates were used to assess differences in the probability of extreme rainfall in Urbana-Champaign under potential climate change conditions.
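As a rough illustration of how fitted GEV parameters translate into probabilities of extreme rainfall, the sketch below evaluates the GEV distribution function and return levels in plain Python. The function names and all parameter values are hypothetical and chosen only for illustration; they are not results from the project, and the fitting step itself is not shown.

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function P(X <= x)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel limit as xi -> 0
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def gev_return_level(return_period, mu, sigma, xi):
    """Level exceeded with probability 1/return_period in any one block."""
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))

# Hypothetical parameter sets (location, scale, shape), NOT fitted values.
historical = dict(mu=50.0, sigma=12.0, xi=0.1)
future = dict(mu=55.0, sigma=15.0, xi=0.1)

# 20-block return level under the historical parameters, and the chance
# of exceeding that same level under the hypothetical future parameters.
level_20 = gev_return_level(20, **historical)
p_future = 1.0 - gev_cdf(level_20, **future)
print(f"historical 20-block level: {level_20:.1f}")
print(f"exceedance probability under future parameters: {p_future:.3f}")
```

Comparing the exceedance probability of a fixed historical return level across scenarios is one simple way to express a change in extreme-rainfall risk; other summaries are equally possible.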

 

Quantifying sources of uncertainty in future precipitation projections under climate change for Champaign County

Hye Rim Ahn

Mentor: Professor Lelys Bravo De Guenni

The NASA Earth Exchange Downscaled Climate Projections (NEX-DCP30) data are used to evaluate future projections of precipitation for the period 2069-2099 from 31 different General Circulation Models (GCMs), under four different Representative Concentration Pathways (RCPs) depicting seasonal future greenhouse gas concentration trajectories. The objective of the analysis is to quantify how much uncertainty in the GCMs and RCPs contributes to the estimated change in future precipitation relative to a historical reference period (1950-2005), using linear mixed-effects models. The different climate projections produced by the 31 GCMs account for internal climate variability under the four RCP scenarios. Further analysis of the variance contributions of the random and fixed effects in the linear mixed-effects models was carried out and demonstrated through calculation.

 

Assessing the impact of environmental factors on malaria infections in a Brazilian location

Mingrui Xu

Mentor: Professor Lelys Bravo De Guenni

The goal of this project is to investigate the relationship between environmental factors and malaria infections, as well as several species of malaria-carrying mosquitoes, in a specific location in Brazil. To achieve this, we calculated cross-correlations between environmental and mosquito variables to measure their association at different time lags. We used generalized linear models (GLMs) to predict the number of mosquitoes based on lagged versions of the environmental variables. We also considered mosquito abundance for different species as a predictor variable in a GLM to predict the number of malaria cases. Quasi-Poisson and negative binomial families were used to account for overdispersion, and we analyzed residuals and goodness of fit to evaluate model performance. Our findings suggest that environmental variables (rainfall, river level, and relative humidity) significantly impact the abundance of all mosquito types, but these factors have a weaker association with the number of malaria cases.
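The lagged cross-correlation step can be sketched in plain Python as follows. This is a minimal illustration, not the project's code: the function names and the toy series are made up, and the correlation is computed as a Pearson correlation on the overlapping pairs, which differs slightly from the standard sample cross-correlation function that uses full-series means and variances.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_correlation(env, response, lag):
    """Correlation between env[t - lag] and response[t] for lag >= 0."""
    if len(env) != len(response):
        raise ValueError("series must have equal length")
    if lag < 0 or lag > len(env) - 2:
        raise ValueError("lag out of range")
    if lag == 0:
        return pearson(env, response)
    return pearson(env[:-lag], response[lag:])

# Toy monthly series (invented): mosquito counts follow rainfall
# with a two-month delay.
rainfall = [10, 40, 80, 30, 20, 60, 90, 50, 15, 35, 70, 25]
mosquitoes = [5, 5] + [0.5 * r for r in rainfall[:-2]]

for lag in range(5):
    print(lag, round(lagged_correlation(rainfall, mosquitoes, lag), 3))

# Lag with the strongest association among those examined.
best_lag = max(range(5), key=lambda k: lagged_correlation(rainfall, mosquitoes, k))
```

In practice the lag with the strongest association would then guide which lagged environmental variables enter the GLM as predictors.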

 

Exploring college students' understanding of randomness

Zean Li, Wenqi Zeng

Mentors: Professor Kelly Findley, Professor Stephen Portnoy

The concept of randomness is ubiquitous in science, especially in statistics. By understanding randomness and its characteristics, students can develop a sense of the uncertainty and unpredictability of a specific event. However, misconceptions about randomness are pervasive among students, and we noticed that few experiments investigating students’ understanding of randomness have been conducted with college students. Therefore, we designed an experiment to see how college students across different majors and school years define and understand randomness. We specifically wish to test their understanding in several areas: 1) their ability to distinguish random events, 2) their ability to distinguish between a random sequence and a non-random sequence, 3) their understanding of the variation of random samples, and 4) their understanding of randomness in uniform and non-uniform distributions. We designed a questionnaire for students to complete, which includes both content questions about randomness and tasks where students are asked to choose points on a line or on a shape as randomly as they can. In our presentation, we will present our findings about how the students performed and whether we found associations between their content understanding and their performance on the clicking tasks.

 

A comparative study of differential abundance tests in microbiome compositional data analysis

Heqi Yin

Mentor: Professor Shulei Wang

Differential abundance analysis in compositional data is one of the most important tools in microbiome data analysis. However, the presence of compositional constraints and zero counts poses significant challenges to existing methods. We design simulation experiments to compare several popular methods, including ZicoSeq and ANCOM-BC.

 

Nowcasting lightning strikes through convolutional neural networks

Austin Shwatal

Mentor: Dr. Daniel Ries, Sandia National Laboratories

Lightning is one of the most common weather events in the United States, causing significant damage to life and property every year. The National Oceanic and Atmospheric Administration (NOAA) has tracked the locations and intensity of lightning, as well as monitored general atmospheric conditions across the United States. Recent work, including that of Cintineo et al. (2022), has been devoted to immediate weather prediction using this real-time data, often referred to as “nowcasting”. The data used for this project include multi-spectral bands from the GOES-16 satellite’s Advanced Baseline Imager (ABI) and lightning strike information from the National Lightning Detection Network. We focus our analysis on the upper Midwest. We apply U-nets, originally developed for biomedical image segmentation, to predict lightning from radiance data. The U-net is designed to rapidly deconstruct a complex image into identifiable features without loss of image resolution, a key feature that allows geographically precise predictions. The U-net takes multispectral images as inputs and outputs images quantifying the probability of lightning occurring in a particular region. The trained model produces predicted probabilities of lightning strikes, allowing preparation for severe weather events.

 

Simulating the 2023 MLB season

Jack Banks, Michael Escobedo

Mentors: Professors Daniel Eck, David Dalpiaz

Full season simulators play a crucial role in assisting the day-to-day operations of a Major League Baseball organization. In this project, we used data from the 2015-2022 seasons to construct a simulator for the Chicago Cubs. Individual player talents are derived from a justified regression model to predict weighted on-base average (wOBA). The baseball statistic wOBA is one of the most accurate measurements of a player’s true value in runs, combining a player’s outcomes with their mean expected run values. These individual player talents are then aggregated based on lineups for a specific matchup, and an Elo system is run to simulate the outcome and changes in team performance. An Elo system allows us to measure the relative skill of teams over time. As a result, our expected outcome of the 2023 MLB season is based upon a simulation system, dependent upon wOBA, that adjusts the talent of each team over time. This project is formatted as an R package, allowing for a smooth transition from our computers to the front office of the Chicago Cubs.
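A generic Elo update, of the kind such a simulator might use, can be sketched as below. This uses the standard Elo formulas with a 400-point scale and a K-factor of 4; both values, the function names, and the starting ratings are assumptions made for illustration. The project's wOBA-based lineup aggregation is not modeled here.

```python
import random

def expected_score(rating_a, rating_b, scale=400.0):
    """Standard Elo win probability for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))

def update_elo(rating_a, rating_b, a_won, k=4.0):
    """Return both ratings after one game; total rating is conserved."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

def simulate_game(rating_a, rating_b, rng, k=4.0):
    """Draw a winner from the Elo win probability, then update ratings."""
    a_won = rng.random() < expected_score(rating_a, rating_b)
    return a_won, update_elo(rating_a, rating_b, a_won, k)

# Toy three-game series between two hypothetical teams.
rng = random.Random(2023)
team_a, team_b = 1520.0, 1480.0
for game in range(3):
    a_won, (team_a, team_b) = simulate_game(team_a, team_b, rng)
    print(game + 1, "A wins" if a_won else "B wins", round(team_a, 2), round(team_b, 2))
```

Because each result feeds back into the ratings, repeating this over a full schedule lets team strength drift over the simulated season.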

 

Sector-specific stock price forecasting: A comparative analysis of time series models for S&P 500 industries

Mingli Xu, Harsh Patel

Mentor: Professor Hyoeun Lee

The stock market is a complex system that is constantly changing, and accurately predicting stock prices is a difficult task. In this study, we aim to identify the best time series model for predicting stock prices of different sectors in the S&P 500. We compare traditional models, including ARIMA, ARCH, and GARCH models, with more recent machine learning methods such as LSTM, and determine which model performs best for each sector. Our findings provide insights into effective modeling techniques for predicting stock prices of different business sectors, which can help investors and analysts make informed investment decisions.


Nowcasting lightning strikes through generalized linear mixed model and binary hurdle model

Spencer Bauer

Mentor: Dr. Daniel Ries, Sandia National Laboratories

We use a hurdle model to account for the large number of zeros and the overdispersion in lightning counts, since zeros make up 94.4% of the data. The hurdle model consists of two parts: a binary hurdle part that models the probability of a lightning event and a truncated count part that models the number of flashes given that a lightning event occurs. The binary hurdle part uses a Bernoulli distribution for the probability of a non-zero lightning event. The truncated count part uses a zero-truncated negative binomial distribution with additive predictors for location (θ>0) and dispersion (μ>0) to model the number of flashes. The modeling and statistical analysis will be done in RStudio (2021 version). The main packages we will use are sp and countreg. We will use the sp package to create spatial objects, specifically polygons, and the countreg package for the zero-truncated negative binomial distribution and plotting.
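The two-part structure of the hurdle model can be written out as a probability mass function. The sketch below is in plain Python rather than R and does not use the countreg package. It assumes the common mean/dispersion parameterization of the negative binomial; the argument names `mean` and `dispersion` and their values are hypothetical. Only the zero share of 94.4% is taken from the abstract, and no covariates are included.

```python
import math

def nb_pmf(k, mean, dispersion):
    """Negative binomial pmf in the mean/dispersion parameterization."""
    log_p = (math.lgamma(k + dispersion) - math.lgamma(dispersion)
             - math.lgamma(k + 1)
             + dispersion * math.log(dispersion / (dispersion + mean))
             + k * math.log(mean / (dispersion + mean)))
    return math.exp(log_p)

def hurdle_pmf(k, p_nonzero, mean, dispersion):
    """Hurdle pmf: Bernoulli hurdle plus zero-truncated negative binomial."""
    if k == 0:
        return 1.0 - p_nonzero
    # Renormalize the count distribution over k >= 1 (zero truncation).
    truncated = nb_pmf(k, mean, dispersion) / (1.0 - nb_pmf(0, mean, dispersion))
    return p_nonzero * truncated

# 94.4% zeros, as stated in the abstract; mean and dispersion are invented.
p_nonzero = 1.0 - 0.944
total = sum(hurdle_pmf(k, p_nonzero, mean=3.0, dispersion=0.8) for k in range(500))
print(f"P(no lightning) = {hurdle_pmf(0, p_nonzero, 3.0, 0.8):.3f}")
print(f"P(exactly 1 flash) = {hurdle_pmf(1, p_nonzero, 3.0, 0.8):.4f}")
```

Separating the zero process from the positive counts means the share of zeros is controlled entirely by the hurdle probability, regardless of the count distribution's parameters.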
