Background

Talks in the Statistics and Data Science Colloquium are free and open to the public. They are intended to be accessible to a broad undergraduate audience with some background in statistics and data science. Junior and senior statistics majors are expected to attend talks in the SDS Colloquia (please reach out to Professor Nicholas Horton in case of conflicts).

Data Science Initiative

Information about talks and events sponsored by the Amherst College Data Science Initiative can be found here.

Math Colloquium

The Department also organizes a series of talks for undergraduates: see https://npflueger.github.io/colloquium for details.

Upcoming talks

Tuesday September 24, 2024: Ben Baumer (Smith College)

Professor of Statistics and Data Science, Smith College

Title: tidychangepoint: a unified framework for analyzing changepoint detection in time series

  • 4:15pm refreshments, 4:30pm talk
  • Amherst College Seeley Mudd 206

Abstract: The talk describes tidychangepoint, a new R package for changepoint detection analysis. tidychangepoint leverages existing packages like changepoint, GA, tsibble, and broom to provide tidyverse-compliant tools for segmenting univariate time series using various changepoint detection algorithms. In addition, tidychangepoint provides model-fitting procedures for commonly-used parametric models, tools for computing various penalized objective functions, and graphical diagnostic displays. tidychangepoint wraps both deterministic algorithms like PELT and flexible, randomized, genetic algorithms that can be used with any compliant model-fitting function and any penalized objective function. By bringing all of these disparate tools together in a cohesive fashion, tidychangepoint facilitates comparative analysis of changepoint detection algorithms and models. (This is joint work with Biviana Marcela Suarez Sierra.)

Bio: Ben Baumer is a data scientist, with research and teaching focused on extracting meaning from data. This interest is informed by both his graduate work, which focused on discrete mathematics and theoretical computer science, and his professional experience, where he served as the Statistical Analyst for the New York Mets from 2004 to 2012. Ben has published a wide variety of papers and textbooks in network science, sports analytics, data science education, and other related fields.

Thursday October 10, 2024: Ofer Harel (University of Connecticut)

Professor of Statistics and Dean of the College of Liberal Arts and Sciences at the University of Connecticut

Title: Strategies for data analysis with two types of missing values

  • 4:30pm talk, Seeley Mudd 206 (Amherst College, 31 Quadrangle Drive)

Abstract: Missing data, an issue frequently encountered in data analysis, causes difficulties with estimation, precision and inference. Methods for dealing with missing data issues have been studied extensively in the last few decades. Two types of missing values can be present in the same dataset. This talk will explore the probabilistic mechanisms generating the two types of missing values, the conditions under which these mechanisms can be partially or completely ignored, and the use of two-stage multiple imputation (MI) to address the challenge posed by incomplete observations.

Bio: Dr. Harel received his doctorate in statistics in 2003 from the Department of Statistics at the Pennsylvania State University, where he developed his methodological expertise in the areas of missing data techniques, diagnostic tests, longitudinal studies, Bayesian methods, sampling techniques, mixture models, latent class analysis, and statistical consulting. Dr. Harel has been involved with a variety of research fields including, but not limited to, Alzheimer’s, diabetes, cancer, nutrition, HIV/AIDS, health disparities, anti-racism, and alcohol and drug abuse prevention.

Wednesday, November 13, 2024: Minsu Kim (University of Massachusetts, Amherst)

Title: Measuring importance in an ensemble model using Shapley values

  • 4:15pm refreshments, 4:30pm talk
  • Amherst College Seeley Mudd 206

Abstract: In the context of building probabilistic ensemble forecasts, it is important to understand the relative importance and contributions of individual models to creating a highly accurate forecast combination. We propose a practical method for evaluating the expected contribution of individual component models using a variation of the Shapley value, a concept of cooperative game theory. This approach relies on considering all possible ensemble models constructed from subsets of individual models. This study was motivated by studying forecasts submitted to the US COVID-19 Forecast Hub starting in April 2020. This modeling hub produced a probabilistic ensemble forecasting model of COVID-19 cases, hospitalizations, and deaths in the US based on individual models collected from a variety of research groups. We aim to identify which is the most “important” component model on average in helping the ensemble be more accurate. Key results from this work show that (1) the overall importance of an individual model tends to be correlated with the overall prediction accuracy of that model measured by the weighted interval score (WIS), which is a commonly used proper scoring rule for quantile forecasts, and (2) our proposed method clearly shows the contribution of individual models to a more accurate ensemble model, which is difficult to ascertain from the overall WIS alone. This study will offer insights into understanding individual forecasting models’ unique features and their roles in contributing to an ensemble model for a specific prediction task. (This is joint work with Evan Ray and Nicholas Reich.)
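
For readers new to the idea, the subset-based calculation mentioned in the abstract can be illustrated with a minimal sketch of the textbook Shapley value. This is not the speakers' method or code: the three model names, their point forecasts, the observed value, and the naive baseline used to score the empty ensemble are all made-up assumptions, and negative absolute error of an equal-weight mean stands in for the weighted interval score.

```python
from itertools import combinations
from math import factorial

# Hypothetical point forecasts from three component models (illustrative only).
FORECASTS = {"model_a": 10.0, "model_b": 14.0, "model_c": 20.0}
TRUTH = 12.0      # hypothetical observed value
BASELINE = 18.0   # hypothetical naive forecast used to score the empty ensemble


def value(subset):
    """Score an ensemble built from `subset`: higher is better.

    The ensemble is an equal-weight mean of the members' forecasts and the
    score is negative absolute error (a stand-in for WIS, by assumption).
    """
    if not subset:
        return -abs(BASELINE - TRUTH)
    ensemble = sum(FORECASTS[m] for m in subset) / len(subset)
    return -abs(ensemble - TRUTH)


def shapley_values(models, value_fn):
    """Exact Shapley value of each model, enumerating all subsets of the others."""
    n = len(models)
    result = {}
    for m in models:
        others = [x for x in models if x != m]
        total = 0.0
        for k in range(n):  # k = size of the subset that m joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal gain in score from adding m to this ensemble.
                total += weight * (value_fn(s | {m}) - value_fn(s))
        result[m] = total
    return result


models = list(FORECASTS)
phi = shapley_values(models, value)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

In this sketch a positive value means that adding the model improves the ensemble score on average across subsets. The values sum to the difference between the full ensemble's score and the empty-ensemble baseline. Exact enumeration grows exponentially with the number of models, so it is only practical for small collections.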

Bio: Minsu Kim is a PhD student in the Department of Biostatistics and Epidemiology at the University of Massachusetts Amherst. Her research interests now include the evaluation of probabilistic forecasting models and the application of ensemble methods for infectious disease prediction. Additionally, she is keenly interested in machine learning and R package development.

Logistics

Seeley Mudd Hall is located at the southwest corner of the first-year Quadrangle (31 Quadrangle Drive). Paid parking is available at the Amherst Town Common and Boltwood Drive (approximately an 8-minute walk). PVTA Bus Service is available from the Converse Hall stop (approximately a 5-minute walk).

Last updated September 30, 2024

Copyright © 2024 Amherst College. All rights reserved.