Background

Talks in the Statistics and Data Science Colloquium are free and open to the public. They are intended to be accessible to a broad audience with some background in statistics and data science. Junior and senior statistics majors are expected to attend talks in the SDS Colloquia. Please reach out to Professor Nicholas Horton in case of conflicts.

Data Science Initiative

Information about talks and events sponsored by the Amherst College Data Science Initiative can be found here.

Upcoming talks

Tuesday, March 5, 2024: Francesca Dominici (Harvard University)

Clarence James Gamble Professor of Biostatistics, Population, and Data Science, Harvard T.H. Chan School of Public Health

Title: Decoding climate vulnerability: Harnessing the power of data science and causal inference

Abstract: Air pollution and climate change are two sides of the same coin. Pollutants emitted in the air can lead to changes in climatic conditions. These emissions consist of greenhouse gases. Specific components of particulate matter can either warm or cool the temperature. Short-lived climate pollutants are also dangerous air pollutants that harm people, ecosystems, and agricultural productivity. On January 6, 2023, the Environmental Protection Agency (EPA) announced a proposal to lower the National Ambient Air Quality Standard (NAAQS) for annual PM2.5 pollution from 12 μg/m³ to between 9 and 10 μg/m³, though it continues to consider other options. Data science must inform this decision. In this talk, I will provide an overview of data science methods, including methods for causal inference and machine learning, through the lens of policy change. This is based on a large effort of analyzing a data platform of unprecedented size and representativeness. The platform includes more than 600 million observations on the health experience of over 95% of the US population over 65 years old linked to air pollution exposure and several confounders. I will also provide an overview of studies on air pollution exposure, environmental racism, wildfires, and how they can exacerbate vulnerability to COVID-19. Swift action on reducing short-lived climate forcers such as methane, tropospheric ozone, hydrofluorocarbons, and black carbon can significantly decrease the chances of triggering severe climate tipping points.

Bio: Francesca Dominici, PhD, is the co-Director of the Harvard Data Science Initiative at Harvard University and the Clarence James Gamble Professor of Biostatistics, Population and Data Science at the Harvard T.H. Chan School of Public Health. She is an elected member of the National Academy of Medicine and of the International Society of Mathematical Statistics. She leads an interdisciplinary group of scientists to address important questions in environmental health science, climate change, and health policy. She has published over 280 peer-reviewed articles and has shared her expertise on these topics on joint panels with New Jersey Senator Cory Booker and the European Commission. Dr. Dominici has provided the scientific community and policy makers with comprehensive and compelling evidence on the adverse health effects of air pollution, noise pollution, and climate change. Her studies have directly and routinely impacted air quality policy. Dr. Dominici was recognized in Thomson Reuters' 2019 list of the most highly cited researchers, ranking in the top 1% of cited scientists in her field. Her work has been covered by the New York Times, the Los Angeles Times, BBC, the Guardian, CNN, and NPR. In April 2020 she was awarded the Karl E. Peace Award for Outstanding Statistical Contributions for the Betterment of Society by the American Statistical Association. She is an advocate for the career advancement of women faculty, and her work on the Johns Hopkins University Committee on the Status of Women earned her the campus Diversity Recognition Award in 2009. At the Harvard T.H. Chan School of Public Health, she has led the Committee for the Advancement of Women Faculty.

  • 11:45am refreshments in Seeley Mudd 206
  • noon talk Seeley Mudd 206 (Amherst College, 31 Quadrangle Drive)
  • masks welcomed but not required

Previous talks

Wednesday, September 13, 2023: Becky Tang (Middlebury College)

Title: Mechanistic modeling of climate effects on redistribution and population growth

  • 4:15pm refreshments, 4:30pm talk
  • Amherst College Seeley Mudd 206

Abstract: Understanding community responses to climate is critical for anticipating the future impacts of global change. However, despite increased research efforts in this field, models that explicitly include important biological mechanisms are lacking. Quantifying the potential impacts of climate change on species is complicated by the fact that the effects of climate variation may manifest at several points in the biological process. To this end, we formulate a dynamic mechanistic model that combines population dynamics, such as species interactions, with species redistribution by allowing climate to affect both processes. We examine their relative contributions in an application to the changing biomass of a community of eight species in the Gulf of Maine using over 30 years of fisheries data from the Northeast Fishery Science Center. Our model suggests that the mechanisms driving biomass trends vary across space, time, and species.

Thursday, November 9, 2023: Kate Moore (Amherst College)

Title: Data cohesion: from similarity comparisons to clustering

  • 4:15pm refreshments, 4:30pm talk
  • Amherst College Seeley Mudd 206

Abstract: We often want to observe the shape of our data and will use clustering and data visualization methods to do so. These methods typically require that our data is described with respect to a relatively small set of variables or that we provide distances among all pairs of points. For many interesting problems, however, this initial step can be quite challenging. In such a case, we may instead wish to work from a set of responses to similarity comparisons (e.g., among x, y, and z, which one is the outlier?). I will introduce cohesion, a new measure of relative proximity that is built on this comparison framework. Cohesion offers a perspective on our data that is quite different from distance alone and can help address challenges that arise in high-dimensional settings. I will also share some initial progress toward the development of cohesion-based methods for clustering and data visualization.

Thursday, February 1, 2024: Krista Gile (University of Massachusetts, Amherst)

Title: Bayesian resolution of discrepant self-reported network ties

Abstract: Social network analysis facilitates a deeper understanding of underlying relationships and structures. Most social network analysis assumes an objective network of shared social ties, typically measured as self-reports from research subjects. Although it is common for two parties to give discrepant reports of their shared relationship status, there is no standard way to resolve such discrepancies. We develop a Bayesian model that leverages patterns of agreement among respondents across multiple relations, using flexible priors to allow for aberrant reporting behaviors. The model allows for posterior inference for individual reporter error rates and for the underlying true network. The method is motivated by and applied to the Food, Activity, Screens, and Teens (FAST) study, an investigation of social networks and health behavior among U.S. middle school students.

This work is joint with Maryclare Griffin, Dongah Kim, James Kitts, David Nolin, and John Sirard.

Bio: Krista Gile is a professor of statistics in the UMass Amherst Department of Mathematics and Statistics. She earned her PhD in Statistics from the University of Washington in 2008, completing the Social Science Statistics Track. After a postdoc at Nuffield College, Oxford (a social science college), she joined UMass in 2010 as part of the initial cluster in computational social science. Her research focuses on developing statistical methodology for social and behavioral science research, particularly related to making inference about hard-to-reach human populations and from partially observed social network structures. Much of her work is focused on understanding the strengths and limitations of data sampled with link-tracing designs such as snowball sampling, contact tracing, and respondent-driven sampling.

  • 4:15pm refreshments in Seeley Mudd 206
  • 4:30pm talk Seeley Mudd 206 (Amherst College, 31 Quadrangle Drive)

Wednesday, February 21, 2024: Matteo Riondato (Amherst College)

Title: Sampling binary matrices with hard constraints: algorithms and impossibility results

Abstract: Given an observed binary matrix, generating other matrices that share some properties with the observed one is a key step for performing statistical hypothesis tests on results obtained from the observed matrix. The set of properties to be maintained defines a sample space of matrices, and the key computational question is how to draw samples from this space according to a user-specified distribution. We show that, for some sets of properties, there are efficient Markov-chain-Monte-Carlo algorithms to generate these samples, while in other cases such algorithms cannot exist. This talk is based on joint work with Giulia Preti and Gianmarco De Francisci Morales from the CentAI Institute.

Bio: Matteo Riondato is an associate professor of computer science at Amherst College and a visiting faculty member at Brown University. Previously, he was a research scientist at Two Sigma. His research focuses on algorithms for data mining and machine learning. He received an NSF CAREER award to work on statistically sound knowledge discovery from data. His work has been recognized with best-of-conference awards at many data mining conferences.

  • 4:15pm refreshments in Seeley Mudd 206
  • 4:30pm talk Seeley Mudd 206 (Amherst College, 31 Quadrangle Drive)
  • masks welcomed but not required

Logistics

Seeley Mudd Hall is located at the southwest corner of the first-year Quadrangle (31 Quadrangle Drive). Paid parking is available at the Amherst Town Common and Boltwood Drive (approximately an 8-minute walk). PVTA bus service is available from the Converse Hall stop.

Last updated February 25, 2024

Copyright © 2024 Amherst College. All rights reserved.