Statistics Seminar – Christof Seiler – Connecting Human Decision Makers To Data With Statistical Models
Oct 1, 2024
3:30PM to 4:30PM
Location: PGCLL M21 (Note: change of location for this date!)
Title: Connecting Human Decision Makers To Data With Statistical Models
Speaker: Christof Seiler (Department of Advanced Computing Sciences, Maastricht University & Center of Experimental Rheumatology at University of Zurich)
Abstract: In this talk, I will present two recent projects connecting decision making with data using statistical models.
The first project is about statistical modeling in single-cell biology. The task was to compensate for spillover in mass cytometry data during the pre-processing step. Because this step precedes downstream analyses (dimension reduction plots and differential analyses), we aimed for a procedure that makes only weak distributional assumptions about the data. We implemented our method in the R package spillR, fitting a non-parametric finite mixture model with expectation-maximization. Our model is based on clearly defined assumptions that the biologists, who are the target users, can check visually.
The second project is about predictions for a professional cycling team at the Tour de France and the Giro d’Italia. The task was to build prediction intervals with statistical guarantees for any machine learning model. We worked together with Team Jumbo-Visma, which won the Tour de France in 2022 and 2023. We built a model to predict calorie consumption for each rider from race and rider characteristics. Prediction intervals through conformal prediction gave us a theoretically sound way to combine the expertise of coaches with the power of prediction models.
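For readers unfamiliar with conformal prediction, the following is a minimal sketch of the split conformal variant, which wraps any fitted regression model. It is not the code used in the project: the synthetic data, the two features, the least-squares model, and the function name split_conformal_interval are all assumptions made for illustration.

```python
import numpy as np


def split_conformal_interval(predict, X_cal, y_cal, X_new, alpha=0.1):
    """Split conformal intervals around the point predictions of `predict`.

    If calibration and new points are exchangeable, the intervals cover the
    true response with probability at least 1 - alpha (marginally).
    """
    # Nonconformity scores: absolute residuals on held-out calibration data.
    scores = np.abs(y_cal - predict(X_cal))
    n = len(scores)
    # Finite-sample correction: take the ceil((n + 1)(1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        raise ValueError("calibration set too small for the requested alpha")
    q = np.sort(scores)[k - 1]
    preds = predict(X_new)
    return preds - q, preds + q


# Synthetic stand-in data (hypothetical): two features and a noisy linear response.
rng = np.random.default_rng(0)


def make_data(n):
    X = rng.uniform(0, 1, size=(n, 2))
    y = 3000 + 1500 * X[:, 0] + 800 * X[:, 1] + rng.normal(0, 200, size=n)
    return X, y


X_train, y_train = make_data(500)
X_cal, y_cal = make_data(500)
X_test, y_test = make_data(2000)


def design(X):
    # Add an intercept column to the feature matrix.
    return np.column_stack([np.ones(len(X)), X])


# Any model could be used here; ordinary least squares keeps the sketch short.
coef, *_ = np.linalg.lstsq(design(X_train), y_train, rcond=None)


def predict(X):
    return design(X) @ coef


lower, upper = split_conformal_interval(predict, X_cal, y_cal, X_test, alpha=0.1)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"empirical coverage: {coverage:.3f}")
```

The printed coverage should be close to the nominal 90%. The guarantee is marginal, that is, it holds on average over new cases rather than for each individual case, and it relies on the calibration data being exchangeable with the cases being predicted.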
This is joint work with Marco Guazzini, Alexander Reisach, Sebastian Weichwald, Kristian van Kuijk, and Mark Dirksen.
References:
- Paper on spillR: https://doi.org/10.1093/bioinformatics/btae337
- Bioconductor/R package spillR: https://doi.org/doi:10.18129/B9.bioc.spillR
- Paper on prediction intervals: https://proceedings.mlr.press/v204/kuijk23a.html