Statistics Seminar - Paul McNicholas - oclust: using subset log-likelihoods to trim outliers in Gaussian mixture models
Calendar: Mathematics & Statistics
Date: December 1, 2020, 3:30 pm - 4:30 pm
Description
Title: oclust: using subset log-likelihoods to trim outliers in Gaussian mixture models
Speaker: Paul McNicholas (McMaster University)
Abstract: Mixtures of Gaussian distributions are a popular choice in model-based clustering. Outliers can affect parameter estimation and, as such, must be accounted for. Predicting the proportion of outliers correctly is paramount because it minimizes misclassification error. It is proved that, for a finite Gaussian mixture model, the log-likelihoods of the subset models are beta-distributed. An algorithm is then proposed that predicts the proportion of outliers by measuring how closely a set of subset log-likelihoods adheres to a beta reference distribution. This algorithm removes the least likely points, which are deemed outliers, until the model assumptions are met.
Date/Time: Tuesday, December 1, 2020, 3:30 pm - 4:30 pm
Location: Virtual
Join Zoom Meeting: https://mcmaster.zoom.us/j/95079221401?pwd=ZHFQUTIwSUtHd1pGOGhzc01xei9TQT09
Meeting ID: 950 7922 1401
Passcode: 016314