Statistics Seminar - Sanjeena Dang - Modeling Multivariate Counts from Modern Biological Data
Title: Modeling Multivariate Counts from Modern Biological Data
Speaker: Sanjeena Dang (Carleton University)
Multivariate count data are commonly encountered through high-throughput sequencing technologies in bioinformatics. Although the Poisson distribution seems a natural fit for these count data, its multivariate extension is computationally expensive. Hence, independence between genes is assumed in most cases, which fails to take into account the correlation between genes. Recently, mixtures of multivariate Poisson log-normal (MPLN) models have been used to analyze these multivariate count measurements efficiently. In the MPLN model, the counts, conditional on the latent variable, are modeled using a Poisson distribution, and the latent variable comes from a multivariate Gaussian distribution. Due to this hierarchical structure, the MPLN model can account for over-dispersion, unlike the traditional Poisson distribution, and allows for correlation between the variables. Here, a parsimonious family of mixtures of Poisson log-normal distributions is proposed by decomposing the covariance matrix and imposing constraints on these decompositions.
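The hierarchical structure described in the abstract can be illustrated with a small simulation. The sketch below is not the speaker's implementation: it draws from a single two-dimensional MPLN component (not a mixture), assumes the usual log link so that each count is Poisson with mean exp(theta), and uses made-up values for the latent mean `mu` and covariance `sigma`. The helper names `sample_poisson` and `sample_mpln_2d` are hypothetical, and only the Python standard library is used.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) value with Knuth's multiplication method.

    Adequate for the modest rates in this illustration; not meant for large rates.
    """
    threshold = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_mpln_2d(n, mu, sigma, seed=0):
    """Draw n count pairs from a single 2-D Poisson log-normal component."""
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 latent covariance matrix.
    l11 = math.sqrt(sigma[0][0])
    l21 = sigma[1][0] / l11
    l22 = math.sqrt(sigma[1][1] - l21 ** 2)
    counts = []
    for _ in range(n):
        # Latent layer: theta ~ N(mu, sigma).
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        theta1 = mu[0] + l11 * z1
        theta2 = mu[1] + l21 * z1 + l22 * z2
        # Observed layer: Y_j | theta_j ~ Poisson(exp(theta_j)).
        counts.append((sample_poisson(math.exp(theta1), rng),
                       sample_poisson(math.exp(theta2), rng)))
    return counts


# Illustrative (assumed) parameters, not taken from the talk.
mu = [1.0, 1.5]
sigma = [[0.5, 0.3],
         [0.3, 0.4]]
counts = sample_mpln_2d(5000, mu, sigma)

n = len(counts)
means = [sum(c[j] for c in counts) / n for j in range(2)]
variances = [sum((c[j] - means[j]) ** 2 for c in counts) / n for j in range(2)]
cov = sum((c[0] - means[0]) * (c[1] - means[1]) for c in counts) / n
corr = cov / math.sqrt(variances[0] * variances[1])

print("means:", means)
print("variances:", variances)
print("correlation:", corr)
```

Under these assumptions, each sample variance should come out larger than the corresponding sample mean (over-dispersion relative to a plain Poisson), and the positive off-diagonal entry of `sigma` should induce a positive correlation between the two counts.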
Date/Time: Tuesday November 9, 2021, 3:30 - 4:30
Location: Virtual (Zoom meeting)
Meeting ID: 971 9900 3250