Statistics seminar – Liqun Diao – Polya tree based nearest neighborhood regression
Dec 5, 2023
5:30PM to 6:30PM
Location: University Hall (UH) 112
Speaker: Liqun Diao (University of Waterloo)
Bio: Liqun is an Assistant Professor in the Department of Statistics and Actuarial Science at the University of Waterloo. She is interested in developing statistical and machine learning methods to advance knowledge in fields including medicine, public health, and insurance. Liqun obtained her Ph.D. in Statistics – Biostatistics from the University of Waterloo in 2013. Her doctoral thesis received the 2013 Pierre Robillard Award of the Statistical Society of Canada (SSC). She worked as a postdoc in the Department of Biostatistics and Computational Biology at the University of Rochester in the US in 2014. She became a research assistant professor in the Department of Statistics and Actuarial Science at the University of Waterloo in 2015 and a tenure-track assistant professor in 2022. She has expanded her research to areas including recursive partitioning learning, causal inference, dependence modeling, Bayesian methods, and two-phase design.
Title: POLYA TREE BASED NEAREST NEIGHBORHOOD REGRESSION
Abstract: Parametric regression, such as linear regression, plays an important role in statistics. Using a parametric regression model typically involves specifying a regression function of the covariates, the distribution of the response, and the link between the response and the covariates, all of which are at risk of misspecification. In this talk, we introduce a fully non-parametric regression model, a Polya tree (PT) based nearest neighborhood regression. To approximate the true conditional probability measure of the response given the covariate value, we construct a PT-distributed probability measure of the response in the nearest neighborhood of the covariate value of interest. Our proposed method gives consistent and robust estimators and has a faster convergence rate than kernel density estimation. We conduct extensive simulation studies and analyze a Combined Cycle Power Plant dataset to compare the performance of our method with kernel density estimation, PT density estimation, and the linear dependent tail-free process (LDTFP). The studies suggest that the proposed method outperforms the kernel and PT density estimation methods in estimation accuracy and convergence rate, and outperforms LDTFP in robustness.
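For readers unfamiliar with the idea, the construction described in the abstract can be illustrated with a minimal sketch. This is not the speaker's implementation; it rests on several assumptions of ours: a finite Polya tree over a dyadic partition of the observed response range, a uniform base measure, Beta(c·m², c·m²) priors on the branching probabilities at level m, and a Euclidean k-nearest-neighbour rule to define the neighborhood. The function names and the parameters `k`, `depth`, and `c` are hypothetical.

```python
import math


def polya_tree_density(y0, samples, lo, hi, depth=6, c=1.0):
    """Posterior-mean density at y0 of a finite Polya tree on [lo, hi].

    Assumptions (ours, not from the talk): dyadic partition, uniform base
    measure, Beta(c*m^2, c*m^2) prior on branching probabilities at level m.
    """
    def unit(v):
        # Rescale to [0, 1) so that bin indices stay in range.
        return min(max((v - lo) / (hi - lo), 0.0), 1.0 - 1e-12)

    u = [unit(s) for s in samples]
    g = unit(y0)
    density = 1.0
    parent_count = len(u)  # level-0 bin contains every sample
    for m in range(1, depth + 1):
        alpha = c * m * m
        nbins = 2 ** m
        target = int(g * nbins)  # level-m bin that contains y0
        child_count = sum(1 for v in u if int(v * nbins) == target)
        # Posterior mean of the branching probability, times 2 because
        # the bin width halves at every level.
        density *= 2.0 * (alpha + child_count) / (2.0 * alpha + parent_count)
        parent_count = child_count
    return density / (hi - lo)


def pt_knn_conditional_density(x0, y0, X, y, k=50, depth=6, c=1.0):
    """Estimate the density of the response at y0 given covariate vector x0.

    X is a list of covariate tuples and y the matching list of responses.
    """
    # Neighborhood: the k observations whose covariates are closest to x0.
    order = sorted(range(len(X)), key=lambda i: math.dist(X[i], x0))
    neighbours = [y[i] for i in order[:k]]
    # Polya tree posterior built from the responses in the neighborhood only.
    return polya_tree_density(y0, neighbours, min(y), max(y), depth, c)
```

Calling `pt_knn_conditional_density((0.5,), 1.0, X, y, k=40)` would return the estimated conditional density of the response at 1.0 for a covariate value of 0.5. The estimate is piecewise constant on 2^depth bins and integrates to one over the observed response range; how the neighborhood size and prior parameters are chosen in the actual method is not stated in the abstract.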
Meeting speaker – sign-up sheet: https://docs.google.com/spreadsheets/d/1jhJ7zj3HJ7klb-G2VRdIjYeDolB8YE9abc9tfJFuO-g/edit#gid=0
Date/Time: Tuesday, December 5, 2023, 3:30-5:00 p.m. (refreshments will be brought to UH 112)