June 21, Wei Luo (骆威): Determine the number of clusters by data augmentation

Posted: 2024-06-14

Lecture title: Determine the number of clusters by data augmentation

Speaker: Wei Luo (骆威), Researcher

Host: Prof. Zhou Yu (於州)

Start time: 2024-06-21 14:00

Venue: Room A1114, Science Building, Putuo Campus

Organizer: School of Statistics


About the speaker:

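Wei Luo is a researcher and doctoral supervisor at Zhejiang University. He graduated from Pennsylvania State University in the United States in 2014, then worked at Baruch College in the United States, and joined the Center for Data Science at Zhejiang University in 2018. Dr. Luo's research interests include sufficient dimension reduction and causal inference, and he has published multiple papers in international statistics journals such as Annals of Statistics, Biometrika, and JRSSB.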


Abstract:

Determining the number of clusters is crucial for the successful application of clustering. In this paper, we propose a new order-determination method, called the data augmentation estimator (DAE), for general model-based clustering. The estimator is based on a novel idea: augmenting the data with an independently generated small cluster, which enables us to justify how the instability of clustering changes with the number of clusters assumed in clustering. The pattern of instability provides a characterization of the true number of clusters that serves as an alternative to the commonly used goodness-of-fit measure. By combining the two sources of information appropriately, the proposed estimator achieves asymptotic consistency under general conditions and is easy to implement. It is also more efficient than the conventional BIC-type approaches, which use the goodness-of-fit measure only. These properties are illustrated by simulation studies and real data examples.


