讲座题目:Unsupervised Federated Learning: A Federated Gradient EM Algorithm for Heterogeneous Mixture Models with Robustness against Adversarial Attacks
主讲人:Yang Feng 教授
主持人:周勇 教授
开始时间:2024-07-29 09:00
讲座地址:普陀校区理科大楼A1514
主办单位:统计学院
报告人简介:
Yang Feng is a Professor of Biostatistics at New York University. He obtained his Ph.D. in Operations Research at Princeton University in 2010. Feng’s research interests encompass the theoretical and methodological aspects of machine learning, high-dimensional statistics, network models, and nonparametric statistics, leading to a wealth of practical applications. He has published more than 70 papers in statistical and machine learning journals. His research has been funded by multiple grants from the National Institutes of Health (NIH) and the National Science Foundation (NSF), notably the NSF CAREER Award. He is currently an Associate Editor for the Journal of the American Statistical Association (JASA), the Journal of Business & Economic Statistics (JBES), and the Annals of Applied Statistics (AoAS). His professional recognition includes being named a fellow of the American Statistical Association (ASA) and the Institute of Mathematical Statistics (IMS), as well as an elected member of the International Statistical Institute (ISI).
报告内容:
While supervised federated learning approaches have enjoyed significant success, the domain of unsupervised federated learning remains relatively underexplored. In this paper, we introduce a novel federated gradient EM algorithm designed for the unsupervised learning of mixture models with heterogeneous mixture proportions across tasks. We begin with a comprehensive finite-sample theory that holds for general mixture models, then apply this general theory on Gaussian Mixture Models (GMMs) and Mixture of Regressions (MoRs) to characterize the explicit estimation error of model parameters and mixture proportions. Our proposed federated gradient EM algorithm demonstrates several key advantages: adaptability to unknown task similarity, resilience against adversarial attacks on a small fraction of data sources, protection of local data privacy, and computational and communication efficiency.