Academic Seminar: Guo Chen (郭琛), East China Normal University

Posted: October 15, 2025, 09:26


Speaker: Guo Chen (郭琛), East China Normal University

Title: A synthetic subsampling and estimation procedure for imbalanced big data

Time: October 18, 2025, 15:50-17:20

Location: VSport体育官网 New Campus, B514

Abstract: Massive datasets with imbalanced binary outcomes are common in many areas. Existing optimal subsampling strategies largely overlook the binary and imbalanced structure of the outcome, which leads to efficiency loss, and they are usually built on inverse probability weighting (IPW), which is unstable when some probabilities are close to zero. In this paper, we propose a synthetic sampling and estimation procedure tailored for imbalanced big data. In the sampling stage, we derive the optimal case-control subsampling plan based on IPW. To overcome the instability of IPW in estimation, we propose a novel empirical likelihood weighting (ELW) method based on a case-control sample. A real-data-based simulation study indicates that our synthetic subsampling and estimation procedure has a smaller mean squared error than existing estimation procedures.

Keywords: Optimal subsampling, case-control sampling, empirical likelihood weighting, imbalanced data.
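
For readers unfamiliar with the setting, the following is a minimal illustrative sketch of generic case-control subsampling followed by an IPW-weighted logistic regression. It is not the speaker's procedure: the sampling rates are fixed by hand rather than derived from an optimal plan, the estimator is plain IPW (the ELW method is not implemented, since the abstract gives no details), and all names, the simulated model, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def simulate(n, beta, rng):
    """Draw n rows from a logistic model (assumed toy model).
    A strongly negative intercept makes the cases (y = 1) rare."""
    x = rng.normal(size=(n, len(beta) - 1))
    X = np.column_stack([np.ones(n), x])
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    y = (rng.random(n) < p).astype(float)
    return X, y

def case_control_subsample(y, rate_case, rate_control, rng):
    """Keep each case with probability rate_case and each control with
    probability rate_control. Returns the keep mask and the inclusion
    probability of every row."""
    pi = np.where(y == 1, rate_case, rate_control)
    keep = rng.random(y.shape[0]) < pi
    return keep, pi

def weighted_logistic(X, y, w, n_iter=50, tol=1e-10):
    """Maximize the weighted logistic log-likelihood with Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

rng = np.random.default_rng(0)
beta_true = np.array([-5.0, 1.0, -0.5])  # assumed values; cases are rare
X, y = simulate(200_000, beta_true, rng)

# Keep every case and about 2% of the controls (hand-picked rates).
keep, pi = case_control_subsample(y, rate_case=1.0, rate_control=0.02, rng=rng)

# IPW: each retained row is weighted by the inverse of its inclusion probability.
beta_ipw = weighted_logistic(X[keep], y[keep], 1.0 / pi[keep])

print("full data size:", y.size, "subsample size:", int(keep.sum()))
print("IPW estimate:", beta_ipw)
```

In this sketch each retained control carries a weight of 1 / 0.02 = 50. Lowering the control sampling rate further inflates these weights, which is the kind of instability the abstract attributes to IPW when probabilities are close to zero.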

