Improving Robustness of the Model-X Inference with Application to EHR Studies
Posted: 2025-09-22
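- Speaker: Molei Liu (刘默雷), Assistant Professor, Peking University
- Time: 14:00, September 26, 2025
- Venue: Lecture Hall 1417, Administration Building, Zijingang Campus, Zhejiang University
- Host: Center for Data Science, Zhejiang University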
Abstract: The model-X conditional randomization test (CRT) is a flexible and powerful procedure for conditional independence testing. However, it requires perfect knowledge of the conditional distribution of the exposure X and may lose its validity when X is modeled with error. This problem is even more severe when the adjustment covariates Z are high-dimensional. To address this challenge, we propose the Maxway CRT, which learns the conditional distribution of the response Y and uses it to calibrate the resampling distribution of X. We prove that the type-I error inflation of the Maxway CRT can be controlled by the learning error for a low-dimensional adjusting model plus the product of the learning errors for X | Z and Y | Z, which we interpret as an “almost doubly robust” property. Based on this, we develop algorithms implementing the Maxway CRT in practical scenarios, including surrogate-assisted semi-supervised learning and transfer learning. We apply our methodology to two real-world studies on electronic health record and biobank data.
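For readers unfamiliar with the baseline procedure the abstract starts from, the following is a minimal sketch of the standard model-X CRT, not the Maxway CRT proposed in the talk. It assumes that X | Z is exactly Gaussian with a known mean function and standard deviation, which is the "perfect knowledge" requirement the abstract refers to. The function name `crt_pvalue`, its parameters, the covariance-based test statistic, and the simulated data are all illustrative assumptions rather than anything taken from the speaker's work.

```python
import numpy as np


def crt_pvalue(x, y, z, cond_mean, cond_sd, n_resamples=500, seed=0):
    """Baseline model-X CRT p-value for H0: X is independent of Y given Z.

    Assumption (hypothetical setup): X | Z ~ Normal(cond_mean(z), cond_sd**2)
    is known exactly. This is the plain CRT, not the Maxway CRT.
    """
    rng = np.random.default_rng(seed)
    mu = cond_mean(z)  # known conditional mean of X given Z
    y_centered = y - y.mean()

    def statistic(x_vec):
        # Absolute sample covariance between (X - E[X | Z]) and Y.
        return abs(np.mean((x_vec - mu) * y_centered))

    t_obs = statistic(x)
    # Resample X from its conditional distribution given Z, holding Y and Z
    # fixed, and recompute the statistic on each resampled copy.
    t_null = np.array([
        statistic(mu + cond_sd * rng.standard_normal(len(x)))
        for _ in range(n_resamples)
    ])
    # Count the observed statistic among the resamples so that the p-value
    # is valid in finite samples when the model for X | Z is correct.
    return (1 + np.sum(t_null >= t_obs)) / (1 + n_resamples)


# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(1)
n, p = 300, 5
z = rng.standard_normal((n, p))
beta = np.array([1.0, -0.5, 0.0, 0.0, 0.0])
x = z @ beta + rng.standard_normal(n)          # X | Z ~ Normal(z @ beta, 1)
y_null = z[:, 0] + rng.standard_normal(n)      # Y depends on Z only
y_alt = z[:, 0] + x + rng.standard_normal(n)   # Y also depends on X

p_null = crt_pvalue(x, y_null, z, lambda zz: zz @ beta, 1.0)
p_alt = crt_pvalue(x, y_alt, z, lambda zz: zz @ beta, 1.0)
print(p_null, p_alt)
```

If `cond_mean` were replaced by an estimate with error, the resampled copies of X would no longer share the conditional distribution of the observed X, which is the loss of validity that motivates calibrating the resampling step with a learned model for Y.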
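Bio: Molei Liu is a jointly appointed researcher and assistant professor at the Peking University Health Science Center and the International Center for Mathematical Research, and was selected for the youth program of a national high-level talent plan. Liu received a Ph.D. in biostatistics from Harvard University in 2022 and was an assistant professor at Columbia University from 2022 to 2024. Liu's main research interests in statistical theory and methods include inference for high-dimensional complex data, data fusion, semi-supervised learning, transfer learning, and distributionally robust optimization, with papers published in leading journals in the field such as J. Royal. Stat. Soc: B, J. Am. Stat., Biometrika, and J. Mach. Learn. Res. Liu also collaborates extensively in applied areas such as biomedical informatics and genomics, with results published in journals including Science and npj. Digi. Med. Liu is a recipient of the Sanford Bolton Scholar Award from the Columbia University School of Public Health.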