Powerful, scalable and resource-efficient meta-analysis of rare variant associations in large whole genome sequencing studies
Posted: 2024-04-25
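  • Speaker: Prof. Zilin Li (李子林), School of Mathematics and Statistics, Northeast Normal University
  • Time: May 7, 2024, 15:30 (Beijing Time)
  • Venue: Lecture Hall 1417, Administration Building, Zijingang Campus, Zhejiang University
  • Host: Center for Data Science, Zhejiang University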

Abstract: Large-scale whole genome/exome sequencing (WGS/WES) studies have enabled the analysis of rare variants (RVs) associated with complex human traits. Existing RV meta-analysis approaches are not scalable when applied to WGS/WES data. We propose MetaSTAAR, a powerful and resource-efficient RV meta-analysis framework for large-scale WGS association studies. MetaSTAAR accounts for population structure and relatedness for both continuous and dichotomous traits. By storing the linkage disequilibrium (LD) information of RVs in a new sparse matrix format, the proposed framework is highly storage-efficient and computationally scalable for analyzing large-scale WGS/WES data without information loss. Furthermore, MetaSTAAR dynamically incorporates multiple functional annotations to empower RV association analysis, and enables conditional analyses to identify RV-set signals independent of nearby common variants. We applied MetaSTAAR to identify RV-sets associated with four quantitative lipid traits in 30,138 related samples from the NHLBI TOPMed Program Freeze 5 data, consisting of 14 ancestrally diverse studies and 255 million variants in total, as well as the UK Biobank WES data of ~200,000 related samples. MetaSTAAR requires at least 100 times less storage than existing methods and is computationally faster. Compared with the joint analysis of pooled individual-level data using STAAR, the P-values from MetaSTAAR and STAAR are highly concordant, with correlation > 0.99 among significant RV-sets. Additionally, MetaSTAAR identified and replicated several conditionally significant RV associations with blood lipid levels.



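Short Bio: Zilin Li (李子林) is a professor in the School of Mathematics and Statistics at Northeast Normal University. Li previously served as an assistant professor in the Department of Biostatistics and Health Data Science at the Indiana University School of Medicine, and as a postdoctoral fellow, research associate, and research scientist in the Department of Biostatistics at Harvard University. Li received both bachelor's and doctoral degrees from the Department of Mathematical Sciences at Tsinghua University, and studied under Xihong Lin (林希虹), a member of both the U.S. National Academy of Sciences and the National Academy of Medicine. In 2023, Li was elected as an Elected Member of the International Statistical Institute and received the "Most Promising" prize of the Alibaba DAMO Academy Young Fellow Award (青橙奖). Li's main research interests are statistical methods and theory for high-dimensional data, and statistical genetics. The resulting work has been published, with Li as first or corresponding author, in international journals including Nature Methods, Nature Genetics, and JASA.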