Efficient designs and analysis of two-phase studies with longitudinal binary data
Posted: 2024-05-09
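  • Speaker: Ran Tao (Vanderbilt University)
  • Time: May 28, 2024, 15:30 (Beijing time)
  • Venue: Lecture Hall 1417, Administration Building, Zijingang Campus, Zhejiang University
  • Host: Center for Data Science, Zhejiang University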

Abstract:

Researchers interested in understanding the relationship between a readily available longitudinal binary outcome and a novel biomarker exposure can be confronted with ascertainment costs that limit sample size. In such settings, two-phase studies can be cost-effective solutions that allow researchers to target informative individuals for exposure ascertainment and increase estimation precision for time-varying and/or time-fixed exposure coefficients. In this paper, we introduce a novel class of residual-dependent sampling (RDS) designs that select informative individuals using data available on the longitudinal outcome and inexpensive covariates. Together with the RDS designs, we propose a semiparametric analysis approach that efficiently uses all data to estimate the parameters. We describe a numerically stable and computationally efficient EM algorithm to maximize the semiparametric likelihood. We examine the finite sample operating characteristics of the proposed approaches through extensive simulation studies, and compare the efficiency of our designs and analysis approach with existing ones. We illustrate the usefulness of the proposed RDS designs and analysis method in practice by studying the association between a genetic marker and poor lung function among patients enrolled in the Lung Health Study.



About the speaker:

Dr. Ran Tao is an associate professor in the Department of Biostatistics and Vanderbilt Genetics Institute at Vanderbilt University Medical Center. He received his BS degree in mathematics in 2010 from Tsinghua University, and then received his PhD degree in biostatistics in 2016 from the University of North Carolina at Chapel Hill. He is interested in developing novel statistical methods to solve problems arising in the design and analysis of modern biomedical and public health studies, including genome-wide association studies, next-generation sequencing studies, and electronic health records systems. His current research topics include two-phase designs, missing data, measurement error, trans-ethnic genetic association analysis, and genetic risk prediction. His research has led to over 90 publications in high-profile statistical journals such as the Journal of the American Statistical Association and Biometrics, and in scientific journals such as Nature, Nature Neuroscience, JAMA Internal Medicine, the American Journal of Human Genetics, and Nature Communications.