Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale

Authors

Xihao Li, Harvard T.H. Chan School of Public Health
Zilin Li, Harvard T.H. Chan School of Public Health
Hufeng Zhou, Harvard T.H. Chan School of Public Health
Sheila M. Gaynor, Harvard T.H. Chan School of Public Health
Yaowu Liu, Southwestern University of Finance and Economics
Han Chen, University of Texas School of Public Health
Ryan Sun, University of Texas MD Anderson Cancer Center
Rounak Dey, Harvard T.H. Chan School of Public Health
Donna K. Arnett, University of Kentucky
Stella Aslibekyan, The University of Alabama at Birmingham
Christie M. Ballantyne, Baylor College of Medicine
Lawrence F. Bielak, University of Michigan, Ann Arbor
John Blangero, University of Texas Rio Grande Valley
Eric Boerwinkle, University of Texas School of Public Health
Donald W. Bowden, Wake Forest School of Medicine
Jai G. Broome, University of Washington, Seattle
Matthew P. Conomos, University of Washington, Seattle
Adolfo Correa, University of Mississippi Medical Center
L. Adrienne Cupples, School of Public Health
Joanne E. Curran, University of Texas Rio Grande Valley
Barry I. Freedman, Wake Forest School of Medicine
Xiuqing Guo, Harbor-UCLA Medical Center
George Hindy, College of Medicine
Marguerite R. Irvin, The University of Alabama at Birmingham
Sharon L.R. Kardia, University of Michigan, Ann Arbor

Document Type

Journal Article

Publication Date

9-1-2020

Journal

Nature Genetics

Volume

52

Issue

9

DOI

10.1038/s41588-020-0676-4

Abstract

© 2020, The Author(s), under exclusive licence to Springer Nature America, Inc.

Large-scale whole-genome sequencing studies have enabled the analysis of rare variants (RVs) associated with complex phenotypes. Commonly used RV association tests have limited scope to leverage variant functions. We propose STAAR (variant-set test for association using annotation information), a scalable and powerful RV association test method that effectively incorporates both variant categories and multiple complementary annotations using a dynamic weighting scheme. For the latter, we introduce ‘annotation principal components’, multidimensional summaries of in silico variant annotations. STAAR accounts for population structure and relatedness and is scalable for analyzing very large cohort and biobank whole-genome sequencing studies of continuous and dichotomous traits. We applied STAAR to identify RVs associated with four lipid traits in 12,316 discovery and 17,822 replication samples from the Trans-Omics for Precision Medicine Program. We discovered and replicated new RV associations, including disruptive missense RVs of NPC1L1 and an intergenic region near APOC1P1 associated with low-density lipoprotein cholesterol.
