Robust meta‐analysis of biobank‐based genome‐wide association studies with unbalanced binary phenotypes

Rounak Dey, University of Michigan
Jonas B. Nielsen, Statens Serum Institut
Lars G. Fritsche, University of Michigan
Wei Zhou, University of Michigan
Huanhuan Zhu, Michigan Technological University
Cristen J. Willer, University of Michigan
Seunggeun Lee, University of Michigan

© 2019 Wiley Periodicals, Inc. Publisher's version of record:


With the availability of large‐scale biobanks, genome‐wide scale phenome‐wide association studies have become instrumental in discovering novel genetic variants associated with clinical phenotypes. As an increasing number of such association results from different biobanks become available, methods to meta‐analyse those results are of great interest. Because the binary phenotypes in biobank‐based studies are mostly unbalanced in their case–control ratios, very few methods can provide well‐calibrated tests for associations. For example, traditional Z‐score‐based meta‐analysis often results in conservative or anticonservative Type I error rates in such unbalanced scenarios. We propose two meta‐analysis strategies that can efficiently combine association results from biobank‐based studies with such unbalanced phenotypes, using the saddlepoint approximation‐based score test method. Our first method involves sharing the overall genotype counts from each study, and the second method involves sharing an approximation of the distribution of the score test statistic from each study using cubic Hermite splines. We compare our proposed methods with a traditional Z‐score‐based meta‐analysis strategy using numerical simulations and real data applications, and demonstrate the superior performance of our proposed methods in terms of Type I error control.
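
For orientation, the traditional Z‐score‐based baseline mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the common sample‐size‐weighted (Stouffer) form with weights equal to the square root of each study's sample size, which the abstract does not specify. The function name `weighted_z_meta` and its arguments are hypothetical. The two‐sided p‐value relies on a normal approximation, which is the step that can be miscalibrated when case–control ratios are highly unbalanced.

```python
from math import sqrt
from statistics import NormalDist


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-study Z-scores with sqrt(sample size) weights.

    Assumption: weights w_k = sqrt(n_k); other weighting schemes
    (e.g. effective sample size) are also used in practice.
    Returns the combined Z-score and a two-sided normal p-value.
    """
    if not z_scores or len(z_scores) != len(sample_sizes):
        raise ValueError("need one sample size per Z-score, and at least one study")
    if any(n <= 0 for n in sample_sizes):
        raise ValueError("sample sizes must be positive")

    weights = [sqrt(n) for n in sample_sizes]
    # Z_meta = sum(w_k * z_k) / sqrt(sum(w_k^2))
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    z_meta = numerator / denominator

    # Two-sided p-value under the standard normal approximation.
    p_value = 2.0 * NormalDist().cdf(-abs(z_meta))
    return z_meta, p_value


if __name__ == "__main__":
    # Hypothetical Z-scores and sample sizes for one variant in three studies.
    z, p = weighted_z_meta([1.8, 2.4, -0.3], [5000, 20000, 1200])
    print(f"Z_meta = {z:.3f}, p = {p:.3g}")
```

The proposed methods replace this normal approximation with a saddlepoint approximation of the score test statistic's distribution, which is outside the scope of this sketch.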