A data-driven weighting scheme for family-based genome-wide association studies

Document Type


Publication Date



Recently, Steen et al proposed a novel two-stage approach for family-based genome-wide association studies. In the first stage, a test based on between-family information is used to rank SNPs according to their P-values or conditional power of the test. In the second stage, the R most promising SNPs are tested using a family-based association test. We call this two-stage approach top R method. Ionita-Laza et al proposed an exponential weighting method within a two-stage framework. In the second stage of this approach, instead of testing top R SNPs, it tests all SNPs and weights the P-values of association test according to the information of the first stage. However, both of the top R and exponential weighting methods only use the information from the first stage to rank SNPs. It seems that the two methods do not use information from the first stage efficiently. Furthermore, it may be unreasonable for the exponential weighting method to use the same weight for all SNPs within a group when only one or a few SNPs are related with a disease. In this article, we propose a data-driven weighting scheme within a two-stage framework. In this method, we use the information from the first stage to determine a SNP-specific weight for each SNP. We use simulation studies to evaluate the performance of our method. The simulation results showed that our proposed method is consistently more powerful than the top R method and the exponential weighting method, regardless of the LD structure, population structure, and family structure. © 2010 Macmillan Publishers Limited All rights reserved.

Publication Title

European Journal of Human Genetics