Pathway-based analysis for genome-wide association studies using supervised principal components

Xi Chen, Lily Wang, Bo Hu, Mingsheng Guo, John Barnard, Xiaofeng Zhu

Research output: Contribution to journal (Article)

36 Scopus citations

Abstract

Many complex diseases are influenced by genetic variations in multiple genes, each with only a small marginal effect on disease susceptibility. Pathway analysis, which identifies biological pathways associated with disease outcome, has become increasingly popular for genome-wide association studies (GWAS). In addition to combining weak signals from a number of SNPs in the same pathway, results from pathway analysis also shed light on the biological processes underlying disease. We propose a new pathway-based analysis method for GWAS, the supervised principal component analysis (SPCA) model. In the proposed SPCA model, a selected subset of SNPs most associated with disease outcome is used to estimate the latent variable for a pathway. The estimated latent variable for each pathway is an optimal linear combination of a selected subset of SNPs; therefore, the proposed SPCA model provides the ability to borrow strength across the SNPs in a pathway. In addition to identifying pathways associated with disease outcome, SPCA also carries out additional within-category selection to identify the most important SNPs within each gene set. The proposed model operates in a well-established statistical framework and can handle design information such as covariate adjustment and matching information in GWAS. We compare the proposed method with currently available methods using data with realistic linkage disequilibrium structures, and we illustrate the SPCA method using the Wellcome Trust Case-Control Consortium Crohn Disease (CD) data set.
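The selection and latent-variable steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `spca_pathway_score`, the use of absolute Pearson correlation as the marginal association score, the fixed `threshold` value, and the simulated quantitative outcome are all hypothetical choices made for this example. The abstract does not say how the SNP subset is chosen or how pathway significance is calibrated, so neither is attempted here.

```python
import numpy as np


def spca_pathway_score(genotypes, outcome, threshold):
    """Hypothetical helper: first principal component of the SNPs in one
    pathway whose marginal association with the outcome exceeds `threshold`.

    genotypes : (n_samples, n_snps) array of minor-allele counts (0, 1, 2)
    outcome   : (n_samples,) array of phenotype values
    Returns (selected SNP indices, latent variable per sample, SNP weights).
    """
    X = np.asarray(genotypes, dtype=float)
    y = np.asarray(outcome, dtype=float)
    n, p = X.shape

    # Center each SNP column and the outcome.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    # Assumed association score: absolute Pearson correlation per SNP.
    # Monomorphic SNPs (zero variance) keep a score of 0.
    sd_x = Xc.std(axis=0)
    sd_y = yc.std()
    scores = np.zeros(p)
    ok = sd_x > 0
    if sd_y > 0:
        scores[ok] = np.abs(Xc[:, ok].T @ yc) / (n * sd_x[ok] * sd_y)

    # Keep only the SNPs most associated with the outcome.
    selected = np.flatnonzero(scores > threshold)
    if selected.size == 0:
        return selected, np.zeros(n), np.zeros(0)

    # First principal component of the selected SNPs via SVD.
    # The latent variable is a linear combination of the selected SNPs,
    # with weights given by the first right singular vector.
    U, s, Vt = np.linalg.svd(Xc[:, selected], full_matrices=False)
    return selected, U[:, 0] * s[0], Vt[0]


# Simulated pathway: 20 SNPs, of which the first 3 affect the outcome.
rng = np.random.default_rng(0)
genotypes = rng.binomial(2, 0.3, size=(500, 20))
outcome = genotypes[:, :3].sum(axis=1) + rng.normal(0, 0.5, size=500)

selected, latent, weights = spca_pathway_score(genotypes, outcome, threshold=0.3)
print("selected SNPs:", selected)
print("SNP weights:", weights)
```

Because the SNPs are selected using the outcome itself, a naive regression of the outcome on `latent` would give an over-optimistic p-value. Any real analysis needs a test that accounts for the selection step.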

Original language: English (US)
Pages (from-to): 716-724
Number of pages: 9
Journal: Genetic Epidemiology
Volume: 34
Issue number: 7
State: Published - Nov 1 2010
Externally published: Yes

Keywords

  • Genome-wide association
  • Pathway analysis
  • Principal component analysis
  • SNPs

ASJC Scopus subject areas

  • Epidemiology
  • Genetics (clinical)