An efficient hierarchical generalized linear mixed model for pathway analysis of genome-wide association studies

Lily Wang, Peilin Jia, Russell D. Wolfinger, Xi Chen, Britney L. Grayson, Thomas M. Aune, Zhongming Zhao

Research output: Contribution to journal, Article

26 Citations (Scopus)

Abstract

Motivation: In genome-wide association studies (GWAS) of complex diseases, genetic variants with real but weak associations often fail to reach the stringent genome-wide significance level. Pathway analysis, which tests disease association using the combined association signals from a group of variants in the same pathway, has become increasingly popular. However, because of the complexities of genetic data and the large sample sizes in typical GWAS, pathway analysis remains challenging. We propose a new statistical model for pathway analysis of GWAS. This model includes a fixed effects component that models the mean disease association for a group of genes and a random effects component that models how each gene's association with disease varies about the gene group mean; it thus belongs to the class of mixed effects models.

Results: The proposed model is computationally efficient and uses only summary statistics. In addition, it corrects for the presence of overlapping genes and linkage disequilibrium (LD). Using simulated and real GWAS data, we showed that our model improved power over currently available pathway analysis methods while preserving the type I error rate. Furthermore, using the WTCCC Type 1 Diabetes (T1D) dataset, we demonstrated that mixed model analysis identified meaningful biological processes that agreed well with previous reports on T1D. Therefore, the proposed methodology provides an efficient statistical modeling framework for systems analysis of GWAS.
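
As a rough illustration of the fixed-plus-random structure described in the abstract, a generic random-intercept model can be written as below. This is a textbook form under assumed notation, not the model specification from the paper: the symbols $y_{ij}$ (an association score for variant $j$ in gene $i$ of the gene group), $\mu$, $g_i$, $\varepsilon_{ij}$, $\sigma_g^2$ and $\sigma_e^2$ are all introduced here purely for illustration.

```latex
% Generic random-intercept sketch (assumed notation, not taken from the paper)
y_{ij} = \mu + g_i + \varepsilon_{ij}, \qquad
g_i \sim N(0, \sigma_g^2), \qquad
\varepsilon_{ij} \sim N(0, \sigma_e^2)
```

In this sketch, $\mu$ plays the role of the fixed effect (the mean association for the gene group) and $g_i$ the role of the random effect (the deviation of gene $i$ from that mean). A generalized linear mixed model would place the same linear predictor, $\mu + g_i$, inside a link function rather than assuming a Gaussian response.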

Original language: English (US)
Article number: btq728
Pages (from-to): 686-692
Number of pages: 7
Journal: Bioinformatics
Volume: 27
Issue number: 5
DOI: 10.1093/bioinformatics/btq728
State: Published - Mar 1 2011
Externally published: Yes

Fingerprint

Generalized Linear Mixed Model
Genome-Wide Association Study
Linear Models
Pathway
Genome
Genes
Type 1 Diabetes Mellitus
Overlapping Genes
Gene
Biological Phenomena
Inborn Genetic Diseases
Diabetes
Linkage Disequilibrium
Statistical Models
Component Model
Medical problems
Systems Analysis
Sample Size
Fixed Effects Model
Mixed Effects Model

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

An efficient hierarchical generalized linear mixed model for pathway analysis of genome-wide association studies. / Wang, Lily; Jia, Peilin; Wolfinger, Russell D.; Chen, Xi; Grayson, Britney L.; Aune, Thomas M.; Zhao, Zhongming.

In: Bioinformatics, Vol. 27, No. 5, btq728, 01.03.2011, p. 686-692.

Wang, Lily ; Jia, Peilin ; Wolfinger, Russell D. ; Chen, Xi ; Grayson, Britney L. ; Aune, Thomas M. ; Zhao, Zhongming. / An efficient hierarchical generalized linear mixed model for pathway analysis of genome-wide association studies. In: Bioinformatics. 2011 ; Vol. 27, No. 5. pp. 686-692.
@article{43800a1bfa9d4366809c0ed52cb134a3,
title = "An efficient hierarchical generalized linear mixed model for pathway analysis of genome-wide association studies",
author = "Lily Wang and Peilin Jia and Wolfinger, {Russell D.} and Xi Chen and Grayson, {Britney L.} and Aune, {Thomas M.} and Zhongming Zhao",
year = "2011",
month = "3",
day = "1",
doi = "10.1093/bioinformatics/btq728",
language = "English (US)",
volume = "27",
pages = "686--692",
journal = "Bioinformatics (Oxford, England)",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "5",

}

TY - JOUR

T1 - An efficient hierarchical generalized linear mixed model for pathway analysis of genome-wide association studies

AU - Wang, Lily

AU - Jia, Peilin

AU - Wolfinger, Russell D.

AU - Chen, Xi

AU - Grayson, Britney L.

AU - Aune, Thomas M.

AU - Zhao, Zhongming

PY - 2011/3/1

Y1 - 2011/3/1

UR - http://www.scopus.com/inward/record.url?scp=79951966907&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79951966907&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btq728

DO - 10.1093/bioinformatics/btq728

M3 - Article

C2 - 21266443

AN - SCOPUS:79951966907

VL - 27

SP - 686

EP - 692

JO - Bioinformatics (Oxford, England)

JF - Bioinformatics (Oxford, England)

SN - 1367-4803

IS - 5

M1 - btq728

ER -