CAPL: A novel association test using case-control and family data and accounting for population stratification

Ren Hua Chung, Mike Schmidt, Richard W. Morris, Eden R Martin

Research output: Contribution to journal (Article)

15 Citations (Scopus)

Abstract

The recent successes of genome-wide association studies (GWAS) based on large sample sizes motivate combining independent datasets to obtain larger sample sizes and thereby increase statistical power. Analysis methods that can accommodate different study designs, such as family-based and case-control designs, are of general interest. However, population stratification can cause spurious association in population-based association analyses. In family-based association analyses that infer missing parental genotypes from allele frequencies estimated in the entire sample, the parental mating-type probabilities may not be correctly estimated in the presence of population stratification. Therefore, any approach to combining family and case-control data should also properly account for population stratification. Although several methods have been proposed to accommodate family-based and case-control data, all have restrictions. Most of them require sampling from a homogeneous population, which may not be a reasonable assumption for data from a large consortium. One of the methods, FamCC, can account for population stratification and uses nuclear families with an arbitrary number of siblings, but it requires parental genotype data, which are often unavailable for late-onset diseases. We extended the family-based test, Association in the Presence of Linkage (APL), to combine family and case-control data (CAPL). CAPL can accommodate case-control data and families with multiple affected siblings and missing parents in the presence of population stratification. We used simulations to demonstrate that CAPL is a valid test both in a homogeneous population and in the presence of population stratification. We also showed that CAPL can have more power than other methods that combine family and case-control data.
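The abstract's point that population stratification can cause spurious association in population-based analyses can be illustrated with a small worked example. The sketch below is not CAPL or any other method from the paper: it applies a plain Pearson chi-square test to allele counts, and the two subpopulations, their sample sizes, and their allele frequencies are made-up numbers chosen only for illustration.

```python
from fractions import Fraction


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def allele_table(n_cases, n_controls, risk_freq):
    """Expected allele counts when cases and controls share one allele frequency.

    Returns (case_risk, case_other, control_risk, control_other). Each person
    carries two alleles, and the allele is unrelated to disease status.
    """
    case_alleles, control_alleles = 2 * n_cases, 2 * n_controls
    return (
        case_alleles * risk_freq,
        case_alleles * (1 - risk_freq),
        control_alleles * risk_freq,
        control_alleles * (1 - risk_freq),
    )


# Hypothetical subpopulations (assumed values, not from the paper):
# A contributes mostly cases and has a low allele frequency,
# B contributes mostly controls and has a high allele frequency.
table_a = allele_table(n_cases=800, n_controls=200, risk_freq=Fraction(1, 5))
table_b = allele_table(n_cases=200, n_controls=800, risk_freq=Fraction(3, 5))

# Ignoring ancestry means adding the two tables cell by cell.
pooled = tuple(x + y for x, y in zip(table_a, table_b))

chi_a = chi_square_2x2(*table_a)
chi_b = chi_square_2x2(*table_b)
chi_pooled = chi_square_2x2(*pooled)

print(float(chi_a), float(chi_b), float(chi_pooled))
```

Within each subpopulation the statistic is 0, yet the pooled table gives 240, far above the 5% critical value of 3.84 for one degree of freedom, even though the allele has no effect on disease in either group. This is the kind of false signal a combined family and case-control test needs to guard against.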

Original language: English
Pages (from-to): 747-755
Number of pages: 9
Journal: Genetic Epidemiology
Volume: 34
Issue number: 7
DOI: 10.1002/gepi.20539
State: Published - Nov 1 2010

Fingerprint

Population
Sample Size
Siblings
Genotype
Genome-Wide Association Study
Nuclear Family
Gene Frequency
Parents

Keywords

  • Association analysis
  • Linkage disequilibrium
  • Population stratification

ASJC Scopus subject areas

  • Genetics (clinical)
  • Epidemiology

Cite this

CAPL: A novel association test using case-control and family data and accounting for population stratification. / Chung, Ren Hua; Schmidt, Mike; Morris, Richard W.; Martin, Eden R.

In: Genetic Epidemiology, Vol. 34, No. 7, 01.11.2010, p. 747-755.

@article{89b5092ece244963b3c01cb20e8173fe,
title = "CAPL: A novel association test using case-control and family data and accounting for population stratification",
abstract = "The recent successes of GWAS based on large sample sizes motivate combining independent datasets to obtain larger sample sizes and thereby increase statistical power. Analysis methods that can accommodate different study designs, such as family-based and case-control designs, are of general interest. However, population stratification can cause spurious association for population-based association analyses. For family-based association analysis that infers missing parental genotypes based on the allele frequencies estimated in the entire sample, the parental mating-type probabilities may not be correctly estimated in the presence of population stratification. Therefore, any approach to combining family and case-control data should also properly account for population stratification. Although several methods have been proposed to accommodate family-based and case-control data, all have restrictions. Most of them require sampling a homogeneous population, which may not be a reasonable assumption for data from a large consortium. One of the methods, FamCC, can account for population stratification and uses nuclear families with arbitrary number of siblings but requires parental genotype data, which are often unavailable for late-onset diseases. We extended the family-based test, Association in the Presence of Linkage (APL), to combine family and case-control data (CAPL). CAPL can accommodate case-control data and families with multiple affected siblings and missing parents in the presence of population stratification. We used simulations to demonstrate that CAPL is a valid test either in a homogeneous population or in the presence of population stratification. We also showed that CAPL can have more power than other methods that combine family and case-control data.",
keywords = "Association analysis, Linkage disequilibrium, Population stratification",
author = "Chung, {Ren Hua} and Schmidt, Mike and Morris, {Richard W.} and Martin, {Eden R}",
year = "2010",
month = "11",
day = "1",
doi = "10.1002/gepi.20539",
language = "English",
volume = "34",
pages = "747--755",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "7",

}

TY - JOUR

T1 - CAPL: A novel association test using case-control and family data and accounting for population stratification

AU - Chung, Ren Hua

AU - Schmidt, Mike

AU - Morris, Richard W.

AU - Martin, Eden R

PY - 2010/11/1

Y1 - 2010/11/1

N2 - The recent successes of GWAS based on large sample sizes motivate combining independent datasets to obtain larger sample sizes and thereby increase statistical power. Analysis methods that can accommodate different study designs, such as family-based and case-control designs, are of general interest. However, population stratification can cause spurious association for population-based association analyses. For family-based association analysis that infers missing parental genotypes based on the allele frequencies estimated in the entire sample, the parental mating-type probabilities may not be correctly estimated in the presence of population stratification. Therefore, any approach to combining family and case-control data should also properly account for population stratification. Although several methods have been proposed to accommodate family-based and case-control data, all have restrictions. Most of them require sampling a homogeneous population, which may not be a reasonable assumption for data from a large consortium. One of the methods, FamCC, can account for population stratification and uses nuclear families with arbitrary number of siblings but requires parental genotype data, which are often unavailable for late-onset diseases. We extended the family-based test, Association in the Presence of Linkage (APL), to combine family and case-control data (CAPL). CAPL can accommodate case-control data and families with multiple affected siblings and missing parents in the presence of population stratification. We used simulations to demonstrate that CAPL is a valid test either in a homogeneous population or in the presence of population stratification. We also showed that CAPL can have more power than other methods that combine family and case-control data.

KW - Association analysis

KW - Linkage disequilibrium

KW - Population stratification

UR - http://www.scopus.com/inward/record.url?scp=77958598598&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77958598598&partnerID=8YFLogxK

U2 - 10.1002/gepi.20539

DO - 10.1002/gepi.20539

M3 - Article

C2 - 20878716

AN - SCOPUS:77958598598

VL - 34

SP - 747

EP - 755

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

IS - 7

ER -