Ensemble survival tree models to reveal pairwise interactions of variables with time-to-events outcomes in low-dimensional setting

Jean Eudes Dazard, Hemant Ishwaran, Rajeev Mehlotra, Aaron Weinberg, Peter Zimmerman

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Unraveling interactions among variables such as genetic, clinical, demographic and environmental factors is essential to understanding the development of common and complex diseases. To increase the power to detect such variable interactions associated with clinical time-to-event outcomes, we borrowed established concepts from random survival forest (RSF) models. We introduce a novel RSF-based pairwise interaction estimator and derive a randomization method with bootstrap confidence intervals for inferring interaction significance. Using various linear and nonlinear time-to-event survival models in simulation studies, we first show the efficiency of our approach: true pairwise interaction effects between variables are uncovered, even though they may not be accompanied by their corresponding main effects and may not be detected by the standard semi-parametric regression modeling and test statistics used in survival analysis. Moreover, using an RSF-based cross-validation scheme for generating prediction estimators, we show that informative predictors may be inferred. We applied our approach to an HIV cohort study recording key host gene polymorphisms and their association with HIV change of tropism or AIDS progression. Altogether, this shows how linear or nonlinear pairwise statistical interactions of variables may be efficiently detected with a predictive value in observational studies with time-to-event outcomes.
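
The abstract mentions bootstrap confidence intervals for judging whether a pairwise interaction is significant. As a rough illustration of that general idea only, the Python sketch below computes a percentile bootstrap interval for a toy interaction statistic. It is not the authors' RSF-based estimator or their randomization method: the statistic (a difference-in-differences of cell means for two binary variables), the uncensored outcome, and the names `interaction_contrast`, `bootstrap_ci` and `make_toy_data` are all assumptions made for this example.

```python
import random
from statistics import mean


def interaction_contrast(rows):
    """Toy pairwise interaction statistic (an assumption, not the paper's).

    rows: list of (x1, x2, y) with x1, x2 in {0, 1} and y an uncensored
    outcome. Returns (m11 - m10) - (m01 - m00), where mab is the mean of y
    in the cell x1=a, x2=b. A value of zero means purely additive effects.
    Returns None if any of the four cells is empty.
    """
    cells = {(a, b): [] for a in (0, 1) for b in (0, 1)}
    for x1, x2, y in rows:
        cells[(x1, x2)].append(y)
    if any(not values for values in cells.values()):
        return None
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


def bootstrap_ci(rows, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(rows)."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = []
    for _ in range(n_boot):
        # Resample observations with replacement, keeping rows intact.
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        value = stat(sample)
        if value is not None:  # skip resamples with an empty cell
            estimates.append(value)
    if not estimates:
        raise ValueError("no usable bootstrap resamples")
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper


def make_toy_data(n, effect, seed=0):
    """Simulate two binary variables with a pure interaction effect on y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, x2 = rng.randint(0, 1), rng.randint(0, 1)
        y = effect * x1 * x2 + rng.gauss(0.0, 0.5)
        rows.append((x1, x2, y))
    return rows


if __name__ == "__main__":
    data = make_toy_data(400, effect=3.0, seed=1)
    print("estimate:", interaction_contrast(data))
    print("95% CI:", bootstrap_ci(data, interaction_contrast, n_boot=500, seed=2))
```

An interval that excludes zero would suggest a non-additive joint effect in this toy setting. A real time-to-event analysis would also have to handle censoring, which this sketch ignores.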

Original language: English (US)
Article number: 20170038
Journal: Statistical Applications in Genetics and Molecular Biology
Volume: 17
Issue number: 1
DOI: 10.1515/sagmb-2017-0038
State: Published - Feb 23 2018

Fingerprint

Pairwise
Ensemble
Polymorphism
Interaction
HIV
Tropism
Genes
Statistics
Survival Analysis
Random Allocation
Observational Studies
Acquired Immunodeficiency Syndrome
Cohort Studies
Semiparametric Regression
Bootstrap Confidence Intervals
Estimator
Cohort Study
Survival Model
Observational Study
Demography

Keywords

  • epistasis
  • genetic variations interactions
  • interaction detection and modeling
  • random survival forest
  • time-to-event analysis

ASJC Scopus subject areas

  • Statistics and Probability
  • Molecular Biology
  • Genetics
  • Computational Mathematics

Cite this

Ensemble survival tree models to reveal pairwise interactions of variables with time-to-events outcomes in low-dimensional setting. / Dazard, Jean Eudes; Ishwaran, Hemant; Mehlotra, Rajeev; Weinberg, Aaron; Zimmerman, Peter.

In: Statistical Applications in Genetics and Molecular Biology, Vol. 17, No. 1, 20170038, 23.02.2018.

@article{8a2c73af355b4258b84176e40c4e186f,
title = "Ensemble survival tree models to reveal pairwise interactions of variables with time-to-events outcomes in low-dimensional setting",
abstract = "Unraveling interactions among variables such as genetic, clinical, demographic and environmental factors is essential to understand the development of common and complex diseases. To increase the power to detect such variables interactions associated with clinical time-to-events outcomes, we borrowed established concepts from random survival forest (RSF) models. We introduce a novel RSF-based pairwise interaction estimator and derive a randomization method with bootstrap confidence intervals for inferring interaction significance. Using various linear and nonlinear time-to-events survival models in simulation studies, we first show the efficiency of our approach: true pairwise interaction-effects between variables are uncovered, while they may not be accompanied with their corresponding main-effects, and may not be detected by standard semi-parametric regression modeling and test statistics used in survival analysis. Moreover, using a RSF-based cross-validation scheme for generating prediction estimators, we show that informative predictors may be inferred. We applied our approach to an HIV cohort study recording key host gene polymorphisms and their association with HIV change of tropism or AIDS progression. Altogether, this shows how linear or nonlinear pairwise statistical interactions of variables may be efficiently detected with a predictive value in observational studies with time-to-event outcomes.",
keywords = "epistasis, genetic variations interactions, interaction detection and modeling, random survival forest, time-to-event analysis",
author = "Dazard, {Jean Eudes} and Hemant Ishwaran and Rajeev Mehlotra and Aaron Weinberg and Peter Zimmerman",
year = "2018",
month = "2",
day = "23",
doi = "10.1515/sagmb-2017-0038",
language = "English (US)",
volume = "17",
journal = "Statistical Applications in Genetics and Molecular Biology",
issn = "1544-6115",
publisher = "Berkeley Electronic Press",
number = "1"
}

TY - JOUR
T1 - Ensemble survival tree models to reveal pairwise interactions of variables with time-to-events outcomes in low-dimensional setting
AU - Dazard, Jean Eudes
AU - Ishwaran, Hemant
AU - Mehlotra, Rajeev
AU - Weinberg, Aaron
AU - Zimmerman, Peter
PY - 2018/2/23
Y1 - 2018/2/23

AB - Unraveling interactions among variables such as genetic, clinical, demographic and environmental factors is essential to understand the development of common and complex diseases. To increase the power to detect such variables interactions associated with clinical time-to-events outcomes, we borrowed established concepts from random survival forest (RSF) models. We introduce a novel RSF-based pairwise interaction estimator and derive a randomization method with bootstrap confidence intervals for inferring interaction significance. Using various linear and nonlinear time-to-events survival models in simulation studies, we first show the efficiency of our approach: true pairwise interaction-effects between variables are uncovered, while they may not be accompanied with their corresponding main-effects, and may not be detected by standard semi-parametric regression modeling and test statistics used in survival analysis. Moreover, using a RSF-based cross-validation scheme for generating prediction estimators, we show that informative predictors may be inferred. We applied our approach to an HIV cohort study recording key host gene polymorphisms and their association with HIV change of tropism or AIDS progression. Altogether, this shows how linear or nonlinear pairwise statistical interactions of variables may be efficiently detected with a predictive value in observational studies with time-to-event outcomes.

KW - epistasis
KW - genetic variations interactions
KW - interaction detection and modeling
KW - random survival forest
KW - time-to-event analysis
UR - http://www.scopus.com/inward/record.url?scp=85042687354&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85042687354&partnerID=8YFLogxK
U2 - 10.1515/sagmb-2017-0038
DO - 10.1515/sagmb-2017-0038
M3 - Article
C2 - 29453930
AN - SCOPUS:85042687354
VL - 17
JO - Statistical Applications in Genetics and Molecular Biology
JF - Statistical Applications in Genetics and Molecular Biology
SN - 1544-6115
IS - 1
M1 - 20170038
ER -