Random forests for genomic data analysis

Research output: Contribution to journal > Article

210 Citations (Scopus)

Abstract

Random forests (RF) is a popular tree-based ensemble machine learning tool that is highly data adaptive, applies to "large p, small n" problems, and is able to account for correlation as well as interactions among features. This makes RF particularly appealing for high-dimensional genomic data analysis. In this article, we systematically review the applications and recent progresses of RF for genomic data, including prediction and classification, variable selection, pathway analysis, genetic association and epistasis detection, and unsupervised learning.

Original language: English
Pages (from-to): 323-329
Number of pages: 7
Journal: Genomics
Volume: 99
Issue number: 6
DOI: 10.1016/j.ygeno.2012.04.003
State: Published - Jun 1 2012

Fingerprint

  • Genetic Epistasis
  • Learning
  • Forests
  • Machine Learning

Keywords

  • Classification
  • Genomic data analysis
  • Prediction
  • Random forests
  • Random survival forests
  • Variable selection

ASJC Scopus subject areas

  • Genetics

Cite this

Random forests for genomic data analysis. / Chen, Xi; Ishwaran, Hemant.

In: Genomics, Vol. 99, No. 6, 01.06.2012, p. 323-329.

@article{2c5aaff590764304bc3f3e5440c40672,
title = "Random forests for genomic data analysis",
abstract = "Random forests (RF) is a popular tree-based ensemble machine learning tool that is highly data adaptive, applies to {"}large p, small n{"} problems, and is able to account for correlation as well as interactions among features. This makes RF particularly appealing for high-dimensional genomic data analysis. In this article, we systematically review the applications and recent progresses of RF for genomic data, including prediction and classification, variable selection, pathway analysis, genetic association and epistasis detection, and unsupervised learning.",
keywords = "Classification, Genomic data analysis, Prediction, Random forests, Random survival forests, Variable selection",
author = "Xi Chen and Hemant Ishwaran",
year = "2012",
month = "6",
day = "1",
doi = "10.1016/j.ygeno.2012.04.003",
language = "English",
volume = "99",
pages = "323--329",
journal = "Genomics",
issn = "0888-7543",
publisher = "Academic Press Inc.",
number = "6",
}

TY  - JOUR
T1  - Random forests for genomic data analysis
AU  - Chen, Xi
AU  - Ishwaran, Hemant
PY  - 2012/6/1
Y1  - 2012/6/1
N2  - Random forests (RF) is a popular tree-based ensemble machine learning tool that is highly data adaptive, applies to "large p, small n" problems, and is able to account for correlation as well as interactions among features. This makes RF particularly appealing for high-dimensional genomic data analysis. In this article, we systematically review the applications and recent progresses of RF for genomic data, including prediction and classification, variable selection, pathway analysis, genetic association and epistasis detection, and unsupervised learning.
AB  - Random forests (RF) is a popular tree-based ensemble machine learning tool that is highly data adaptive, applies to "large p, small n" problems, and is able to account for correlation as well as interactions among features. This makes RF particularly appealing for high-dimensional genomic data analysis. In this article, we systematically review the applications and recent progresses of RF for genomic data, including prediction and classification, variable selection, pathway analysis, genetic association and epistasis detection, and unsupervised learning.
KW  - Classification
KW  - Genomic data analysis
KW  - Prediction
KW  - Random forests
KW  - Random survival forests
KW  - Variable selection
UR  - http://www.scopus.com/inward/record.url?scp=84861730860&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84861730860&partnerID=8YFLogxK
U2  - 10.1016/j.ygeno.2012.04.003
DO  - 10.1016/j.ygeno.2012.04.003
M3  - Article
C2  - 22546560
AN  - SCOPUS:84861730860
VL  - 99
SP  - 323
EP  - 329
JO  - Genomics
JF  - Genomics
SN  - 0888-7543
IS  - 6
ER  - 