Random forests for genomic data analysis

Research output: Contribution to journal › Review article › peer-review

304 Scopus citations


Random forests (RF) is a popular tree-based ensemble machine learning tool that is highly data adaptive, applies to "large p, small n" problems, and can account for correlation as well as interactions among features. This makes RF particularly appealing for high-dimensional genomic data analysis. In this article, we systematically review the applications of and recent progress in RF for genomic data, including prediction and classification, variable selection, pathway analysis, genetic association and epistasis detection, and unsupervised learning.
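To make the "large p, small n" setting concrete, the following is a minimal sketch of a random forest classifier in pure Python. It is not taken from the article: the synthetic data, the function names (`fit_forest`, `oob_error`, `split_counts`), the depth limit, and the split-frequency tally used as a crude stand-in for variable importance are all assumptions made for illustration. Each tree is grown on a bootstrap sample and considers a random subset of features at each split, and accuracy is estimated from the out-of-bag samples.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def best_split(X, y, rows, features):
    """Return (score, feature, threshold) minimizing weighted Gini, or None."""
    best = None
    for f in features:
        values = sorted({X[i][f] for i in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def grow(X, y, rows, mtry, rng, depth=0, max_depth=3):
    """Grow one tree. A leaf is a class label; a node is (feature, threshold, left, right)."""
    labels = [y[i] for i in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    # Random feature subset at every split: the "random" in random forests.
    split = best_split(X, y, rows, rng.sample(range(len(X[0])), mtry))
    if split is None:
        return majority
    _, f, t = split
    left = [i for i in rows if X[i][f] <= t]
    right = [i for i in rows if X[i][f] > t]
    if not left or not right:
        return majority
    return (f, t,
            grow(X, y, left, mtry, rng, depth + 1, max_depth),
            grow(X, y, right, mtry, rng, depth + 1, max_depth))

def predict_tree(tree, x):
    """Follow splits until a leaf (a class label) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def fit_forest(X, y, n_trees=50, mtry=None, seed=0):
    """Fit n_trees trees, each on a bootstrap sample; keep each tree's out-of-bag rows."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    mtry = mtry or max(1, int(p ** 0.5))  # common default for classification
    forest = []
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]
        forest.append((grow(X, y, boot, mtry, rng), set(range(n)) - set(boot)))
    return forest

def oob_error(forest, X, y):
    """Misclassification rate using, for each sample, only trees that did not see it."""
    wrong = counted = 0
    for i in range(len(X)):
        votes = Counter(predict_tree(tree, X[i]) for tree, oob in forest if i in oob)
        if votes:
            counted += 1
            wrong += votes.most_common(1)[0][0] != y[i]
    return wrong / counted if counted else float("nan")

def split_counts(forest):
    """Count how often each feature is used for a split (a crude importance proxy)."""
    counts, stack = Counter(), [tree for tree, _ in forest]
    while stack:
        node = stack.pop()
        if isinstance(node, tuple):
            counts[node[0]] += 1
            stack.extend(node[2:])
    return counts

# Synthetic "large p, small n" data (hypothetical): 30 samples, 100 features,
# with only features 0 and 1 determining the class label.
data_rng = random.Random(1)
n, p = 30, 100
X = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [1 if row[0] + row[1] > 0 else 0 for row in X]

forest = fit_forest(X, y, n_trees=50, seed=2)
err = oob_error(forest, X, y)
top = split_counts(forest).most_common(5)
print("OOB error:", err)
print("Most frequently split features:", top)
```

Split frequency is used here only because it is short to write down; the permutation-based and impurity-based importance measures usually associated with RF are more informative, and a maintained library implementation would be the practical choice for real genomic data.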

Original language: English (US)
Pages (from-to): 323-329
Number of pages: 7
Issue number: 6
State: Published - Jun 2012


Keywords

  • Classification
  • Genomic data analysis
  • Prediction
  • Random forests
  • Random survival forests
  • Variable selection

ASJC Scopus subject areas

  • Genetics

