Random forests for genomic data analysis

Research output: Contribution to journal › Review article

246 Scopus citations


Random forests (RF) is a popular tree-based ensemble machine learning tool that is highly data-adaptive, applies to "large p, small n" problems, and can account for correlation as well as interactions among features. This makes RF particularly appealing for high-dimensional genomic data analysis. In this article, we systematically review the applications of and recent progress in RF for genomic data, including prediction and classification, variable selection, pathway analysis, genetic association and epistasis detection, and unsupervised learning.

Original language: English (US)
Pages (from-to): 323-329
Number of pages: 7
Issue number: 6
State: Published - Jun 1 2012


Keywords

  • Classification
  • Genomic data analysis
  • Prediction
  • Random forests
  • Random survival forests
  • Variable selection

ASJC Scopus subject areas

  • Genetics
