Random forests for genomic data analysis

Research output: Contribution to journal (Review article)

246 Scopus citations

Abstract

Random forests (RF) is a popular tree-based ensemble machine learning tool that is highly data adaptive, applies to "large p, small n" problems, and is able to account for correlation as well as interactions among features. This makes RF particularly appealing for high-dimensional genomic data analysis. In this article, we systematically review the applications and recent progress of RF for genomic data, including prediction and classification, variable selection, pathway analysis, genetic association and epistasis detection, and unsupervised learning.

Original language: English (US)
Pages (from-to): 323-329
Number of pages: 7
Journal: Genomics
Volume: 99
Issue number: 6
State: Published - Jun 1 2012

Keywords

  • Classification
  • Genomic data analysis
  • Prediction
  • Random forests
  • Random survival forests
  • Variable selection

ASJC Scopus subject areas

  • Genetics
