Random survival forests for high-dimensional data

Hemant Ishwaran, Udaya B. Kogalur, Xi Chen, Andy J. Minn

Research output: Contribution to journal (Article)

48 Citations (Scopus)

Abstract

Minimal depth is a dimensionless order statistic that measures the predictiveness of a variable in a survival tree. It can be used to select variables in high-dimensional problems using Random Survival Forests (RSF), a new extension of Breiman's Random Forests (RF) to survival settings. We review this methodology and demonstrate its use in high-dimensional survival problems using a public domain R-language package randomSurvivalForest. We discuss effective ways to regularize forests and discuss how to properly tune the RF parameters 'nodesize' and 'mtry'. We also introduce new graphical ways of using minimal depth for exploring variable relationships.
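The idea of minimal depth can be illustrated with a minimal sketch, assuming the common reading that a variable's minimal depth in a tree is the depth of the shallowest node that splits on it, with the root at depth 0, and that smaller values indicate a more predictive variable. The `Node` class, the variable names, and the forest-level averaging rule (averaging only over trees that split on the variable) are hypothetical simplifications for illustration; this is not the randomSurvivalForest implementation.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical tree node; `var` is the split variable (None for a terminal node)."""
    var: Optional[str] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def minimal_depth(root: Node) -> Dict[str, int]:
    """Map each split variable to the depth of its shallowest split (root depth = 0)."""
    depths: Dict[str, int] = {}
    level, depth = [root], 0
    while level:  # breadth-first traversal, one tree level per iteration
        next_level: List[Node] = []
        for node in level:
            if node.var is None:  # terminal node: nothing to record
                continue
            depths.setdefault(node.var, depth)  # keep the first (shallowest) depth seen
            next_level += [c for c in (node.left, node.right) if c is not None]
        level, depth = next_level, depth + 1
    return depths


def forest_minimal_depth(trees: List[Node]) -> Dict[str, float]:
    """Average minimal depth per variable over the trees that split on it.

    Assumption: trees that never split on a variable are ignored for that variable.
    """
    per_tree = [minimal_depth(t) for t in trees]
    variables = {v for d in per_tree for v in d}
    return {v: mean(d[v] for d in per_tree if v in d) for v in variables}


# Toy trees with made-up variable names; Node() is a terminal node.
tree_a = Node(
    "age",
    Node("gene_a", Node("gene_b", Node(), Node()), Node()),
    Node("gene_b", Node(), Node()),
)
tree_b = Node("gene_b", Node("age", Node(), Node()), Node())

depth_a = minimal_depth(tree_a)
forest_depth = forest_minimal_depth([tree_a, tree_b])
# Rank variables from smallest to largest average minimal depth (ties broken by name).
ranking = sorted(forest_depth, key=lambda v: (forest_depth[v], v))
```

In this toy forest, variables that split near the root in most trees come first in `ranking`, which is the ordering a minimal-depth variable selection rule would threshold.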

Original language: English
Pages (from-to): 115-132
Number of pages: 18
Journal: Statistical Analysis and Data Mining
Volume: 4
Issue number: 1
DOI: 10.1002/sam.10103
State: Published - Feb 1 2011
Externally published: Yes

Keywords

  • Forests
  • Maximal subtree
  • Minimal depth
  • Trees
  • Variable selection
  • VIMP

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Analysis

Cite this

Random survival forests for high-dimensional data. / Ishwaran, Hemant; Kogalur, Udaya B.; Chen, Xi; Minn, Andy J.

In: Statistical Analysis and Data Mining, Vol. 4, No. 1, 01.02.2011, p. 115-132.

@article{7f4aac8fa46a45c2abd648b1d3f1e277,
title = "Random survival forests for high-dimensional data",
keywords = "Forests, Maximal subtree, Minimal depth, Trees, Variable selection, VIMP",
author = "Hemant Ishwaran and Kogalur, {Udaya B.} and Xi Chen and Minn, {Andy J.}",
year = "2011",
month = "2",
day = "1",
doi = "10.1002/sam.10103",
language = "English",
volume = "4",
pages = "115--132",
journal = "Statistical Analysis and Data Mining",
issn = "1932-1872",
publisher = "John Wiley and Sons Inc.",
number = "1",
}

TY - JOUR
T1 - Random survival forests for high-dimensional data
AU - Ishwaran, Hemant
AU - Kogalur, Udaya B.
AU - Chen, Xi
AU - Minn, Andy J.
PY - 2011/2/1
Y1 - 2011/2/1
KW - Forests
KW - Maximal subtree
KW - Minimal depth
KW - Trees
KW - Variable selection
KW - VIMP
UR - http://www.scopus.com/inward/record.url?scp=79551713057&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79551713057&partnerID=8YFLogxK
U2 - 10.1002/sam.10103
DO - 10.1002/sam.10103
M3 - Article
AN - SCOPUS:79551713057
VL - 4
SP - 115
EP - 132
JO - Statistical Analysis and Data Mining
JF - Statistical Analysis and Data Mining
SN - 1932-1872
IS - 1
ER -