Bootstrap choice of cost complexity for better subset selection

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Subset selection is a long-standing problem. One goal of a selection procedure is consistency. Consistency using Akaike's Final Prediction Error Criterion (FPE) as a selection procedure can be shown to be related to the cost complexity parameter in FPE. However, another goal of a selection procedure is accurate predictions. The consistency property does not necessarily guarantee this second objective. The issue can be thought of as a bias versus variance tradeoff for the procedure. We use the bootstrap to model this tradeoff and provide an objective way of choosing a procedure which attempts to balance the two objectives. This is done in the spirit of the cost complexity pruning algorithm of classification and regression trees. The methodology is described and illustrated on simulated and real data examples.
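The sketch below is a minimal illustration of the general idea described in the abstract, not the paper's exact algorithm. It assumes a linear-regression setting, an FPE-type criterion of the form RSS + α·k·σ̂² in which α plays the role of the cost-complexity parameter, exhaustive best-subset search under that criterion, and a bootstrap estimate of prediction error used to choose α. All function names (`fpe`, `best_subset`, `bootstrap_choose_alpha`), the penalty form, and the particular error estimate are illustrative assumptions.

```python
import numpy as np
from itertools import combinations

def fpe(rss, n, k, alpha):
    # FPE-type criterion: residual sum of squares plus a cost-complexity
    # penalty; alpha is the tunable cost-complexity parameter.
    sigma2 = rss / (n - k) if n > k else np.inf
    return rss + alpha * k * sigma2

def best_subset(X, y, alpha):
    # Exhaustive search over column subsets, keeping the subset that
    # minimizes the FPE-type criterion for the given alpha.
    n, p = X.shape
    best_cols, best_score = (0,), np.inf
    for k in range(1, p + 1):
        for cols in combinations(range(p), k):
            Xs = X[:, list(cols)]
            beta, res, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = float(res[0]) if res.size else float(np.sum((y - Xs @ beta) ** 2))
            score = fpe(rss, n, k, alpha)
            if score < best_score:
                best_cols, best_score = cols, score
    return best_cols

def bootstrap_choose_alpha(X, y, alphas, B=50, seed=0):
    # For each candidate alpha, rerun the whole select-then-fit procedure on
    # bootstrap resamples and score its predictions on the original data;
    # return the alpha with the smallest average squared prediction error.
    rng = np.random.default_rng(seed)
    n = len(y)
    avg_err = []
    for alpha in alphas:
        total = 0.0
        for _ in range(B):
            idx = rng.integers(0, n, size=n)
            cols = best_subset(X[idx], y[idx], alpha)
            beta, *_ = np.linalg.lstsq(X[idx][:, list(cols)], y[idx], rcond=None)
            total += float(np.mean((y - X[:, list(cols)] @ beta) ** 2))
        avg_err.append(total / B)
    return alphas[int(np.argmin(avg_err))]

if __name__ == "__main__":
    # Toy example: six candidate predictors, only two of which matter.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 6))
    y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(size=100)
    alpha = bootstrap_choose_alpha(X, y, alphas=[1.0, 2.0, 4.0, 8.0], B=25)
    print("chosen alpha:", alpha, "selected columns:", best_subset(X, y, alpha))
```

Choosing α this way mirrors the cost-complexity pruning idea from classification and regression trees: the penalty weight is not fixed in advance but selected by resampling so as to balance bias (too large a penalty, underfitting) against variance (too small a penalty, overfitting).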

Original language: English
Pages (from-to): 273-287
Number of pages: 15
Journal: Statistica Sinica
Volume: 9
Issue number: 1
State: Published - Jan 1 1999
Externally published: Yes

Keywords

  • Adaptive estimation
  • Mallows' Cp
  • Model selection
  • Prediction error
  • Resampling methods

ASJC Scopus subject areas

  • Mathematics (all)
  • Statistics and Probability

Cite this

Bootstrap choice of cost complexity for better subset selection. / Rao, Jonnagadda S.

In: Statistica Sinica, Vol. 9, No. 1, 01.01.1999, p. 273-287.

@article{d9fa66fb85ec47a189e1533e925338a7,
  title     = "Bootstrap choice of cost complexity for better subset selection",
  keywords  = "Adaptive estimation, Mallows' Cp, Model selection, Prediction error, Resampling methods",
  author    = "Rao, {Jonnagadda S.}",
  year      = "1999",
  month     = "1",
  day       = "1",
  language  = "English",
  volume    = "9",
  pages     = "273--287",
  journal   = "Statistica Sinica",
  issn      = "1017-0405",
  publisher = "Institute of Statistical Science",
  number    = "1",
}
