Prediction weighted maximum frequency selection

Hongmei Liu, Jonnagadda S Rao

Research output: Contribution to journal › Article

Abstract

Shrinkage estimators that can produce sparse solutions have become increasingly important to the analysis of today’s complex datasets. Examples include the LASSO, the Elastic-Net and their adaptive counterparts. Estimation of penalty parameters, however, still presents difficulties. While variable-selection-consistent procedures have been developed, their finite sample performance can often be less than satisfactory. We develop a new strategy for variable selection using the adaptive LASSO and adaptive Elastic-Net estimators with p_n diverging. The basic idea first involves using the trace paths of their LARS solutions to bootstrap estimates of maximum frequency (MF) models conditioned on dimension. Conditioning on dimension effectively mitigates overfitting; to deal with underfitting, these MFs are then prediction-weighted. It is shown that not only can consistent model selection be achieved, but attractive convergence rates can be as well, leading to excellent finite sample performance. Detailed numerical studies are carried out on both simulated and real datasets. Extensions to the class of generalized linear models are also detailed. MSC 2010 subject classifications: Primary 62J07.
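The selection strategy the abstract outlines — bootstrap the supports along a solution path, take the maximum-frequency (MF) model at each dimension, then weight the MF models by prediction performance — can be illustrated with a toy sketch. This is not the authors' implementation: forward stepwise selection stands in for the LARS path, a simple held-out split stands in for the paper's prediction weighting, and all names and settings are illustrative.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

# Toy data: 3 true predictors out of 10.
n, p = 100, 10
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, 2.0, 1.5]
y = X @ beta + rng.standard_normal(n)

def ols_fit(Xs, ys):
    coef, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return coef

def stepwise_path(Xb, yb, kmax):
    """Forward-stepwise supports of dimension 1..kmax (stand-in for a LARS path)."""
    active, path = [], []
    for _ in range(kmax):
        resid = yb - (Xb[:, active] @ ols_fit(Xb[:, active], yb) if active else 0.0)
        scores = np.abs(Xb.T @ resid)   # correlation of each predictor with residual
        scores[active] = -np.inf        # never re-select an active predictor
        active.append(int(np.argmax(scores)))
        path.append(tuple(sorted(active)))
    return path

# Bootstrap the path supports, counting models conditioned on dimension.
B, kmax = 200, 6
counts = {k: Counter() for k in range(1, kmax + 1)}
for _ in range(B):
    idx = rng.integers(0, n, n)          # bootstrap resample
    for k, support in enumerate(stepwise_path(X[idx], y[idx], kmax), start=1):
        counts[k][support] += 1

# Maximum-frequency model at each dimension.
mf_models = {k: c.most_common(1)[0][0] for k, c in counts.items()}

# Prediction-weight the MF models via a held-out split and pick the winner.
tr, te = np.arange(n // 2), np.arange(n // 2, n)
def pred_err(support):
    cols = list(support)
    coef = ols_fit(X[tr][:, cols], y[tr])
    return float(np.mean((y[te] - X[te][:, cols] @ coef) ** 2))

best = min(mf_models.values(), key=pred_err)
print("selected support:", best)
```

Conditioning on dimension means an overfit support of size k+1 never competes with the size-k MF model; the prediction weighting then arbitrates across dimensions, penalizing the underfit low-dimensional candidates.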

Original language: English (US)
Pages (from-to): 640-681
Number of pages: 42
Journal: Electronic Journal of Statistics
Volume: 11
Issue number: 1
DOI: 10.1214/17-EJS1240
State: Published - 2017


Keywords

  • Adaptive Elastic-Net
  • Adaptive LASSO
  • Bootstrapping
  • Model selection

ASJC Scopus subject areas

  • Statistics and Probability

Cite this

Prediction weighted maximum frequency selection. / Liu, Hongmei; Rao, Jonnagadda S.

In: Electronic Journal of Statistics, Vol. 11, No. 1, 2017, p. 640-681.


@article{96c351f10dd544f9a53e2324e26c6b44,
title = "Prediction weighted maximum frequency selection",
abstract = "Shrinkage estimators that possess the ability to produce sparse solutions have become increasingly important to the analysis of today’s complex datasets. Examples include the LASSO, the Elastic-Net and their adaptive counterparts. Estimation of penalty parameters still presents difficulties however. While variable selection consistent procedures have been developed, their finite sample performance can often be less than satisfactory. We develop a new strategy for variable selection using the adaptive LASSO and adaptive Elastic-Net estimators with pn diverging. The basic idea first involves using the trace paths of their LARS solutions to bootstrap estimates of maximum frequency (MF) models conditioned on dimension. Conditioning on dimension effectively mitigates overfitting, however to deal with underfitting, these MFs are then prediction-weighted, and it is shown that not only can consistent model selection be achieved, but that attractive convergence rates can as well, leading to excellent finite sample performance. Detailed numerical studies are carried out on both simulated and real datasets. Extensions to the class of generalized linear models are also detailed. MSC 2010 subject classifications: Primary 62J07.",
keywords = "Adaptive Elastic-Net, Adaptive LASSO, Bootstrapping, Model selection",
author = "Hongmei Liu and Rao, {Jonnagadda S}",
year = "2017",
doi = "10.1214/17-EJS1240",
language = "English (US)",
volume = "11",
pages = "640--681",
journal = "Electronic Journal of Statistics",
issn = "1935-7524",
publisher = "Institute of Mathematical Statistics",
number = "1",

}
