Characterizing L2Boosting

John Ehrlinger, Hemant Ishwaran

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

We consider L2Boosting, a special case of Friedman's generic boosting algorithm applied to linear regression under L2-loss. We study L2Boosting for an arbitrary regularization parameter and derive an exact closed-form expression for the number of steps taken along a fixed coordinate direction. This relationship is used to describe L2Boosting's solution path, to develop new tools for studying its path, and to characterize some of the algorithm's unique properties, including active set cycling, a property where the algorithm spends lengthy periods of time cycling between the same coordinates when the regularization parameter is arbitrarily small. Our fixed descent analysis also reveals a repressible condition that limits the effectiveness of L2Boosting in correlated problems by preventing desirable variables from entering the solution path. As a simple remedy, a data augmentation method similar to that used for the elastic net is used to introduce L2-penalization and is shown, in combination with decorrelation, to reverse the repressible condition and circumvent L2Boosting's deficiencies in correlated problems. This in itself presents a new explanation for why the elastic net is successful in correlated problems and why methods like LAR and lasso can perform poorly in such settings.
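The two ingredients discussed in the abstract can be sketched in code: componentwise L2Boosting (at each step, fit the residual along the single coordinate with the largest gradient-correlation and take a shrunken step of size ν, the regularization parameter), and the elastic-net-style data augmentation that introduces L2-penalization by appending scaled identity rows to the design. This is a minimal illustrative sketch, not the authors' implementation; function names and the stopping rule (a fixed step count) are assumptions.

```python
import numpy as np


def l2boost(X, y, nu=0.1, n_steps=200):
    """Componentwise L2Boosting: Friedman's gradient boosting for
    linear regression under L2-loss, with shrinkage factor nu.

    Illustrative sketch only; a fixed number of steps stands in for
    a data-driven stopping rule.
    """
    n, p = X.shape
    beta = np.zeros(p)
    r = y.astype(float).copy()      # residual = negative gradient under L2-loss
    col_norms = (X ** 2).sum(axis=0)  # precomputed ||x_k||^2 for each coordinate
    for _ in range(n_steps):
        corr = X.T @ r
        # pick the coordinate whose univariate least-squares fit most
        # reduces the loss (largest gradient-correlation)
        k = int(np.argmax(corr ** 2 / col_norms))
        step = corr[k] / col_norms[k]   # least-squares coefficient along x_k
        beta[k] += nu * step            # shrunken coordinate update
        r -= nu * step * X[:, k]
    return beta


def augment_l2(X, y, lam):
    """Elastic-net-style data augmentation: append sqrt(lam)*I rows to X
    and zeros to y, so that least-squares loss on the augmented data
    equals the original loss plus a ridge (L2) penalty of strength lam."""
    p = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
    y_aug = np.concatenate([y, np.zeros(p)])
    return X_aug, y_aug


# Usage: boost on a correlated design, with and without augmentation.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
X[:, 1] = X[:, 0] + 0.1 * rng.standard_normal(100)  # correlated pair
y = X @ np.array([2.0, 0.0, 0.0, 1.0, 0.0]) + 0.1 * rng.standard_normal(100)

beta_plain = l2boost(X, y, nu=0.1, n_steps=500)
X_aug, y_aug = augment_l2(X, y, lam=1.0)
beta_ridge = l2boost(X_aug, y_aug, nu=0.1, n_steps=500)
```

Running L2Boosting on the augmented data is what ties the augmentation to the paper's remedy: the added rows act as an L2 penalty that, combined with decorrelation, lets repressed variables enter the solution path.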

Original language: English
Pages (from-to): 1074-1101
Number of pages: 28
Journal: Annals of Statistics
Volume: 40
Issue number: 2
DOI: 10.1214/12-AOS997
State: Published - Apr 1 2012


Keywords

  • Critical direction
  • Gradient-correlation
  • Regularization
  • Repressibility
  • Solution path

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Characterizing L2Boosting. / Ehrlinger, John; Ishwaran, Hemant.

In: Annals of Statistics, Vol. 40, No. 2, 01.04.2012, p. 1074-1101.


@article{3033a344c9104ae48eb3e28e15d510f9,
title = "Characterizing L2Boosting",
keywords = "Critical direction, Gradient-correlation, Regularization, Repressibility, Solution path",
author = "John Ehrlinger and Hemant Ishwaran",
year = "2012",
month = "4",
day = "1",
doi = "10.1214/12-AOS997",
language = "English",
volume = "40",
pages = "1074--1101",
journal = "Annals of Statistics",
issn = "0090-5364",
publisher = "Institute of Mathematical Statistics",
number = "2",

}

TY - JOUR

T1 - Characterizing L2Boosting

AU - Ehrlinger, John

AU - Ishwaran, Hemant

PY - 2012/4/1

Y1 - 2012/4/1


KW - Critical direction

KW - Gradient-correlation

KW - Regularization

KW - Repressibility

KW - Solution path

UR - http://www.scopus.com/inward/record.url?scp=84872007834&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84872007834&partnerID=8YFLogxK

U2 - 10.1214/12-AOS997

DO - 10.1214/12-AOS997

M3 - Article

AN - SCOPUS:84872007834

VL - 40

SP - 1074

EP - 1101

JO - Annals of Statistics

JF - Annals of Statistics

SN - 0090-5364

IS - 2

ER -