Local sparse bump hunting

Jean Eudes Dazard, J. Sunil Rao

Research output: Contribution to journal, article, peer-reviewed

11 Scopus citations

Abstract

The search for structures in real datasets, for example, in the form of bumps, components, classes, or clusters, is important as these often reveal underlying phenomena leading to scientific discoveries. One of these tasks, known as bump hunting, is to locate domains of a multidimensional input space where the target function assumes local maxima without prespecifying their total number. A number of related methods already exist, yet are challenged in the context of high-dimensional data. We introduce a novel supervised and multivariate bump hunting strategy for exploring modes or classes of a target function of many continuous variables. This addresses the issues of correlation, interpretability, and high-dimensionality (p ≫ n case), while making minimal assumptions. The method is based upon a divide and conquer strategy, combining a tree-based method, a dimension reduction technique, and the Patient Rule Induction Method (PRIM). Important to this task, we show how to estimate the PRIM meta-parameters. Using accuracy evaluation procedures such as cross-validation and ROC analysis, we show empirically how the method outperforms a naive PRIM as well as competitive nonparametric supervised and unsupervised methods in the problem of class discovery. The method has practical application especially in the case of noisy high-throughput data. It is applied to a class discovery problem in a colon cancer microarray dataset aimed at identifying tumor subtypes in the metastatic stage. Supplemental Materials are available online.
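To make the PRIM building block named in the abstract more concrete, the following is a minimal sketch of generic top-down peeling, the basic step of PRIM. It is not the local sparse method of the article: it omits the tree-based partitioning and the dimension reduction, and it fixes the meta-parameters by hand rather than estimating them. The function name `prim_peel`, the parameters `alpha` (peeling fraction) and `min_support` (minimum fraction of points kept in the box), and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def prim_peel(X, y, alpha=0.05, min_support=0.1):
    """Greedy top-down peeling (illustrative sketch, hypothetical helper).

    Starts from the bounding box of X and repeatedly trims about an
    alpha fraction of the points from one face of the box, choosing the
    face whose removal leaves the highest mean of y inside the box.
    Stops when no peel keeps at least min_support * n points.
    """
    n, p = X.shape
    lower = X.min(axis=0).astype(float)
    upper = X.max(axis=0).astype(float)
    inside = np.ones(n, dtype=bool)

    while True:
        best = None  # (mean inside, variable, side, bound, mask)
        for j in range(p):
            xj = X[inside, j]
            candidates = (("lower", np.quantile(xj, alpha)),
                          ("upper", np.quantile(xj, 1.0 - alpha)))
            for side, bound in candidates:
                if side == "lower":
                    keep = inside & (X[:, j] >= bound)
                else:
                    keep = inside & (X[:, j] <= bound)
                # Skip peels that remove nothing or shrink the box too far.
                if keep.sum() == inside.sum() or keep.sum() < min_support * n:
                    continue
                mean_inside = y[keep].mean()
                if best is None or mean_inside > best[0]:
                    best = (mean_inside, j, side, bound, keep)
        if best is None:
            break
        _, j, side, bound, inside = best
        if side == "lower":
            lower[j] = bound
        else:
            upper[j] = bound
    return lower, upper, inside

# Synthetic example (assumed data): a bump in the square [0.3, 0.7]^2.
rng = np.random.default_rng(0)
X = rng.uniform(size=(2000, 2))
y = np.all((X > 0.3) & (X < 0.7), axis=1) + rng.normal(scale=0.1, size=2000)
lower, upper, inside = prim_peel(X, y)
print("box lower bounds:", lower)
print("box upper bounds:", upper)
print("mean of y inside box:", y[inside].mean(), "overall:", y.mean())
```

In this sketch the peeling continues until the support constraint is reached; in practice the stopping point and the peeling fraction are usually chosen by a procedure such as cross-validation, which is the kind of meta-parameter estimation the abstract refers to.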

Original language: English (US)
Pages (from-to): 900-929
Number of pages: 30
Journal: Journal of Computational and Graphical Statistics
Volume: 19
Issue number: 4
State: Published - Dec 2010

Keywords

  • Classification
  • Clustering
  • Density estimation
  • Mode/class discovery
  • Patient rule induction method
  • Sparse principal components

ASJC Scopus subject areas

  • Discrete Mathematics and Combinatorics
  • Statistics and Probability
  • Statistics, Probability and Uncertainty
