Local sparse bump hunting reveals molecular heterogeneity of colon tumors

Jean Eudes Dazard, Jonnagadda S Rao, Sanford Markowitz

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

The question of molecular heterogeneity and of tumoral phenotype in cancer remains unresolved. To understand the underlying molecular basis of this phenomenon, we analyzed genome-wide expression data of colon cancer metastasis samples, as these tumors are the most advanced and hence would be anticipated to be the most likely heterogeneous group of tumors, potentially exhibiting the maximum amount of genetic heterogeneity. Casting a statistical net around such a complex problem proves difficult because of the high dimensionality and multicollinearity of the gene expression space, combined with the fact that genes act in concert with one another and that not all genes surveyed might be involved. We devise a strategy to identify distinct subgroups of samples and determine the genetic/molecular signature that defines them. This involves use of the local sparse bump hunting algorithm, which provides a much more optimal and biologically faithful transformed space within which to search for bumps. In addition, thanks to the variable selection feature of the algorithm, we derived a novel sparse gene expression signature, which appears to divide all colon cancer patients into two populations: a population whose expression pattern can be molecularly encompassed within the bump and an outlier population that cannot be. Although all patients within any given stage of the disease, including the metastatic group, appear clinically homogeneous, our procedure revealed two subgroups in each stage with distinct genetic/molecular profiles. We also discuss implications of such a finding in terms of early detection, diagnosis and prognosis.

Original language: English
Pages (from-to): 1203-1220
Number of pages: 18
Journal: Statistics in Medicine
Volume: 31
Issue number: 11-12
DOIs: 10.1002/sim.4389
State: Published - May 1 2012

Fingerprint

Tumor
Cancer
Colon
Colonic Neoplasms
Gene Expression
Molecular Biology
Signature
Subgroup
Population
Gene
Distinct
Multicollinearity
Neoplasms
Metastasis
Genetic Heterogeneity
Prognosis
Variable Selection
Casting
Faithful
Transcriptome

Keywords

  • Class discovery
  • Colon cancer patient subtyping
  • Early diagnosis and prognosis
  • Local sparse bump hunting
  • Mixture density

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

Cite this

Local sparse bump hunting reveals molecular heterogeneity of colon tumors. / Dazard, Jean Eudes; Rao, Jonnagadda S; Markowitz, Sanford.

In: Statistics in Medicine, Vol. 31, No. 11-12, 01.05.2012, p. 1203-1220.

Dazard, Jean Eudes ; Rao, Jonnagadda S ; Markowitz, Sanford. / Local sparse bump hunting reveals molecular heterogeneity of colon tumors. In: Statistics in Medicine. 2012 ; Vol. 31, No. 11-12. pp. 1203-1220.
@article{42725319e1ae4da5bd1d260e220ede98,
title = "Local sparse bump hunting reveals molecular heterogeneity of colon tumors",
keywords = "Class discovery, Colon cancer patient subtyping, Early diagnosis and prognosis, Local sparse bump hunting, Mixture density",
author = "Dazard, {Jean Eudes} and Rao, {Jonnagadda S} and Sanford Markowitz",
year = "2012",
month = "5",
day = "1",
doi = "10.1002/sim.4389",
language = "English",
volume = "31",
pages = "1203--1220",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "11-12",

}

TY - JOUR

T1 - Local sparse bump hunting reveals molecular heterogeneity of colon tumors

AU - Dazard, Jean Eudes

AU - Rao, Jonnagadda S

AU - Markowitz, Sanford

PY - 2012/5/1

Y1 - 2012/5/1

KW - Class discovery

KW - Colon cancer patient subtyping

KW - Early diagnosis and prognosis

KW - Local sparse bump hunting

KW - Mixture density

UR - http://www.scopus.com/inward/record.url?scp=84861203188&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84861203188&partnerID=8YFLogxK

U2 - 10.1002/sim.4389

DO - 10.1002/sim.4389

M3 - Article

C2 - 22052459

AN - SCOPUS:84861203188

VL - 31

SP - 1203

EP - 1220

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

IS - 11-12

ER -