The use of genetic programming in the analysis of quantitative gene expression profiles for identification of nodal status in bladder cancer

Anirban P. Mitra, Arpit A. Almal, Ben George, David W. Fry, Peter F. Lenehan, Vincenzo Pagliarulo, Richard J. Cote, Ram H. Datar, William P. Worzel

Research output: Contribution to journal › Article

55 Scopus citations

Abstract

Background: Previous studies on bladder cancer have shown nodal involvement to be an independent indicator of prognosis and survival. This study aimed at developing an objective method for detection of nodal metastasis from molecular profiles of primary urothelial carcinoma tissues.

Methods: The study included primary bladder tumor tissues from 60 patients across different stages and 5 control tissues of normal urothelium. The entire cohort was divided into training and validation sets comprised of node positive and node negative subjects. Quantitative expression profiling was performed for a panel of 70 genes using standardized competitive RT-PCR and the expression values of the training set samples were run through an iterative machine learning process called genetic programming that employed an N-fold cross validation technique to generate classifier rules of limited complexity. These were then used in a voting algorithm to classify the validation set samples into those associated with or without nodal metastasis.

Results: The generated classifier rules using 70 genes demonstrated 81% accuracy on the validation set when compared to the pathological nodal status. The rules showed a strong predilection for ICAM1:MAP2K6 and KDR resulting in gene expression motifs that cumulatively suggested a pattern ICAM1>MAP2K6>KDR for node positive cases. Additionally, the motifs showed CDK8 to be lower relative to ICAM1, and ANXA5 to be relatively high by itself in node positive tumors. Rules generated using only ICAM1, MAP2K6 and KDR were comparably robust, with a single representative rule producing an accuracy of 90% when used by itself on the validation set, suggesting a crucial role for these genes in nodal metastasis.

Conclusion: Our study demonstrates the use of standardized quantitative gene expression values from primary bladder tumor tissues as inputs in a genetic programming system to generate classifier rules for determining the nodal status. Our method also suggests the involvement of ICAM1, MAP2K6, KDR, CDK8 and ANXA5 in unique mathematical combinations in the progression towards nodal positivity. Further studies are needed to identify more class-specific signatures and confirm the role of these genes in the evolution of nodal metastasis in bladder cancer.
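The voting step described in the Methods, in which several classifier rules each cast a vote on a sample's nodal status, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the three rules are hypothetical comparisons loosely modeled on the reported motifs (ICAM1>MAP2K6>KDR, and CDK8 lower than ICAM1), not the rules the genetic programming system actually produced, and the simple majority threshold and the expression values in the usage note are invented.

```python
from typing import Callable, Dict, List

Sample = Dict[str, float]          # gene name -> expression value
Rule = Callable[[Sample], bool]    # True means "votes node-positive"

# Hypothetical rules, loosely inspired by the motifs reported in the
# abstract. They are NOT the rules generated in the study.
RULES: List[Rule] = [
    lambda s: s["ICAM1"] > s["MAP2K6"],
    lambda s: s["MAP2K6"] > s["KDR"],
    lambda s: s["ICAM1"] > s["CDK8"],
]


def vote(sample: Sample, rules: List[Rule] = RULES) -> str:
    """Classify one sample by simple majority vote over the rules."""
    positive_votes = sum(1 for rule in rules if rule(sample))
    # Strict majority is an assumption; the abstract does not state
    # how votes were tallied.
    if positive_votes * 2 > len(rules):
        return "node-positive"
    return "node-negative"


def accuracy(samples: List[Sample], labels: List[str],
             rules: List[Rule] = RULES) -> float:
    """Fraction of samples whose voted class matches the given label."""
    if len(samples) != len(labels) or not samples:
        raise ValueError("samples and labels must be non-empty and equal length")
    correct = sum(1 for s, y in zip(samples, labels) if vote(s, rules) == y)
    return correct / len(samples)
```

With made-up values, `vote({"ICAM1": 5.0, "MAP2K6": 3.0, "KDR": 1.0, "CDK8": 2.0})` satisfies all three rules and is voted node-positive, while reversing the ordering of ICAM1, MAP2K6 and KDR yields node-negative. Comparing `vote` outputs against pathological labels on a held-out set, as `accuracy` does, mirrors how the abstract reports validation-set accuracy.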

Original language: English (US)
Article number: 159
Journal: BMC Cancer
Volume: 6
DOI: https://doi.org/10.1186/1471-2407-6-159
State: Published - Jun 16 2006

ASJC Scopus subject areas

  • Genetics
  • Oncology
  • Cancer Research

  • Cite this

    Mitra, A. P., Almal, A. A., George, B., Fry, D. W., Lenehan, P. F., Pagliarulo, V., Cote, R. J., Datar, R. H., & Worzel, W. P. (2006). The use of genetic programming in the analysis of quantitative gene expression profiles for identification of nodal status in bladder cancer. BMC Cancer, 6, [159]. https://doi.org/10.1186/1471-2407-6-159