Estimating gene expression in breast cancer: A hybrid learning framework

Ipek Dursun, Nefel Tellioglu, Murat Elhuseyni, Nurcin Celik

Research output: Contribution to conference › Paper

Abstract

According to the WHO's World Cancer Report 2014, breast cancer is the most common cancer in women. In 2015 alone, 571,000 people died of breast cancer. Epigenetic changes, which alter gene function without altering the DNA sequence, are one of the major causes of cancer formation. Among the various epigenetic mechanisms, DNA methylation, the addition of a methyl group to DNA or its removal from DNA, is the most studied. The methylation levels of a gene's CpG islands are crucial in determining its expression. Changes in gene expression may silence tumor suppressor genes, which may cause cancer. Here, we develop a hybrid supervised learning framework to estimate gene expression from the methylation levels of CpG islands in the Xena Browser TCGA Breast Cancer dataset. The proposed hybrid framework comprises multiple supervised learning algorithms, including a crossbreed algorithm for gene selection and estimation of gene expression. The crossbreed algorithm is tested against various algorithms known in the literature, including multi-linear regression, support vector machine, and stochastic gradient boosting. Selected algorithm-specific parameters outperformed the rest of the parameter space for the chosen genes. The competing performances of the algorithms helped identify the significant CpG islands with confidence.
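As an illustration of the simplest baseline named in the abstract, multi-linear regression, the following is a minimal sketch of estimating one gene's expression from the methylation levels of a few CpG sites. It is not the authors' crossbreed algorithm, which the abstract does not specify. The data are synthetic stand-ins rather than TCGA values, and the number of sites, the coefficients, the noise level, and all function names are assumptions made only for this example.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multilinear(X, y):
    """Ordinary least squares via the normal equations.

    Returns [intercept, coef_1, ..., coef_p].
    """
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coef, x):
    """Estimated expression for one sample's methylation vector x."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def rmse(coef, X, y):
    """Root-mean-square error of the fitted model on (X, y)."""
    return (sum((predict(coef, x) - t) ** 2 for x, t in zip(X, y)) / len(y)) ** 0.5


# Synthetic stand-in data (assumption): 200 samples, three hypothetical CpG
# sites with methylation levels in [0, 1]. The third site has no effect.
rng = random.Random(0)
TRUE_COEF = [8.0, -4.0, -1.5, 0.0]  # intercept, then one weight per site
X = [[rng.random() for _ in range(3)] for _ in range(200)]
y = [predict(TRUE_COEF, x) + rng.gauss(0, 0.2) for x in X]

# Hold out the last 50 samples to check the estimate on unseen data.
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
coef = fit_multilinear(X_train, y_train)
test_rmse = rmse(coef, X_test, y_test)

print("estimated coefficients:", [round(c, 2) for c in coef])
print("held-out RMSE:", round(test_rmse, 3))
```

In a sketch like this, a fitted weight close to zero marks a site that contributes little to the estimate, which is one simple way a regression baseline can hint at which CpG sites matter. The support vector machine and stochastic gradient boosting baselines would need a library implementation and are not shown.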

Original language: English (US)
Pages: 2068-2073
Number of pages: 6
State: Published - Jan 1 2018
Event: 2018 Institute of Industrial and Systems Engineers Annual Conference and Expo, IISE 2018 - Orlando, United States
Duration: May 19 2018 to May 22 2018

Other

Other: 2018 Institute of Industrial and Systems Engineers Annual Conference and Expo, IISE 2018
Country: United States
City: Orlando
Period: 5/19/18 to 5/22/18

Fingerprint

Gene expression
Genes
Methylation
Supervised learning
DNA sequences
Linear regression
Learning algorithms
Support vector machines
Tumors
DNA

Keywords

  • Breast cancer
  • Epigenetics
  • Gene expression
  • Hybrid supervised learning

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Industrial and Manufacturing Engineering

Cite this

Dursun, I., Tellioglu, N., Elhuseyni, M., & Celik, N. (2018). Estimating gene expression in breast cancer: A hybrid learning framework. 2068-2073. Paper presented at 2018 Institute of Industrial and Systems Engineers Annual Conference and Expo, IISE 2018, Orlando, United States.

@conference{713bbbf00f7240edbcaa891530b8f840,
title = "Estimating gene expression in breast cancer: A hybrid learning framework",
keywords = "Breast cancer, Epigenetics, Gene expression, Hybrid supervised learning",
author = "Ipek Dursun and Nefel Tellioglu and Murat Elhuseyni and Nurcin Celik",
year = "2018",
month = "1",
day = "1",
language = "English (US)",
pages = "2068--2073",
note = "2018 Institute of Industrial and Systems Engineers Annual Conference and Expo, IISE 2018 ; Conference date: 19-05-2018 Through 22-05-2018",

}

TY - CONF

T1 - Estimating gene expression in breast cancer

T2 - A hybrid learning framework

AU - Dursun, Ipek

AU - Tellioglu, Nefel

AU - Elhuseyni, Murat

AU - Celik, Nurcin

PY - 2018/1/1

Y1 - 2018/1/1

KW - Breast cancer

KW - Epigenetics

KW - Gene expression

KW - Hybrid supervised learning

UR - http://www.scopus.com/inward/record.url?scp=85054002415&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85054002415&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:85054002415

SP - 2068

EP - 2073

ER -