Intra-cluster distance minimization in DNA methylation analysis using an advanced Tabu-based iterative k-medoids clustering algorithm (T-CLUST)

Haluk Damgacioglu, Emrah Celik, Nurcin Celik

Research output: Contribution to journal › Article

2 Scopus citations

Abstract

Recent advances in DNA methylation profiling have paved the way for understanding the underlying epigenetic mechanisms of various diseases such as cancer. While conventional distance-based clustering algorithms (e.g., hierarchical and k-means clustering) have been heavily used in such profiling owing to their speed in high-throughput analysis, these methods commonly converge to suboptimal solutions and/or trivial clusters because of their greedy search nature. Hence, methodologies are needed to improve the quality of the clusters formed by these algorithms without sacrificing their speed. In this study, we introduce three related algorithms for a complete high-throughput methylation analysis: a variance-based dimension reduction algorithm to handle the high dimensionality of the data, an outlier detection algorithm to identify outliers in the data, and an advanced Tabu-based iterative k-medoids clustering algorithm (T-CLUST) to reduce the impact of initial solutions on the performance of the conventional k-medoids algorithm. The performance of the proposed algorithms is demonstrated on nine different real DNA methylation datasets obtained from the Gene Expression Omnibus DataSets database. The accuracy of cluster identification obtained by our proposed algorithms is higher than that of hierarchical and k-means clustering, as well as the conventional methods. The algorithms are implemented in MATLAB and are available at http://www.coe.miami.edu/simlab/tclust.html.
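
The abstract names the building blocks but not their details, so the following is only a minimal Python sketch of the general idea, not the authors' MATLAB implementation of T-CLUST. It assumes Euclidean distance, keeps the highest-variance features as a stand-in for the variance-based dimension reduction, and uses a short tabu list of recently tried initial medoid sets so that repeated k-medoids runs do not restart from the same solution. All function names, parameters (`keep`, `restarts`, `tabu_size`, `seed`), and the toy data are hypothetical; the outlier detection step is not sketched.

```python
import random
from statistics import pvariance


def top_variance_features(rows, keep):
    """Return the indices of the `keep` columns with the largest variance."""
    n_cols = len(rows[0])
    variances = [pvariance([r[j] for r in rows]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:keep])


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(euclidean(p, points[m]) for m in medoids) for p in points)


def k_medoids(points, init_medoids, max_iter=100):
    """Plain alternating k-medoids started from the given medoid indices."""
    medoids = list(init_medoids)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: euclidean(p, points[m]))
            clusters[nearest].append(i)
        # Update step: pick the member minimizing intra-cluster distance.
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # can happen with duplicate points
                new_medoids.append(m)
                continue
            best = min(
                members,
                key=lambda c: sum(euclidean(points[c], points[i]) for i in members),
            )
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return sorted(medoids), total_cost(points, medoids)


def tabu_restart_k_medoids(points, k, restarts=20, tabu_size=5, seed=0):
    """Repeat k-medoids, skipping initial medoid sets held in a tabu list.

    Assumption: this simple restart memory is an illustration only; the
    abstract does not describe how T-CLUST uses its tabu structure.
    """
    rng = random.Random(seed)
    tabu = []
    best_medoids, best_cost = None, float("inf")
    for _ in range(restarts):
        init = tuple(sorted(rng.sample(range(len(points)), k)))
        if init in tabu:
            continue  # recently tried start, do not revisit
        tabu.append(init)
        if len(tabu) > tabu_size:
            tabu.pop(0)  # forget the oldest entry
        medoids, cost = k_medoids(points, init)
        if cost < best_cost:
            best_medoids, best_cost = medoids, cost
    return best_medoids, best_cost


# Hypothetical toy data: two well-separated groups of three samples each.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_medoids, toy_cost = tabu_restart_k_medoids(toy_points, k=2)
```

In this sketch the restarts only diversify the starting points; each run is still an ordinary k-medoids pass, which matches the stated goal of reducing sensitivity to initial solutions rather than changing the underlying clustering objective.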

Keywords

  • Biomarker identification
  • DNA methylation analysis
  • clustering
  • k-medoids clustering
  • outlier detection

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics
