Simultaneous clustering and estimation of heterogeneous graphical models

Botao Hao, Wei Sun, Yufeng Liu, Guang Cheng

Research output: Contribution to journal › Article

Abstract

We consider joint estimation of multiple graphical models arising from heterogeneous and high-dimensional observations. Unlike most previous approaches, which assume that the cluster structure is given in advance, an appealing feature of our method is that it learns the cluster structure while estimating heterogeneous graphical models. This is achieved via a high-dimensional version of the Expectation Conditional Maximization (ECM) algorithm (Meng and Rubin, 1993). A joint graphical lasso penalty is imposed on the conditional maximization step to extract both homogeneity and heterogeneity components across all clusters. Our algorithm is computationally efficient due to fast sparse learning routines and can be implemented without unsupervised learning knowledge. The superior performance of our method is demonstrated by extensive experiments, and its application to a Glioblastoma cancer dataset reveals some new insights into Glioblastoma cancer. In theory, a non-asymptotic error bound is established for the output directly from our high-dimensional ECM algorithm, and it consists of two quantities: statistical error (statistical accuracy) and optimization error (computational complexity). Such a result gives a theoretical guideline for terminating our ECM iterations.
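The closing remark about terminating the ECM iterations can be illustrated with a small sketch. The abstract does not state the form of the bound, so the code below assumes, purely for illustration, that the error after T iterations is at most `kappa**T * init_error + stat_error` with a contraction factor `0 < kappa < 1`. The function name and all three parameters are hypothetical and are not taken from the paper.

```python
import math


def iterations_to_statistical_floor(init_error: float,
                                    stat_error: float,
                                    kappa: float) -> int:
    """Smallest T such that kappa**T * init_error <= stat_error.

    Assumption (not from the abstract): the optimization error decays
    geometrically at rate kappa, while the statistical error is a fixed
    floor that further iterations cannot reduce.
    """
    if not 0.0 < kappa < 1.0:
        raise ValueError("kappa must lie strictly between 0 and 1")
    if init_error <= 0.0 or stat_error <= 0.0:
        raise ValueError("errors must be positive")
    if init_error <= stat_error:
        return 0  # already at the statistical floor

    # Closed-form estimate of T from kappa**T * init_error = stat_error.
    t = math.ceil(math.log(stat_error / init_error) / math.log(kappa))

    # Correct for floating-point rounding near the boundary.
    while kappa ** t * init_error > stat_error:
        t += 1
    while t > 0 and kappa ** (t - 1) * init_error <= stat_error:
        t -= 1
    return t


if __name__ == "__main__":
    # With these illustrative numbers, iterating further would only shrink
    # a term that is already below the assumed statistical error.
    print(iterations_to_statistical_floor(1.0, 0.01, 0.5))
```

Under this assumed bound, the stopping point grows only logarithmically in the ratio of initial error to statistical error, which is the sense in which such a result can guide when to stop.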

Original language: English (US)
Pages (from-to): 1-58
Number of pages: 58
Journal: Journal of Machine Learning Research
Volume: 18
State: Published - Apr 1 2018

Fingerprint

Conditional Expectation
Graphical Models
High-dimensional
Clustering
Cancer
Lasso
Unsupervised Learning
Multiple Models
Homogeneity
Error Bounds
Penalty
Computational Complexity
Iteration
Optimization
Output
Experiment
Experiments

Keywords

  • Clustering
  • Finite-sample analysis
  • Graphical models
  • High-dimensional statistics
  • Non-convex optimization

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Statistics and Probability
  • Artificial Intelligence

Cite this

Simultaneous clustering and estimation of heterogeneous graphical models. / Hao, Botao; Sun, Wei; Liu, Yufeng; Cheng, Guang.

In: Journal of Machine Learning Research, Vol. 18, 01.04.2018, p. 1-58.

@article{53df7a01d3294d3880dab71e657a7275,
title = "Simultaneous clustering and estimation of heterogeneous graphical models",
abstract = "We consider joint estimation of multiple graphical models arising from heterogeneous and high-dimensional observations. Unlike most previous approaches which assume that the cluster structure is given in advance, an appealing feature of our method is to learn cluster structure while estimating heterogeneous graphical models. This is achieved via a high dimensional version of Expectation Conditional Maximization (ECM) algorithm (Meng and Rubin, 1993). A joint graphical lasso penalty is imposed on the conditional maximization step to extract both homogeneity and heterogeneity components across all clusters. Our algorithm is computationally efficient due to fast sparse learning routines and can be implemented without unsupervised learning knowledge. The superior performance of our method is demonstrated by extensive experiments and its application to a Glioblastoma cancer dataset reveals some new insights in understanding the Glioblastoma cancer. In theory, a non-asymptotic error bound is established for the output directly from our high dimensional ECM algorithm, and it consists of two quantities: statistical error (statistical accuracy) and optimization error (computational complexity). Such a result gives a theoretical guideline in terminating our ECM iterations.",
keywords = "Clustering, Finite-sample analysis, Graphical models, High-dimensional statistics, Non-convex optimization",
author = "Botao Hao and Wei Sun and Yufeng Liu and Guang Cheng",
year = "2018",
month = "4",
day = "1",
language = "English (US)",
volume = "18",
pages = "1--58",
journal = "Journal of Machine Learning Research",
issn = "1532-4435",
publisher = "Microtome Publishing",
}

TY - JOUR

T1 - Simultaneous clustering and estimation of heterogeneous graphical models

AU - Hao, Botao

AU - Sun, Wei

AU - Liu, Yufeng

AU - Cheng, Guang

PY - 2018/4/1

Y1 - 2018/4/1

AB - We consider joint estimation of multiple graphical models arising from heterogeneous and high-dimensional observations. Unlike most previous approaches which assume that the cluster structure is given in advance, an appealing feature of our method is to learn cluster structure while estimating heterogeneous graphical models. This is achieved via a high dimensional version of Expectation Conditional Maximization (ECM) algorithm (Meng and Rubin, 1993). A joint graphical lasso penalty is imposed on the conditional maximization step to extract both homogeneity and heterogeneity components across all clusters. Our algorithm is computationally efficient due to fast sparse learning routines and can be implemented without unsupervised learning knowledge. The superior performance of our method is demonstrated by extensive experiments and its application to a Glioblastoma cancer dataset reveals some new insights in understanding the Glioblastoma cancer. In theory, a non-asymptotic error bound is established for the output directly from our high dimensional ECM algorithm, and it consists of two quantities: statistical error (statistical accuracy) and optimization error (computational complexity). Such a result gives a theoretical guideline in terminating our ECM iterations.

KW - Clustering

KW - Finite-sample analysis

KW - Graphical models

KW - High-dimensional statistics

KW - Non-convex optimization

UR - http://www.scopus.com/inward/record.url?scp=85048927226&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048927226&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:85048927226

VL - 18

SP - 1

EP - 58

JO - Journal of Machine Learning Research

JF - Journal of Machine Learning Research

SN - 1532-4435

ER -