Entropy-based criterion in categorical clustering

Tao Li, Sheng Ma, Mitsunori Ogihara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

64 Citations (Scopus)

Abstract

Entropy-type measures for the heterogeneity of clusters have been used for a long time. This paper studies the entropy-based criterion in clustering categorical data. It first shows that the entropy-based criterion can be derived in the formal framework of probabilistic clustering models and establishes the connection between the criterion and the approach based on dissimilarity coefficients. An iterative Monte-Carlo procedure is then presented to search for the partitions minimizing the criterion. Experiments are conducted to show the effectiveness of the proposed procedure.
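To make the abstract's two ingredients concrete, the following is a minimal sketch, not the authors' procedure. It assumes one common form of entropy criterion (the size-weighted sum of per-attribute entropies within each cluster) and a simple random-reassignment search that keeps a move only when the criterion does not increase. The function names `expected_entropy` and `monte_carlo_search`, the parameters `steps` and `seed`, and the toy data are all hypothetical; the paper's exact criterion and Monte-Carlo procedure may differ.

```python
import math
import random
from collections import Counter


def expected_entropy(rows, labels):
    """Size-weighted sum of per-attribute entropies within each cluster, in bits.

    Assumed form of the criterion, used here for illustration only.
    """
    n = len(rows)
    clusters = {}
    for row, label in zip(rows, labels):
        clusters.setdefault(label, []).append(row)
    total = 0.0
    for members in clusters.values():
        size = len(members)
        for column in zip(*members):  # one tuple of values per attribute
            for count in Counter(column).values():
                p = count / size
                total -= (size / n) * p * math.log2(p)
    return total


def monte_carlo_search(rows, k, steps=2000, seed=0):
    """Hypothetical search: move one random point to a random cluster and
    keep the move only if the criterion does not increase."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in rows]
    best = expected_entropy(rows, labels)
    for _ in range(steps):
        i = rng.randrange(len(rows))
        old = labels[i]
        labels[i] = rng.randrange(k)
        candidate = expected_entropy(rows, labels)
        if candidate <= best:
            best = candidate
        else:
            labels[i] = old  # revert the rejected move
    return labels, best


# Toy categorical data (made up for this sketch): two obvious groups.
rows = [
    ("red", "small"), ("red", "small"), ("red", "small"),
    ("blue", "large"), ("blue", "large"), ("blue", "large"),
]
labels, score = monte_carlo_search(rows, k=2)
```

A partition whose clusters are internally homogeneous scores 0 under this criterion, while putting all six toy rows in one cluster scores 2 bits (1 bit per attribute). Greedy acceptance is the simplest choice and can stall in local minima; an annealing-style acceptance rule would be a natural variation.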

Original language: English (US)
Title of host publication: Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004
Editors: R. Greiner, D. Schuurmans
Pages: 536-543
Number of pages: 8
State: Published - 2004
Externally published: Yes
Event: Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004 - Banff, Alta, Canada
Duration: Jul 4 2004 to Jul 8 2004

Other

Other: Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004
Country: Canada
City: Banff, Alta
Period: 7/4/04 to 7/8/04

Fingerprint

Entropy
Experiments

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Li, T., Ma, S., & Ogihara, M. (2004). Entropy-based criterion in categorical clustering. In R. Greiner, & D. Schuurmans (Eds.), Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004 (pp. 536-543)

@inproceedings{f973591f24904fcbb660b4fa98f13672,
title = "Entropy-based criterion in categorical clustering",
abstract = "Entropy-type measures for the heterogeneity of clusters have been used for a long time. This paper studies the entropy-based criterion in clustering categorical data. It first shows that the entropy-based criterion can be derived in the formal framework of probabilistic clustering models and establishes the connection between the criterion and the approach based on dissimilarity coefficients. An iterative Monte-Carlo procedure is then presented to search for the partitions minimizing the criterion. Experiments are conducted to show the effectiveness of the proposed procedure.",
author = "Tao Li and Sheng Ma and Mitsunori Ogihara",
year = "2004",
language = "English (US)",
isbn = "1581138385",
pages = "536--543",
editor = "R. Greiner and D. Schuurmans",
booktitle = "Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004",
}

TY - GEN

T1 - Entropy-based criterion in categorical clustering

AU - Li, Tao

AU - Ma, Sheng

AU - Ogihara, Mitsunori

PY - 2004

Y1 - 2004

N2 - Entropy-type measures for the heterogeneity of clusters have been used for a long time. This paper studies the entropy-based criterion in clustering categorical data. It first shows that the entropy-based criterion can be derived in the formal framework of probabilistic clustering models and establishes the connection between the criterion and the approach based on dissimilarity coefficients. An iterative Monte-Carlo procedure is then presented to search for the partitions minimizing the criterion. Experiments are conducted to show the effectiveness of the proposed procedure.

UR - http://www.scopus.com/inward/record.url?scp=14344259208&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=14344259208&partnerID=8YFLogxK

M3 - Conference contribution

SN - 1581138385

SP - 536

EP - 543

BT - Proceedings, Twenty-First International Conference on Machine Learning, ICML 2004

A2 - Greiner, R.

A2 - Schuurmans, D.

ER -