CoFD: An algorithm for non-distance based clustering in high dimensional spaces

Shenghuo Zhu, Tao Li, Mitsunori Ogihara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

The clustering problem, which aims to identify the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity clusters, has been widely studied. Traditional clustering algorithms use distance functions to measure similarity and are not suitable for high dimensional spaces. In this paper, we propose the CoFD algorithm, a non-distance based clustering algorithm for high dimensional spaces. Based on the maximum likelihood principle, CoFD optimizes its parameters to maximize the likelihood between the data points and the model generated by the parameters. Experimental results on both synthetic data sets and a real data set show the efficiency and effectiveness of CoFD.
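The abstract does not spell out the model or the optimization procedure, so the following is only a minimal sketch of the general idea of clustering by likelihood rather than by distance. It assumes binary feature vectors and a hard-assignment Bernoulli mixture, which is a stand-in and not the authors' CoFD procedure. The names `log_likelihood`, `estimate`, and `fit_clusters` are hypothetical and chosen for illustration.

```python
import math


def log_likelihood(point, theta):
    """Log-probability of a binary point under per-feature Bernoulli rates."""
    return sum(math.log(t if x else 1.0 - t) for x, t in zip(point, theta))


def estimate(members, n_features):
    """Smoothed feature frequencies; Laplace smoothing keeps rates in (0, 1)."""
    return [
        (sum(p[j] for p in members) + 1.0) / (len(members) + 2.0)
        for j in range(n_features)
    ]


def fit_clusters(data, k, max_iters=50):
    """Alternate between assigning points and re-estimating parameters.

    Assumption: a hard-assignment Bernoulli mixture stands in for the
    model described in the paper; no distance function is used.
    """
    n, d = len(data), len(data[0])
    # Deterministic start: seed each cluster from one evenly spaced point.
    thetas = [estimate([data[i * n // k]], d) for i in range(k)]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster that makes it most likely.
        new_labels = [
            max(range(k), key=lambda c: log_likelihood(p, thetas[c]))
            for p in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the parameters are too
        labels = new_labels
        # Parameter step: maximum likelihood (smoothed) rates per cluster.
        thetas = [
            estimate([p for p, l in zip(data, labels) if l == c], d)
            for c in range(k)
        ]
    return labels, thetas
```

Calling `fit_clusters(data, 2)` on a list of equal-length 0/1 vectors returns one cluster label per point together with the fitted per-cluster feature rates. Each step either keeps or raises the smoothed likelihood of the data under the current assignment, which is the sense in which the parameters are optimized.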

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 52-62
Number of pages: 11
Volume: 2454 LNCS
State: Published - 2002
Externally published: Yes
Event: 4th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2002 - Aix-en-Provence, France
Duration: Sep 4 2002 to Sep 6 2002

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2454 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 4th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2002
Country: France
City: Aix-en-Provence
Period: 9/4/02 to 9/6/02

Fingerprint

Clustering algorithms
Clustering Algorithm
High-dimensional
Clustering
Likelihood Principle
Synthetic Data
Distance Function
Similarity Measure
Large Data Sets
Maximum likelihood
Maximum Likelihood
Partitioning
Likelihood
Maximise
Optimise
Experimental Results
Similarity

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Zhu, S., Li, T., & Ogihara, M. (2002). CoFD: An algorithm for non-distance based clustering in high dimensional spaces. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2454 LNCS, pp. 52-62).

@inproceedings{0cb4b3b14cc54bc3a45b54697190bb2e,
title = "CoFD: An algorithm for non-distance based clustering in high dimensional spaces",
abstract = "The clustering problem, which aims to identify the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity clusters, has been widely studied. Traditional clustering algorithms use distance functions to measure similarity and are not suitable for high dimensional spaces. In this paper, we propose the CoFD algorithm, a non-distance based clustering algorithm for high dimensional spaces. Based on the maximum likelihood principle, CoFD optimizes its parameters to maximize the likelihood between the data points and the model generated by the parameters. Experimental results on both synthetic data sets and a real data set show the efficiency and effectiveness of CoFD.",
author = "Shenghuo Zhu and Tao Li and Mitsunori Ogihara",
year = "2002",
language = "English (US)",
isbn = "3540441239",
volume = "2454 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "52--62",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - CoFD

T2 - An algorithm for non-distance based clustering in high dimensional spaces

AU - Zhu, Shenghuo

AU - Li, Tao

AU - Ogihara, Mitsunori

PY - 2002

Y1 - 2002

AB - The clustering problem, which aims to identify the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity clusters, has been widely studied. Traditional clustering algorithms use distance functions to measure similarity and are not suitable for high dimensional spaces. In this paper, we propose the CoFD algorithm, a non-distance based clustering algorithm for high dimensional spaces. Based on the maximum likelihood principle, CoFD optimizes its parameters to maximize the likelihood between the data points and the model generated by the parameters. Experimental results on both synthetic data sets and a real data set show the efficiency and effectiveness of CoFD.

UR - http://www.scopus.com/inward/record.url?scp=84864857888&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84864857888&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84864857888

SN - 3540441239

SN - 9783540441236

VL - 2454 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 52

EP - 62

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -