Document clustering via adaptive subspace iteration

Tao Li, Sheng Ma, Mitsunori Ogihara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

80 Scopus citations

Abstract

Document clustering has long been an important problem in information retrieval. In this paper, we present a new clustering algorithm, ASI, which explicitly models the subspace structure associated with each cluster. ASI simultaneously performs data reduction and subspace identification via an iterative alternating optimization procedure. Motivated by this optimization procedure, we then provide a novel method to determine the number of clusters. We also discuss the connections between ASI and various existing clustering approaches. Finally, extensive experimental results on real data sets show the effectiveness of the ASI algorithm.

Original language: English (US)
Title of host publication: Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery
Pages: 218-225
Number of pages: 8
ISBN (Print): 1581138814, 9781581138818
DOIs: https://doi.org/10.1145/1008992.1009031
State: Published - 2004
Event: Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - Sheffield, United Kingdom
Duration: Jul 25 2004 → Jul 29 2004

Publication series

Name: Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval

Other

Other: Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Country: United Kingdom
City: Sheffield
Period: 7/25/04 → 7/29/04

Keywords

  • Adaptive subspace identification
  • Alternating optimization
  • Document clustering
  • Factor analysis

ASJC Scopus subject areas

  • Engineering(all)

Cite this

Li, T., Ma, S., & Ogihara, M. (2004). Document clustering via adaptive subspace iteration. In Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 218-225). (Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval). Association for Computing Machinery. https://doi.org/10.1145/1008992.1009031