TY - GEN
T1 - Document clustering via adaptive subspace iteration
AU - Li, Tao
AU - Ma, Sheng
AU - Ogihara, Mitsunori
PY - 2004
Y1 - 2004
N2 - Document clustering has long been an important problem in information retrieval. In this paper, we present a new clustering algorithm, ASI, which explicitly models the subspace structure associated with each cluster. ASI simultaneously performs data reduction and subspace identification via an iterative alternating optimization procedure. Motivated by the optimization procedure, we then provide a novel method to determine the number of clusters. We also discuss the connections of ASI with various existing clustering approaches. Finally, extensive experimental results on real data sets show the effectiveness of the ASI algorithm.
AB - Document clustering has long been an important problem in information retrieval. In this paper, we present a new clustering algorithm, ASI, which explicitly models the subspace structure associated with each cluster. ASI simultaneously performs data reduction and subspace identification via an iterative alternating optimization procedure. Motivated by the optimization procedure, we then provide a novel method to determine the number of clusters. We also discuss the connections of ASI with various existing clustering approaches. Finally, extensive experimental results on real data sets show the effectiveness of the ASI algorithm.
KW - Adaptive subspace identification
KW - Alternating optimization
KW - Document clustering
KW - Factor analysis
UR - http://www.scopus.com/inward/record.url?scp=8644250640&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=8644250640&partnerID=8YFLogxK
U2 - 10.1145/1008992.1009031
DO - 10.1145/1008992.1009031
M3 - Conference contribution
AN - SCOPUS:8644250640
SN - 1581138814
SN - 9781581138818
T3 - Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
SP - 218
EP - 225
BT - Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
PB - Association for Computing Machinery
T2 - Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Y2 - 25 July 2004 through 29 July 2004
ER -