Stochastic clustering for organizing distributed information sources

Mei-Ling Shyu, Shu Ching Chen, Stuart H. Rubin

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

The number of information sources and the volumes of data in these information sources have greatly increased, which may be attributed to the ever-increasing complexity of real-world applications. The enormous amount of information available in the information sources in a distributed information-providing environment has created a need to provide users with tools to effectively and efficiently navigate and retrieve information. Queries in such an environment often access information from multiple information sources. This may be attributed to navigational characteristics. Clusters provide a structure for organizing the large number of information sources for efficient browsing, searching, and retrieval. This paper presents a stochastically-based clustering mechanism, called the Markov model mediator (MMM), to group the information sources into a set of useful clusters. Each information source cluster groups those information sources that show similarities in their data access behavior. Information sources within the same cluster are expected to be able to provide most of the required information among themselves for user queries that are closely related with respect to a particular application. This can significantly improve system response time and query performance, and result in an overall improvement in decision support. Empirical studies on real databases are performed and the results demonstrate that our proposed mechanism leads to a better set of clusters in comparison with other clustering methods. This serves to illustrate the effectiveness of our proposed MMM mechanism.

Original language: English
Pages (from-to): 2035-2047
Number of pages: 13
Journal: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Volume: 34
Issue number: 5
DOI: 10.1109/TSMCB.2004.833599
State: Published - Oct 1 2004

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Artificial Intelligence
  • Human-Computer Interaction

Cite this

Stochastic clustering for organizing distributed information sources. / Shyu, Mei-Ling; Chen, Shu Ching; Rubin, Stuart H.

In: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 34, No. 5, 01.10.2004, p. 2035-2047.

@article{5e952d6d7804460f8ebc06a390229295,
title = "Stochastic clustering for organizing distributed information sources",
abstract = "The number of information sources and the volumes of data in these information sources have greatly increased, which may be attributed to the ever-increasing complexity of real-world applications. The enormous amount of information available in the information sources in a distributed information-providing environment has created a need to provide users with tools to effectively and efficiently navigate and retrieve information. Queries in such an environment often access information from multiple information sources. This may be attributed to navigational characteristics. Clusters provide a structure for organizing the large number of information sources for efficient browsing, searching, and retrieval. This paper presents a stochastically-based clustering mechanism, called the Markov model mediator (MMM), to group the information sources into a set of useful clusters. Each information source cluster groups those information sources that show similarities in their data access behavior. Information sources within the same cluster are expected to be able to provide most of the required information among themselves for user queries that are closely related with respect to a particular application. This can significantly improve system response time and query performance, and result in an overall improvement in decision support. Empirical studies on real databases are performed and the results demonstrate that our proposed mechanism leads to a better set of clusters in comparison with other clustering methods. This serves to illustrate the effectiveness of our proposed MMM mechanism.",
author = "Shyu, Mei-Ling and Chen, {Shu Ching} and Rubin, {Stuart H.}",
year = "2004",
month = "10",
day = "1",
doi = "10.1109/TSMCB.2004.833599",
language = "English",
volume = "34",
pages = "2035--2047",
journal = "IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics",
issn = "1083-4419",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "5",
}

TY  - JOUR
T1  - Stochastic clustering for organizing distributed information sources
AU  - Shyu, Mei-Ling
AU  - Chen, Shu Ching
AU  - Rubin, Stuart H.
PY  - 2004/10/1
Y1  - 2004/10/1
N2  - The number of information sources and the volumes of data in these information sources have greatly increased, which may be attributed to the ever-increasing complexity of real-world applications. The enormous amount of information available in the information sources in a distributed information-providing environment has created a need to provide users with tools to effectively and efficiently navigate and retrieve information. Queries in such an environment often access information from multiple information sources. This may be attributed to navigational characteristics. Clusters provide a structure for organizing the large number of information sources for efficient browsing, searching, and retrieval. This paper presents a stochastically-based clustering mechanism, called the Markov model mediator (MMM), to group the information sources into a set of useful clusters. Each information source cluster groups those information sources that show similarities in their data access behavior. Information sources within the same cluster are expected to be able to provide most of the required information among themselves for user queries that are closely related with respect to a particular application. This can significantly improve system response time and query performance, and result in an overall improvement in decision support. Empirical studies on real databases are performed and the results demonstrate that our proposed mechanism leads to a better set of clusters in comparison with other clustering methods. This serves to illustrate the effectiveness of our proposed MMM mechanism.
AB  - The number of information sources and the volumes of data in these information sources have greatly increased, which may be attributed to the ever-increasing complexity of real-world applications. The enormous amount of information available in the information sources in a distributed information-providing environment has created a need to provide users with tools to effectively and efficiently navigate and retrieve information. Queries in such an environment often access information from multiple information sources. This may be attributed to navigational characteristics. Clusters provide a structure for organizing the large number of information sources for efficient browsing, searching, and retrieval. This paper presents a stochastically-based clustering mechanism, called the Markov model mediator (MMM), to group the information sources into a set of useful clusters. Each information source cluster groups those information sources that show similarities in their data access behavior. Information sources within the same cluster are expected to be able to provide most of the required information among themselves for user queries that are closely related with respect to a particular application. This can significantly improve system response time and query performance, and result in an overall improvement in decision support. Empirical studies on real databases are performed and the results demonstrate that our proposed mechanism leads to a better set of clusters in comparison with other clustering methods. This serves to illustrate the effectiveness of our proposed MMM mechanism.
UR  - http://www.scopus.com/inward/record.url?scp=4844220829&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=4844220829&partnerID=8YFLogxK
U2  - 10.1109/TSMCB.2004.833599
DO  - 10.1109/TSMCB.2004.833599
M3  - Article
C2  - 15503499
AN  - SCOPUS:4844220829
VL  - 34
SP  - 2035
EP  - 2047
JO  - IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
JF  - IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
SN  - 1083-4419
IS  - 5
ER  -