Rule mining and classification in imperfect databases

K. K. R. G. K. Hewawasam, Kamal Premaratne, S. P. Subasingha, Mei-Ling Shyu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Citations (Scopus)

Abstract

A rule-based classifier learns rules from a set of training data instances with assigned class labels and then uses those rules to assign a class label for a new incoming data instance. To accommodate data imperfections, a probabilistic relational model would represent the attributes by probabilistic functions. One extension to this model uses belief functions instead. Such an approach can represent a wider range of data imperfections. However, the task of extracting frequent patterns and rules from such a "belief theoretic" relational database has to overcome a potentially enormous computational burden. In this work, we present a data structure that is an alternate representation of a belief theoretic relational database. We then develop efficient algorithms to query for belief of itemsets, extract frequent itemsets and generate corresponding association rules from this representation. This set of rules is then used as the basis on which an unknown data instance, whose attributes are represented via belief functions, is classified. These algorithms are tested on a data set collected from a testbed that mimics an airport threat detection and classification scenario where both data attributes and threat class labels may possess imperfections.
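To make the quantities named in the abstract concrete, here is a minimal brute-force sketch of querying the belief of an itemset and deriving support and confidence for an association rule. It is not the data structure or the algorithms of the paper, which are not described on this page. The assumptions are illustrative: each attribute of a record carries a Dempster-Shafer mass function, the belief that a record matches an itemset is taken as the product of the per-attribute beliefs, and support is the average of that value over all records. The attribute names (`bag`, `threat`) and the helper names are hypothetical.

```python
from typing import Dict, FrozenSet, List

Mass = Dict[FrozenSet[str], float]    # focal set -> mass (masses sum to 1)
Record = Dict[str, Mass]              # attribute name -> mass function
Itemset = Dict[str, FrozenSet[str]]   # attribute name -> admissible values


def belief(mass: Mass, proposition: FrozenSet[str]) -> float:
    """Bel(A): total mass of the non-empty focal sets contained in A."""
    return sum(m for focal, m in mass.items() if focal and focal <= proposition)


def record_belief(record: Record, itemset: Itemset) -> float:
    """Belief that one record matches the itemset.

    Assumption: attributes are combined by a simple product.
    """
    result = 1.0
    for attribute, values in itemset.items():
        result *= belief(record[attribute], values)
    return result


def support(db: List[Record], itemset: Itemset) -> float:
    """Assumed support measure: average record belief over the database."""
    return sum(record_belief(r, itemset) for r in db) / len(db)


def confidence(db: List[Record], antecedent: Itemset, consequent: Itemset) -> float:
    """Confidence of antecedent -> consequent (assumes disjoint attributes)."""
    joint = {**antecedent, **consequent}
    denom = support(db, antecedent)
    return support(db, joint) / denom if denom > 0 else 0.0


def fs(*values: str) -> FrozenSet[str]:
    """Shorthand for building a frozenset of attribute values."""
    return frozenset(values)


# Hypothetical toy database: both the attribute and the class label may be imperfect.
db: List[Record] = [
    {"bag": {fs("large"): 1.0}, "threat": {fs("high"): 1.0}},
    {"bag": {fs("large"): 0.5, fs("small", "large"): 0.5},
     "threat": {fs("high"): 0.5, fs("low", "high"): 0.5}},
    {"bag": {fs("small"): 1.0}, "threat": {fs("low"): 1.0}},
    {"bag": {fs("small", "large"): 1.0}, "threat": {fs("high"): 1.0}},  # bag size unknown
]

antecedent: Itemset = {"bag": fs("large")}
consequent: Itemset = {"threat": fs("high")}

sup_antecedent = support(db, antecedent)
sup_rule = support(db, {**antecedent, **consequent})
conf_rule = confidence(db, antecedent, consequent)
```

Note that the fourth record, whose bag size is completely unknown, contributes nothing to the belief of `bag = large`. Mass assigned to the whole frame expresses ignorance rather than evidence, which is the kind of imperfection a purely probabilistic model cannot represent directly. Scanning every record for every query, as done here, is the computational burden that the abstract says the proposed representation is meant to reduce.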

Original language: English
Title of host publication: 2005 7th International Conference on Information Fusion, FUSION
Pages: 661-668
Number of pages: 8
Volume: 1
DOI: https://doi.org/10.1109/ICIF.2005.1591917
State: Published - Dec 1 2005
Event: 2005 7th International Conference on Information Fusion, FUSION - Philadelphia, PA, United States
Duration: Jul 25 2005 to Jul 28 2005

Other

Other: 2005 7th International Conference on Information Fusion, FUSION
Country: United States
City: Philadelphia, PA
Period: 7/25/05 to 7/28/05

Keywords

  • Association rules
  • Classification
  • Data ambiguities
  • Data imperfections
  • Data mining
  • Dempster-Shafer belief theory

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Hewawasam, K. K. R. G. K., Premaratne, K., Subasingha, S. P., & Shyu, M-L. (2005). Rule mining and classification in imperfect databases. In 2005 7th International Conference on Information Fusion, FUSION (Vol. 1, pp. 661-668). [1591917] https://doi.org/10.1109/ICIF.2005.1591917

@inproceedings{8601b39d05d147d1bbf30eda369f7add,
title = "Rule mining and classification in imperfect databases",
keywords = "Association rules, Classification, Data ambiguities, Data imperfections, Data mining, Dempster-Shafer belief theory",
author = "Hewawasam, {K. K R G K} and Kamal Premaratne and Subasingha, {S. P.} and Mei-Ling Shyu",
year = "2005",
month = "12",
day = "1",
doi = "10.1109/ICIF.2005.1591917",
language = "English",
isbn = "0780392868",
volume = "1",
pages = "661--668",
booktitle = "2005 7th International Conference on Information Fusion, FUSION",

}

TY  - GEN
T1  - Rule mining and classification in imperfect databases
AU  - Hewawasam, K. K R G K
AU  - Premaratne, Kamal
AU  - Subasingha, S. P.
AU  - Shyu, Mei-Ling
PY  - 2005/12/1
Y1  - 2005/12/1
KW  - Association rules
KW  - Classification
KW  - Data ambiguities
KW  - Data imperfections
KW  - Data mining
KW  - Dempster-Shafer belief theory
UR  - http://www.scopus.com/inward/record.url?scp=33847138707&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=33847138707&partnerID=8YFLogxK
U2  - 10.1109/ICIF.2005.1591917
DO  - 10.1109/ICIF.2005.1591917
M3  - Conference contribution
AN  - SCOPUS:33847138707
SN  - 0780392868
SN  - 9780780392861
VL  - 1
SP  - 661
EP  - 668
BT  - 2005 7th International Conference on Information Fusion, FUSION
ER  -