Handling ambiguous values in instance-based classifiers

Hans Holland, Miroslav Kubat, Jan Žižka

Research output: Contribution to journal (article)

Abstract

In an attempt to automate the evaluation of network intrusion detection systems, we encountered the problem of ambiguously described learning examples. For instance, an attribute's value, or a class label, in a given example was known to be a or b, but definitely not c or d. Previous research in machine learning usually either "disambiguated" the value (by giving preference to a or b) or replaced it with a "don't-know" symbol. Neither approach is satisfactory: the former distorts the available information by pretending to precise knowledge, while the latter ignores the fact that at least something is known. Our experiments confirm the intuition that classification performance is indeed impaired if the ambiguities are not handled properly. In the research reported here, we limited ourselves to the realm of relatively simple nearest-neighbor classifiers and investigated a few alternative solutions. The paper describes the techniques we used and reports their behavior in experimental domains.
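
The abstract does not specify the techniques themselves, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that each nominal attribute value (and each class label) is stored as a set of still-possible values, and that the distance between two such sets is one minus their Jaccard overlap. The names `attr_distance`, `distance`, and `classify` are hypothetical.

```python
from collections import Counter

# Assumption: every attribute value and class label is a non-empty set of
# the values that are still considered possible for that example.

def attr_distance(x, y):
    """Distance between two set-valued nominal attributes.

    Assumed measure: 1 minus the Jaccard overlap of the two sets.
    Identical sets give 0.0, disjoint sets give 1.0.
    """
    x, y = frozenset(x), frozenset(y)
    return 1.0 - len(x & y) / len(x | y)

def distance(a, b):
    """Sum of per-attribute distances between two examples."""
    return sum(attr_distance(x, y) for x, y in zip(a, b))

def classify(query, examples, k=1):
    """k-nearest-neighbor vote; an ambiguous label splits its vote evenly."""
    nearest = sorted(examples, key=lambda ex: distance(query, ex[0]))[:k]
    votes = Counter()
    for _attrs, labels in nearest:
        for label in labels:
            votes[label] += 1.0 / len(labels)
    return votes.most_common(1)[0][0]

# One attribute with domain {a, b, c, d}.
examples = [
    ([{"a", "b"}], {"pos"}),  # value known to be a or b, definitely not c or d
    ([{"c"}], {"neg"}),
]
```

Under these assumptions, the ambiguous value {a, b} stays at the maximum distance of 1.0 from c, whereas a "don't-know" encoding as the full domain {a, b, c, d} would place it at 0.75 from c and so discard the knowledge that c is excluded.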

Original language: English (US)
Pages (from-to): 449-463
Number of pages: 15
Journal: International Journal on Artificial Intelligence Tools
Volume: 17
Issue number: 3
State: Published - Jun 1 2008

Keywords

  • Ambiguous attributes
  • Instance-based classifiers

ASJC Scopus subject areas

  • Artificial Intelligence
