A reduction technique for nearest-neighbor classification: Small groups of examples

Miroslav Kubat, Martin Cooperson

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

An important issue in nearest-neighbor classifiers is how to reduce the size of large sets of examples. Whereas many researchers recommend replacing the original set with a carefully selected subset, we investigate a mechanism that creates three or more such subsets. The idea is to make sure that each of them, when used as a 1-NN subclassifier, tends to err in a different part of the instance space; failures of individual subclassifiers can then be corrected by voting. The cost of our example-selection procedure is linear in the size of the original training set and, as our experiments demonstrate, dramatic data reduction can be achieved without a major drop in classification accuracy.
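To illustrate the voting scheme described in the abstract, here is a minimal Python sketch. It is not the authors' example-selection procedure: a simple random split of the training set into groups stands in for their linear-time selection mechanism, which the abstract does not detail. Each group acts as an independent 1-NN subclassifier, and the majority label wins.

import numpy as np

def build_subclassifiers(X, y, n_groups=3, seed=0):
    # Split the training set into n_groups example subsets, one per 1-NN voter.
    # NOTE: random partitioning is only a placeholder for the paper's selection procedure.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    return [(X[part], y[part]) for part in np.array_split(idx, n_groups)]

def one_nn_predict(Xs, ys, query):
    # Classic 1-NN: return the label of the closest stored example.
    dists = np.linalg.norm(Xs - query, axis=1)
    return ys[np.argmin(dists)]

def vote_predict(groups, query):
    # Each subset predicts independently; the majority label is returned.
    votes = [one_nn_predict(Xs, ys, query) for Xs, ys in groups]
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]

# Example usage with hypothetical data:
# X, y = np.random.rand(300, 2), np.random.randint(0, 2, 300)
# groups = build_subclassifiers(X, y)
# label = vote_predict(groups, X[0])

With three or more groups (an odd number avoids tied votes on two-class problems), each voter stores only a fraction of the original examples, which is where the data reduction comes from.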

Original language: English
Pages (from-to): 463-476
Number of pages: 14
Journal: Intelligent Data Analysis
Volume: 5
Issue number: 6
State: Published - Dec 1 2001

Keywords

  • example selection
  • nearest-neighbor classifiers
  • voting

ASJC Scopus subject areas

  • Artificial Intelligence
  • Theoretical Computer Science
  • Computer Vision and Pattern Recognition

Cite this

A reduction technique for nearest-neighbor classification: Small groups of examples. / Kubat, Miroslav; Cooperson, Martin.

In: Intelligent Data Analysis, Vol. 5, No. 6, 01.12.2001, p. 463-476.

Research output: Contribution to journal › Article

@article{7068ae92222346c8adaedd314e008cab,
title = "A reduction technique for nearest-neighbor classification: Small groups of examples",
abstract = "An important issue in nearest-neighbor classifiers is how to reduce the size of large sets of examples. Whereas many researchers recommend replacing the original set with a carefully selected subset, we investigate a mechanism that creates three or more such subsets. The idea is to make sure that each of them, when used as a 1-NN subclassifier, tends to err in a different part of the instance space; failures of individual subclassifiers can then be corrected by voting. The cost of our example-selection procedure is linear in the size of the original training set and, as our experiments demonstrate, dramatic data reduction can be achieved without a major drop in classification accuracy.",
keywords = "example selection, nearest-neighbor classifiers, voting",
author = "Miroslav Kubat and Martin Cooperson",
year = "2001",
month = "12",
day = "1",
language = "English",
volume = "5",
pages = "463--476",
journal = "Intelligent Data Analysis",
issn = "1088-467X",
publisher = "IOS Press",
number = "6",

}

TY - JOUR

T1 - A reduction technique for nearest-neighbor classification

T2 - Small groups of examples

AU - Kubat, Miroslav

AU - Cooperson, Martin

PY - 2001/12/1

Y1 - 2001/12/1

N2 - An important issue in nearest-neighbor classifiers is how to reduce the size of large sets of examples. Whereas many researchers recommend replacing the original set with a carefully selected subset, we investigate a mechanism that creates three or more such subsets. The idea is to make sure that each of them, when used as a 1-NN subclassifier, tends to err in a different part of the instance space; failures of individual subclassifiers can then be corrected by voting. The cost of our example-selection procedure is linear in the size of the original training set and, as our experiments demonstrate, dramatic data reduction can be achieved without a major drop in classification accuracy.

AB - An important issue in nearest-neighbor classifiers is how to reduce the size of large sets of examples. Whereas many researchers recommend replacing the original set with a carefully selected subset, we investigate a mechanism that creates three or more such subsets. The idea is to make sure that each of them, when used as a 1-NN subclassifier, tends to err in a different part of the instance space; failures of individual subclassifiers can then be corrected by voting. The cost of our example-selection procedure is linear in the size of the original training set and, as our experiments demonstrate, dramatic data reduction can be achieved without a major drop in classification accuracy.

KW - example selection

KW - nearest-neighbor classifiers

KW - voting

UR - http://www.scopus.com/inward/record.url?scp=46149086948&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=46149086948&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:46149086948

VL - 5

SP - 463

EP - 476

JO - Intelligent Data Analysis

JF - Intelligent Data Analysis

SN - 1088-467X

IS - 6

ER -