Induction in multi-label text classification domains

Miroslav Kubat, Kanoksri Sarinnapakorn, Sareewan Dendamrongvit

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

Abstract

Automated classification of text documents has two distinctive aspects. First, each training or testing example can be labeled with two or more classes at the same time; this has serious consequences not only for the induction algorithms, but also for how we evaluate the performance of the induced classifier. Second, the examples are usually described by a great many attributes, which makes induction from hundreds of thousands of training examples prohibitively expensive. Both issues have been addressed in the recent machine-learning literature, but the behavior of existing solutions in real-world domains is still far from satisfactory. Here, we describe our own technique and report experiments with a concrete text database.

Original language: English
Title of host publication: Studies in Computational Intelligence
Pages: 225-244
Number of pages: 20
Volume: 263
DOI: https://doi.org/10.1007/978-3-642-05179-1_11
State: Published - Jan 19 2010

Publication series

Name: Studies in Computational Intelligence
Volume: 263
ISSN (Print): 1860-949X

Keywords

  • Classifier induction
  • Dempster-Shafer theory
  • Information fusion
  • Multi-label examples
  • Text classification

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Kubat, M., Sarinnapakorn, K., & Dendamrongvit, S. (2010). Induction in multi-label text classification domains. In Studies in Computational Intelligence (Vol. 263, pp. 225-244). https://doi.org/10.1007/978-3-642-05179-1_11

@inbook{cf1422db817a4687a34de19a05b1dc51,
title = "Induction in multi-label text classification domains",
abstract = "Automated classification of text documents has two distinctive aspects. First, each training or testing example can be labeled with two or more classes at the same time; this has serious consequences not only for the induction algorithms, but also for how we evaluate the performance of the induced classifier. Second, the examples are usually described by a great many attributes, which makes induction from hundreds of thousands of training examples prohibitively expensive. Both issues have been addressed in the recent machine-learning literature, but the behavior of existing solutions in real-world domains is still far from satisfactory. Here, we describe our own technique and report experiments with a concrete text database.",
keywords = "Classifier induction, Dempster-Shafer theory, Information fusion, Multi-label examples, Text classification",
author = "Miroslav Kubat and Kanoksri Sarinnapakorn and Sareewan Dendamrongvit",
year = "2010",
month = "1",
day = "19",
doi = "10.1007/978-3-642-05179-1_11",
language = "English",
isbn = "9783642051784",
volume = "263",
series = "Studies in Computational Intelligence",
pages = "225--244",
booktitle = "Studies in Computational Intelligence",

}

TY - CHAP
T1 - Induction in multi-label text classification domains
AU - Kubat, Miroslav
AU - Sarinnapakorn, Kanoksri
AU - Dendamrongvit, Sareewan
PY - 2010/1/19
Y1 - 2010/1/19
N2 - Automated classification of text documents has two distinctive aspects. First, each training or testing example can be labeled with two or more classes at the same time; this has serious consequences not only for the induction algorithms, but also for how we evaluate the performance of the induced classifier. Second, the examples are usually described by a great many attributes, which makes induction from hundreds of thousands of training examples prohibitively expensive. Both issues have been addressed in the recent machine-learning literature, but the behavior of existing solutions in real-world domains is still far from satisfactory. Here, we describe our own technique and report experiments with a concrete text database.
KW - Classifier induction
KW - Dempster-Shafer theory
KW - Information fusion
KW - Multi-label examples
KW - Text classification
UR - http://www.scopus.com/inward/record.url?scp=74049099037&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=74049099037&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-05179-1_11
DO - 10.1007/978-3-642-05179-1_11
M3 - Chapter
SN - 9783642051784
VL - 263
T3 - Studies in Computational Intelligence
SP - 225
EP - 244
BT - Studies in Computational Intelligence
ER -