Induction from multi-label examples

Hind Hazza Alsharif, Wadee Saleh Alhalabi, Miroslav Kubat

Research output: Contribution to journal › Article

Abstract

The task of text categorization is to assign one or more classes to a document. The simplest machine learning approach to such domains induces a binary classifier separately for each class and then applies these classifiers in parallel. A motivating application is a digital library whose collection is organized into classes and sub-classes in a hierarchical order. A further issue we consider is that a document may belong to more than one class, in which case a high-performance multi-label classifier is required. The study reported here examines how much we can gain from machine learning: do we need only something like 10 to 15% of the data for training and testing, or do we need more than 50% of the data set? In the latter case, machine learning may not contribute much; if 10 to 15% of the data set suffices, however, its contribution is substantial. The last issue we address is the inter-class relation: if an example is classified as belonging to a class C, does it also belong to the parent and grandparent classes of C, and conversely? We use a framework that classifies documents automatically, and this framework allows us to answer these questions.
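
The "one binary classifier per class, applied in parallel" strategy described in the abstract is the standard binary-relevance scheme for multi-label learning. The sketch below is purely illustrative and is not the authors' implementation: the scikit-learn setup, the TF-IDF features, the Naïve Bayes base learner (a guess suggested only by the paper's keyword list), the toy documents, and the hand-written class hierarchy are all assumptions.

```python
# Illustrative sketch only -- not the implementation used in the paper.
# Binary relevance: induce one binary classifier per class, then apply
# all of them in parallel to each document. Assumed tools: scikit-learn,
# TF-IDF features, and a Naive Bayes base learner (suggested only by the
# paper's keyword list).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MultiLabelBinarizer

# Toy multi-label documents (hypothetical data, for illustration only).
docs = [
    "gene expression in yeast cells",
    "machine learning for text categorization",
    "indexing and retrieval in a digital library",
]
labels = [
    {"genetics", "biology"},
    {"machine-learning", "computing"},
    {"libraries", "computing"},
]

# Encode the label sets as a binary indicator matrix (one column per class).
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# Bag-of-words features weighted by TF-IDF.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)

# OneVsRestClassifier trains one MultinomialNB per class and runs them
# in parallel at prediction time -- the binary-relevance scheme.
model = OneVsRestClassifier(MultinomialNB()).fit(X, Y)

# Predict the label set of a new document.
new_X = vectorizer.transform(["gene databases in digital libraries"])
print(mlb.inverse_transform(model.predict(new_X)))

# Hypothetical class hierarchy, to illustrate the inter-class question:
# if a document belongs to class C, should C's parent and grandparent
# classes be added as well?
parents = {
    "genetics": "biology",
    "machine-learning": "computing",
    "libraries": "computing",
}

def with_ancestors(label_set):
    """Propagate each predicted label up to its parents and grandparents."""
    closed = set(label_set)
    for label in label_set:
        while label in parents:
            label = parents[label]
            closed.add(label)
    return closed
```

The `with_ancestors` helper (a hypothetical name) corresponds to answering "yes" to the abstract's inter-class question in the upward direction only; whether labels should also propagate in the opposite direction is left open, as in the abstract.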

Original language: English
Article number: 67
Pages (from-to): 495-511
Number of pages: 17
Journal: Life Science Journal
Volume: 11
Issue number: 10
State: Published - Jan 1 2014

Fingerprint

Learning systems
Labels
Classifiers
Digital libraries
Testing
Parents
Machine Learning
Research
Datasets

Keywords

  • Induction process
  • Inter-class relation
  • KNN algorithms
  • Multi-label classifiers
  • Naïve Bayes algorithms
  • Text categorization

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)

Cite this

Alsharif, H. H., Alhalabi, W. S., & Kubat, M. (2014). Induction from multi-label examples. Life Science Journal, 11(10), 495-511. [67].

@article{01c97ad945054941b02b478a3c136d33,
title = "Induction from multi-label examples",
keywords = "Induction process, Inter-class relation, KNN algorithms, Multi-label classifiers, Na{\"i}ve Bayes algorithms, Text categorization",
author = "Alsharif, {Hind Hazza} and Alhalabi, {Wadee Saleh} and Miroslav Kubat",
year = "2014",
month = "1",
day = "1",
language = "English",
volume = "11",
pages = "495--511",
journal = "Life Science Journal",
issn = "1097-8135",
publisher = "Zhengzhou University",
number = "10",

}

TY - JOUR

T1 - Induction from multi-label examples

AU - Alsharif, Hind Hazza

AU - Alhalabi, Wadee Saleh

AU - Kubat, Miroslav

PY - 2014/1/1

Y1 - 2014/1/1

KW - Induction process

KW - Inter-class relation

KW - KNN algorithms

KW - Multi-label classifiers

KW - Naïve Bayes algorithms

KW - Text categorization

UR - http://www.scopus.com/inward/record.url?scp=84903215876&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84903215876&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84903215876

VL - 11

SP - 495

EP - 511

JO - Life Science Journal

JF - Life Science Journal

SN - 1097-8135

IS - 10

M1 - 67

ER -