Semisupervised learning from different information sources

Research output: Contribution to journal, Article

24 Citations (Scopus)

Abstract

This paper studies semisupervised learning from different information sources. We first offer a theoretical explanation of why minimising the disagreement between individual models could lead to improved performance. Based on this observation, the paper proposes a semisupervised learning approach that attempts to minimise this disagreement by employing a co-updating method and making use of both labeled and unlabeled data. Three experiments testing the effectiveness of the approach are presented: (i) webpage classification from both content and hyperlinks; (ii) functional classification of genes using gene expression data and phylogenetic data; and (iii) machine self-maintenance from both sensory and image data. The results show the effectiveness and efficiency of the approach and suggest its potential for application. Springer-Verlag London Ltd.
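
The idea of reducing disagreement between two models trained on different information sources can be illustrated with a small sketch. This is not the paper's algorithm: it is a generic co-training-style loop, assuming two one-dimensional views of each example and a hypothetical nearest-centroid classifier per view. The names `fit_centroids`, `predict`, `disagreement`, and `co_update`, and the toy data, are all invented for illustration. In each round, each model is refit on the labeled data plus the other model's pseudo-labels for the unlabeled data, and the fraction of unlabeled examples on which the two models disagree is recorded.

```python
from statistics import mean


def fit_centroids(xs, ys):
    """Fit a 1-D nearest-centroid model: one mean feature value per class."""
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in sorted(set(ys))}


def predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))


def disagreement(m1, m2, view1, view2):
    """Fraction of examples on which the two view-specific models differ."""
    differing = sum(predict(m1, a) != predict(m2, b) for a, b in zip(view1, view2))
    return differing / len(view1)


def co_update(lab1, lab2, labels, unl1, unl2, rounds=3):
    """Refit each model on labeled data plus the OTHER model's pseudo-labels."""
    m1 = fit_centroids(lab1, labels)
    m2 = fit_centroids(lab2, labels)
    history = [disagreement(m1, m2, unl1, unl2)]
    for _ in range(rounds):
        pseudo1 = [predict(m1, a) for a in unl1]  # model 1's guesses
        pseudo2 = [predict(m2, b) for b in unl2]  # model 2's guesses
        m1 = fit_centroids(lab1 + unl1, labels + pseudo2)
        m2 = fit_centroids(lab2 + unl2, labels + pseudo1)
        history.append(disagreement(m1, m2, unl1, unl2))
    return m1, m2, history


# Toy data (assumed): two labeled examples and five unlabeled ones,
# each seen through two views.
labels = [0, 1]
lab1, lab2 = [0.0, 1.0], [0.0, 1.0]
unl1 = [0.1, 0.2, 0.8, 0.9, 0.45]
unl2 = [0.2, 0.1, 0.9, 0.7, 0.6]

m1, m2, history = co_update(lab1, lab2, labels, unl1, unl2)
print(history)
```

On this toy data the two models initially disagree on one of the five unlabeled examples, and the disagreement drops to zero after the first round of mutual updating.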

Original language: English (US)
Pages (from-to): 289-309
Number of pages: 21
Journal: Knowledge and Information Systems
Volume: 7
Issue number: 3
DOI: 10.1007/s10115-004-0155-8
State: Published - Mar 2005
Externally published: Yes

Fingerprint

Gene expression
Learning algorithms
Genes
Experiments

Keywords

  • Decision tree
  • Minimise disagreement
  • Semisupervised
  • Support vector machines
  • Unlabelled data

ASJC Scopus subject areas

  • Information Systems

Cite this

Semisupervised learning from different information sources. / Li, Tao; Ogihara, Mitsunori.

In: Knowledge and Information Systems, Vol. 7, No. 3, 03.2005, p. 289-309.

@article{1ded45bb2fdc4427a8774568386be937,
title = "Semisupervised learning from different information sources",
abstract = "This paper studies semisupervised learning from different information sources. We first offer a theoretical explanation of why minimising the disagreement between individual models could lead to improved performance. Based on this observation, the paper proposes a semisupervised learning approach that attempts to minimise this disagreement by employing a co-updating method and making use of both labeled and unlabeled data. Three experiments testing the effectiveness of the approach are presented: (i) webpage classification from both content and hyperlinks; (ii) functional classification of genes using gene expression data and phylogenetic data; and (iii) machine self-maintenance from both sensory and image data. The results show the effectiveness and efficiency of the approach and suggest its potential for application. Springer-Verlag London Ltd.",
keywords = "Decision tree, Minimise disagreement, Semisupervised, Support vector machines, Unlabelled data",
author = "Tao Li and Mitsunori Ogihara",
year = "2005",
month = "3",
doi = "10.1007/s10115-004-0155-8",
language = "English (US)",
volume = "7",
pages = "289--309",
journal = "Knowledge and Information Systems",
issn = "0219-1377",
publisher = "Springer London",
number = "3",

}

TY  - JOUR
T1  - Semisupervised learning from different information sources
AU  - Li, Tao
AU  - Ogihara, Mitsunori
PY  - 2005/3
Y1  - 2005/3
N2  - This paper studies semisupervised learning from different information sources. We first offer a theoretical explanation of why minimising the disagreement between individual models could lead to improved performance. Based on this observation, the paper proposes a semisupervised learning approach that attempts to minimise this disagreement by employing a co-updating method and making use of both labeled and unlabeled data. Three experiments testing the effectiveness of the approach are presented: (i) webpage classification from both content and hyperlinks; (ii) functional classification of genes using gene expression data and phylogenetic data; and (iii) machine self-maintenance from both sensory and image data. The results show the effectiveness and efficiency of the approach and suggest its potential for application. Springer-Verlag London Ltd.
AB  - This paper studies semisupervised learning from different information sources. We first offer a theoretical explanation of why minimising the disagreement between individual models could lead to improved performance. Based on this observation, the paper proposes a semisupervised learning approach that attempts to minimise this disagreement by employing a co-updating method and making use of both labeled and unlabeled data. Three experiments testing the effectiveness of the approach are presented: (i) webpage classification from both content and hyperlinks; (ii) functional classification of genes using gene expression data and phylogenetic data; and (iii) machine self-maintenance from both sensory and image data. The results show the effectiveness and efficiency of the approach and suggest its potential for application. Springer-Verlag London Ltd.
KW  - Decision tree
KW  - Minimise disagreement
KW  - Semisupervised
KW  - Support vector machines
KW  - Unlabelled data
UR  - http://www.scopus.com/inward/record.url?scp=14844303546&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=14844303546&partnerID=8YFLogxK
U2  - 10.1007/s10115-004-0155-8
DO  - 10.1007/s10115-004-0155-8
M3  - Article
VL  - 7
SP  - 289
EP  - 309
JO  - Knowledge and Information Systems
JF  - Knowledge and Information Systems
SN  - 0219-1377
IS  - 3
ER  - 