Active learning for streaming data in a contextual bandit framework

Linqi Song, Jie Xu, Congduan Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Contextual bandit algorithms have been shown to be effective in solving sequential decision-making problems under uncertainty. A common assumption in the literature is that the realized (ground truth) reward is observed by the learner at no cost, which is not realistic in many practical scenarios. When observing the ground truth reward is costly, a key challenge is how to judiciously acquire the ground truth by weighing the benefits against the costs, so as to balance learning efficiency and learning cost. In this paper, we design a novel contextual bandit-based learning algorithm and endow it with active learning capability. When a query for the ground truth is sent to an annotator, the prior information about the ground truth that the learner has already acquired is sent along with it, thereby reducing the query cost. We prove that the learning regret of the proposed algorithm achieves the same order as that of conventional contextual bandit algorithms in cost-free scenarios. This implies that, surprisingly, the cost of acquiring the ground truth does not increase the learning regret in the long run, and that the prior information about the ground truth plays a critical role in this result.

Original language: English (US)
Title of host publication: Proceedings of the 2019 5th International Conference on Computing and Data Engineering, ICCDE 2019
Publisher: Association for Computing Machinery
Pages: 29-35
Number of pages: 7
ISBN (Electronic): 9781450361248
DOIs: https://doi.org/10.1145/3330530.3330543
State: Published - May 4 2019
Event: 5th International Conference on Computing and Data Engineering, ICCDE 2019 - Shanghai, China
Duration: May 4 2019 to May 6 2019

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 5th International Conference on Computing and Data Engineering, ICCDE 2019
Country: China
City: Shanghai
Period: 5/4/19 to 5/6/19

Keywords

  • Active learning
  • Contextual bandits
  • Streaming data

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Software

Cite this

Song, L., Xu, J., & Li, C. (2019). Active learning for streaming data in a contextual bandit framework. In Proceedings of the 2019 5th International Conference on Computing and Data Engineering, ICCDE 2019 (pp. 29-35). (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/3330530.3330543