Bandit problems with infinitely many arms

Donald A. Berry, Robert W. Chen, Alan Zame, David C. Heath, Larry A. Shepp

Research output: Contribution to journal › Article › peer-review

40 Scopus citations


We consider a bandit problem consisting of a sequence of n choices from an infinite number of Bernoulli arms, with n → ∞. The objective is to minimize the long-run failure rate. The Bernoulli parameters are independent observations from a distribution F. We first assume F to be the uniform distribution on (0, 1) and consider various extensions. In the uniform case we show that the best lower bound for the expected failure proportion is between √2/√n and 2/√n and we exhibit classes of strategies that achieve the latter.
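The abstract does not spell out the strategies it refers to. As a rough illustration of the setting only, the Python sketch below simulates one hypothetical rule in the spirit of "staying with a winner" and "switching with a loser": draw a fresh arm with success probability p from the uniform distribution on (0, 1), keep pulling it while it succeeds, discard it at its first failure, and commit permanently to the first arm that produces m successes in a row. The function names, the commit rule, and the choice m = ⌈√n⌉ are assumptions made for this example, not details taken from the abstract.

```python
import math
import random


def simulate_run_strategy(n, m, rng):
    """Play n pulls under a hypothetical run-based rule; return the failure proportion.

    Assumed rule (illustration only):
      * each new arm has success probability p ~ Uniform(0, 1);
      * before committing, an arm is discarded at its first failure;
      * the first arm to reach m consecutive successes is kept for all remaining pulls.
    """
    failures = 0
    committed = False
    p = rng.random()   # success probability of the current arm
    run = 0            # consecutive successes on the current arm

    for _ in range(n):
        if rng.random() < p:
            run += 1
            if run >= m:
                committed = True
        else:
            failures += 1
            if not committed:
                # Switch with a loser: draw a brand-new arm.
                p = rng.random()
                run = 0
    return failures / n


def mean_failure_proportion(n, reps, seed):
    """Average the failure proportion over independent replications."""
    rng = random.Random(seed)
    m = math.ceil(math.sqrt(n))  # assumed run length, chosen for illustration
    return sum(simulate_run_strategy(n, m, rng) for _ in range(reps)) / reps


if __name__ == "__main__":
    n = 10_000
    estimate = mean_failure_proportion(n, reps=100, seed=0)
    print(f"simulated failure proportion: {estimate:.4f}")
    print(f"sqrt(2)/sqrt(n) = {math.sqrt(2 / n):.4f},  2/sqrt(n) = {2 / math.sqrt(n):.4f}")
```

Running the script prints the simulated failure proportion next to √2/√n and 2/√n for the chosen n, so the estimate can be compared with the range quoted in the abstract. A Monte Carlo estimate of this kind is noisy and proves nothing about the asymptotic bounds.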

Original language: English (US)
Pages (from-to): 2103-2116
Number of pages: 14
Journal: Annals of Statistics
Issue number: 5
State: Published - Oct 1997


Keywords

  • Bandit problems
  • Dynamic allocation of Bernoulli processes
  • Sequential experimentation
  • Staying with a winner
  • Switching with a loser

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

