Journal of Applied Mathematics and Stochastic Analysis 
Volume 12 (1999), Issue 2, Pages 151-160
doi:10.1155/S1048953399000155

Exact solution of the Bellman equation for a β-discounted reward in a two-armed bandit with switching arms

Doncho S. Donchev

Higher Institute of Food and Flavor Industries, 26, Maritza str., Plovdiv 4002, Bulgaria

Received 1 May 1998; Revised 1 October 1998

Abstract

We consider the symmetric Poissonian two-armed bandit problem. For the case of switching arms, only one of which generates reward, we explicitly solve the Bellman equation for a β-discounted reward and prove that a myopic policy is optimal.
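
For orientation only, the sketch below solves a Bellman equation numerically for a discrete-time analogue of such a problem. It is not the continuous-time Poissonian model of the paper, and it is not the author's method. Everything in it is an illustrative assumption: exactly one arm is rewarding at each step, the rewarding arm pays 1 with probability `p`, the roles of the arms swap with probability `q` after each step, `x` denotes the belief that arm 1 is currently the rewarding one, and all function names are hypothetical. The value function is approximated by value iteration on a uniform belief grid with linear interpolation.

```python
def drift(b, q):
    """Belief that arm 1 is rewarding after the arms may have switched."""
    return b * (1 - q) + (1 - b) * q


def interp(V, b):
    """Linear interpolation of V, tabulated on a uniform grid over [0, 1]."""
    n = len(V) - 1
    pos = min(max(b, 0.0), 1.0) * n
    i = min(int(pos), n - 1)
    w = pos - i
    return (1 - w) * V[i] + w * V[i + 1]


def action_values(V, x, p, q, beta):
    """Discounted value of pulling arm 1 or arm 2 at belief x (assumed model)."""
    # Arm 1: a reward reveals that arm 1 is the rewarding arm (belief 1);
    # no reward lowers the belief by Bayes' rule. Then the arms may switch.
    hit1 = x * p
    miss1 = drift(x * (1 - p) / (1 - x * p), q)
    q1 = (hit1 * (1 + beta * interp(V, drift(1.0, q)))
          + (1 - hit1) * beta * interp(V, miss1))
    # Arm 2: symmetric; a reward reveals that arm 1 is not rewarding (belief 0).
    hit2 = (1 - x) * p
    miss2 = drift(x / (1 - (1 - x) * p), q)
    q2 = (hit2 * (1 + beta * interp(V, drift(0.0, q)))
          + (1 - hit2) * beta * interp(V, miss2))
    return q1, q2


def solve_bellman(p=0.3, q=0.1, beta=0.9, n_grid=201, tol=1e-10, max_iter=10000):
    """Value iteration for V(x) = max(q1(x), q2(x)) on a belief grid."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    V = [0.0] * n_grid
    for _ in range(max_iter):
        V_new = [max(action_values(V, x, p, q, beta)) for x in grid]
        gap = max(abs(a - b) for a, b in zip(V_new, V))
        V = V_new
        if gap < tol:
            break
    return grid, V


def myopic_arm(x):
    """Myopic rule in this sketch: pull the arm with the larger immediate reward."""
    return 1 if x >= 0.5 else 2
```

To explore how the myopic rule relates to the numerical solution in this toy setting, one can call `solve_bellman()`, evaluate `action_values` at each grid point, and compare the arm with the larger value against `myopic_arm(x)`. Whether the two agree for a given choice of `p`, `q`, and `beta` is a property of this assumed discrete model and should not be read as a restatement of the paper's theorem.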