Regret Lower Bound and Optimal Algorithm in Finite Stochastic Partial Monitoring
Published in Advances in Neural Information Processing Systems (NIPS), 2015
This paper establishes a regret lower bound and develops an optimal algorithm for finite stochastic partial monitoring. It extends bandit theory to partial-feedback settings, in which the learner observes only limited information about the outcomes, and is a theoretical contribution to online learning.
Recommended citation: Komiyama, J., Honda, J., & Nakagawa, H. (2015). “Regret Lower Bound and Optimal Algorithm in Finite Stochastic Partial Monitoring.” In Advances in Neural Information Processing Systems 28 (NIPS 2015), 1792-1800.