Regret Lower Bound and Optimal Algorithm in Finite Stochastic Partial Monitoring

Published in Advances in Neural Information Processing Systems (NIPS), 2015

This paper derives a regret lower bound for finite stochastic partial monitoring and develops an algorithm whose regret matches that bound. Partial monitoring generalizes the multi-armed bandit problem to settings where the learner receives only limited, indirect feedback about the outcomes rather than observing its loss directly.

Download paper here

Recommended citation: Komiyama, J., Honda, J., & Nakagawa, H. (2015). “Regret Lower Bound and Optimal Algorithm in Finite Stochastic Partial Monitoring.” In Advances in Neural Information Processing Systems 28 (NIPS 2015), 1792-1800.
