Copeland Dueling Bandit Problem: Regret Lower Bound, Optimal Algorithm, and Computationally Efficient Algorithm

Published in International Conference on Machine Learning (ICML), 2016

This paper analyzes the Copeland dueling bandit problem, a preference-based learning setting in which feedback is limited to pairwise comparisons between arms. We derive an asymptotic regret lower bound, propose an algorithm whose regret is asymptotically optimal, and give a computationally efficient version of that algorithm.
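
For readers new to the setting, the short sketch below illustrates what a Copeland winner is: an arm that beats the largest number of other arms in pairwise duels. It is only an illustration of the definition, not one of the paper's algorithms. The function name `copeland_winners`, the matrix `PREF`, and its values are hypothetical, and the sketch assumes the true preference probabilities are known, which a bandit learner would instead have to estimate.

```python
def copeland_winners(pref):
    """Return (winners, scores) for a pairwise preference matrix.

    pref[i][j] is assumed to be the probability that arm i beats arm j,
    with pref[i][j] + pref[j][i] == 1 (hypothetical input format).
    """
    k = len(pref)
    # Copeland score of arm i: how many other arms it beats with probability > 1/2.
    scores = [
        sum(1 for j in range(k) if j != i and pref[i][j] > 0.5)
        for i in range(k)
    ]
    best = max(scores)
    winners = [i for i, s in enumerate(scores) if s == best]
    return winners, scores


# Made-up 4-arm instance containing the cycle 0 > 1 > 2 > 0.
PREF = [
    [0.5, 0.6, 0.4, 0.7],
    [0.4, 0.5, 0.7, 0.6],
    [0.6, 0.3, 0.5, 0.6],
    [0.3, 0.4, 0.4, 0.5],
]

winners, scores = copeland_winners(PREF)
print(winners, scores)
```

In this made-up instance no arm beats all of the others, so there is no Condorcet winner, yet the set of Copeland winners is still well defined (here arms 0, 1, and 2 tie with a score of 2).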

Download paper here

Recommended citation: Komiyama, J., Honda, J., & Nakagawa, H. (2016). “Copeland Dueling Bandit Problem: Regret Lower Bound, Optimal Algorithm, and Computationally Efficient Algorithm.” In Proceedings of the 33rd International Conference on Machine Learning (ICML 2016), 1235-1244.
