Pacchiano, Dann, Gentile, Bartlett, 2020. Regret Bound Balancing and Elimination for Model Selection in Bandits and RL.