Schneckenreither. Average Reward Adjusted Discounted Reinforcement Learning: Near-Blackwell-optimal Policies for Real-world Applications. 2 Apr. 2020.