Bhatnagar and Abdulla. A Reinforcement Learning Based Algorithm for Finite Horizon Markov Decision Processes. IEEE, 2006, doi:10.1109/cdc.2006.377190.