Using the theory of regular functions to formally prove the ε-optimality of Discretized pursuit learning algorithms
Learning Automata (LA) can be reckoned to be the founding algorithms on which the field of Reinforcement Learning has been built. Among the families of LA, Estimator Algorithms (EAs) are certainly the fastest, and of these, the family of Pursuit Algorithms (PAs) are the pioneering work. It has recently been reported that the previous proofs for ε-optimality for all the reported algorithms in the family of PAs have been flawed. We applaud the researchers who discovered this flaw, and who further proceeded to rectify the proof for the Continuous Pursuit Algorithm (CPA). The latter proof, though requires the learning parameter to be continuously changing, is, to the best of our knowledge, the current best and only way to prove CPA's ε-optimality. However, for all the algorithms with absorbing states, for example, the Absorbing Continuous Pursuit Algorithm (ACPA) and the Discretized Pursuit Algorithm (DPA), the constrain of a continuously changing learning parameter can be removed. In this paper, we provide a new method to prove the ε-optimality of the Discretized Pursuit Algorithm which does not require this constraint. We believe that our proof is both unique and pioneering. It can also form the basis for formally showing the ε-optimality of the other EAs with absorbing states.
|Keywords||Discretized Pursuit Algorithm, Pursuit Algorithms, ε-optimality|
|Series||Lecture Notes in Computer Science|
Zhang, X. (Xuan), Oommen, J, Granmo, O.-C. (Ole-Christoffer), & Jiao, L. (Lei). (2014). Using the theory of regular functions to formally prove the ε-optimality of Discretized pursuit learning algorithms. In Lecture Notes in Computer Science. doi:10.1007/978-3-319-07455-9_40