The fastest learning automata (LA) algorithms currently available fall in the family of estimator algorithms introduced by Thathachar and Sastry 24. The pioneering work of these authors was the pursuit algorithm, which pursues only the current estimated optimal action. If this action is not the one with the minimum penalty probability, this algorithm pursues a wrong action. In this paper, we argue that a pursuit scheme that generalizes the traditional pursuit algorithm by pursuing all the actions with higher reward estimates than the chosen action, minimizes the probability of pursuing a wrong action, and is a faster converging scheme. To attest this, we present two new generalized pursuit algorithms (GPAs) and also present a quantitative comparison of their performanee against the existing pursuit algorithms. Empirically, the algorithms proposed here are among the fastest reported LA to date.

Estimator algorithms, Learning automata (LA), Pursuit algorithms
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
School of Computer Science

Agache, M. (Mariana), & Oommen, J. (2002). Generalized pursuit learning schemes: New families of continuous and discretized learning automata. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 32(6), 738–749. doi:10.1109/TSMCB.2002.1049608