The game of multiple pursuers and evaders may be represented as a Markov game. Under this model, each player may be interpreted as a decentralized unit that must act independently to complete its task. This is a distributed multiagent decision problem, and several solutions have already been proposed; however, most of them require some form of central coordination. In this paper, we model each player as a learning automaton and let the players evolve and adapt to solve the difficult problem at hand. We also show that, under the proposed learning process, the players' policies converge to an equilibrium point. Simulations of scenarios with multiple pursuers and evaders are presented to demonstrate the feasibility of the approach.
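
The abstract does not specify the reinforcement scheme used in the paper, so the sketch below only illustrates the general idea of a learning automaton in a decentralized game: it assumes the standard linear reward-inaction (L_R-I) update, in which each player keeps a probability vector over its actions and reinforces the action it just played in proportion to its own reward, with no central coordination. The class name, parameters, and the stand-in reward loop are illustrative, not the authors' implementation.

```python
import numpy as np

class LearningAutomaton:
    """Minimal linear reward-inaction (L_R-I) learning automaton (illustrative)."""

    def __init__(self, n_actions, learning_rate=0.05, rng=None):
        self.p = np.full(n_actions, 1.0 / n_actions)  # action probabilities
        self.lam = learning_rate                      # step size (lambda)
        self.rng = rng or np.random.default_rng()

    def choose_action(self):
        # Sample an action according to the current probability vector.
        return self.rng.choice(len(self.p), p=self.p)

    def update(self, action, beta):
        # beta in [0, 1] is the normalized reward from the environment (the game).
        # Reward-inaction: probabilities move only in proportion to the reward.
        self.p[action] += self.lam * beta * (1.0 - self.p[action])
        others = np.arange(len(self.p)) != action
        self.p[others] -= self.lam * beta * self.p[others]
        self.p /= self.p.sum()  # guard against round-off drift


# Hypothetical decentralized loop: each pursuer/evader owns one automaton
# and updates it from its own reward only.
if __name__ == "__main__":
    players = [LearningAutomaton(n_actions=4) for _ in range(3)]
    for step in range(1000):
        actions = [a.choose_action() for a in players]
        rewards = [np.random.rand() for _ in actions]  # stand-in for the game's payoffs
        for automaton, act, beta in zip(players, actions, rewards):
            automaton.update(act, beta)
```

Under reward-inaction updates of this kind, each player's action probabilities change only when it is rewarded, which is the kind of decentralized adaptation the abstract refers to when it states that the players' policies converge to an equilibrium point.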

Additional Metadata
Keywords: intelligent systems, learning, learning automata, pursuer-evader games, reinforcement learning
Persistent URL: dx.doi.org/10.1177/1059712314526261
Journal: Adaptive Behavior
Citation: Givigi Jr., S.N. (Sidney N.), & Schwartz, H.M. (2014). Decentralized strategy selection with learning automata for multiple pursuer-evader games. Adaptive Behavior, 22(4), 221–234. doi:10.1177/1059712314526261