Decentralized learning in multiple pursuer-evader Markov games
We represent the multiple pursuer-evader game as a Markov game, with each player acting as a decentralized unit that must work independently to complete a task. Most proposed solutions to this distributed multiagent decision problem require some form of central coordination. In this paper, we model each player as a learning automaton (LA) and let the players evolve and adapt to solve the problem at hand. We also show that, under the proposed learning process, the players' policies converge to an equilibrium point. Simulations of scenarios with multiple pursuers and evaders are presented to show the feasibility of the approach.
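To make the idea of a player as a learning automaton concrete, here is a minimal sketch of a single automaton. The abstract does not say which update rule the paper uses, so the sketch assumes the standard linear reward-inaction scheme. The class name `LearningAutomaton`, the `learning_rate` parameter, and the toy reward probabilities are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import random


class LearningAutomaton:
    """One player's action-selection unit (assumed linear reward-inaction rule)."""

    def __init__(self, n_actions, learning_rate=0.1, rng=None):
        # Start from a uniform probability vector over the actions.
        self.probs = [1.0 / n_actions] * n_actions
        self.learning_rate = learning_rate
        self.rng = rng or random.Random()

    def choose(self):
        # Sample an action index according to the current probabilities.
        return self.rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def update(self, action, reward):
        # reward is expected in [0, 1]; a reward of 0 leaves the
        # probabilities unchanged (the "inaction" part of the rule).
        step = self.learning_rate * reward
        for i in range(len(self.probs)):
            if i == action:
                self.probs[i] += step * (1.0 - self.probs[i])
            else:
                self.probs[i] -= step * self.probs[i]


if __name__ == "__main__":
    # Toy environment (an assumption, not the pursuit game itself):
    # action 2 is rewarded more often than the other two actions.
    rng = random.Random(0)
    automaton = LearningAutomaton(3, learning_rate=0.05, rng=rng)
    success_prob = [0.2, 0.2, 0.9]
    for _ in range(2000):
        a = automaton.choose()
        r = 1.0 if rng.random() < success_prob[a] else 0.0
        automaton.update(a, r)
    print(automaton.probs)
```

Each automaton uses only its own chosen action and the reward it receives, which is what makes this kind of scheme a natural fit for decentralized play: no player needs to observe the other players' probability vectors.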
|Conference||2011 19th Mediterranean Conference on Control and Automation, MED 2011|
|Organisation||Department of Systems and Computer Engineering|
Givigi, S. (Sidney), & Schwartz, H.M. (2011). Decentralized learning in multiple pursuer-evader Markov games. In 2011 19th Mediterranean Conference on Control and Automation, MED 2011 (pp. 1379–1385). doi:10.1109/MED.2011.5983135