2013-12-01
Exponential moving average Q-learning algorithm
Publication
Presented at the 2013 4th IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2013 (April 2013), Singapore
A multi-agent policy iteration learning algorithm is proposed in this work. The Exponential Moving Average (EMA) mechanism is used to update the policy for a Q-learning agent so that it converges to an optimal policy against the policies of the other agents. The proposed EMA Q-learning algorithm is examined on a variety of matrix and stochastic games. Simulation results show that the proposed algorithm converges in a wider variety of situations than state-of-the-art multi-agent reinforcement learning (MARL) algorithms.
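The abstract describes the core idea: a standard Q-learning value update combined with an exponential moving average step that nudges the agent's policy toward its current greedy action. The following is a minimal sketch of that idea in tabular form; the function name, the environment interface, and the constants (alpha, gamma, eta) are illustrative assumptions, not the exact formulation or parameter values from the paper.

```python
import numpy as np

# Sketch of an EMA-style policy update on top of tabular Q-learning.
# Names and constants are assumptions for illustration, not the paper's exact scheme.

def ema_q_learning_step(Q, pi, s, a, r, s_next, alpha=0.1, gamma=0.95, eta=0.05):
    """One transition: Q-learning value update, then an exponential moving
    average step pulling the policy for state s toward the greedy action."""
    # Standard Q-learning temporal-difference update
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])

    # Target policy: indicator vector on the current greedy action in state s
    greedy = np.zeros_like(pi[s])
    greedy[np.argmax(Q[s])] = 1.0

    # Exponential moving average update of the policy for state s
    pi[s] = (1.0 - eta) * pi[s] + eta * greedy
    pi[s] /= pi[s].sum()  # keep pi[s] a valid probability distribution
    return Q, pi
```

In this sketch, `Q` and `pi` are arrays of shape (num_states, num_actions), with each row of `pi` a probability distribution used for action selection; the EMA rate `eta` controls how quickly the policy tracks the greedy policy implied by the current Q-values.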
| Additional Metadata | |
|---|---|
| DOI | doi.org/10.1109/ADPRL.2013.6614986 |
| Conference | 2013 4th IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2013 |
| Organisation | Department of Systems and Computer Engineering |
| Citation | Awheda, M.D. (Mostafa D.), & Schwartz, H.M. (2013). Exponential moving average Q-learning algorithm. In IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL (pp. 31–38). doi:10.1109/ADPRL.2013.6614986 |