Highlights of a 2008 survey paper on multi-agent RL
Posts by Category
The paper is available here: Becker et al. 2004
Making experience replay work in non-stationary multi-agent settings
Deep Q-Learning with an LSTM, for partially observable MDPs
A fully differentiable planning module that can learn to plan end to end using backpropagation - NIPS 2016 Best Paper
A model-free actor-critic algorithm for continuous control that brings experience replay and target networks from DQN into the actor-critic approach to p...
Learning policies that adapt to the opponent’s strategy by feeding information about the opponent, along with the state, as input to a deep Q-network