Variation of Asynchronous RL in Keras (Theano backend) + OpenAI gym [1-step Q-learning, n-step Q-learning, A3C]
This is a simple variation of asynchronous reinforcement learning, written in Python with Keras (Theano backend). Instead of many threads training at the same time, many processes generate experience for a single agent to learn from.
There are many processes (tested with 4; the Q-learning methods should work better with more) that generate experience and send it to a shared queue. The queue is limited in length (tested with 256) to stop individual processes from generating too much experience with outdated weights. The learning process draws samples from the queue in batches and trains on them. In A3C, the network weights are exchanged with the generating processes relatively often to keep them up to date.
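The layout can be pictured with a short sketch. The sketch below only illustrates the plumbing described above (a bounded `multiprocessing.Queue`, several generating processes, one learner); the environment, transitions and training call are placeholders, not the code from this repository.

```python
import multiprocessing as mp
import random

QUEUE_LIMIT = 256   # bounded queue: generators block instead of piling up experience from stale weights
BATCH_SIZE = 32
N_GENERATORS = 4    # tested with 4 generating processes


def generator(experience_queue):
    """Stand-in for a generating process: plays the environment and emits transitions."""
    while True:
        # In the real code this would be a (state, action, reward, next_state) tuple from the emulator.
        transition = (random.random(), random.randint(0, 3), random.random())
        experience_queue.put(transition)  # blocks when the queue is full


def learner(experience_queue, n_updates=10):
    """Stand-in for the single learning process: draws batches from the queue and trains on them."""
    for step in range(n_updates):
        batch = [experience_queue.get() for _ in range(BATCH_SIZE)]
        # model.train_on_batch(...) would go here; in A3C the fresh weights are
        # also sent back to the generating processes relatively often.
        print("update %d on a batch of %d transitions" % (step, len(batch)))


if __name__ == '__main__':
    queue = mp.Queue(maxsize=QUEUE_LIMIT)
    workers = [mp.Process(target=generator, args=(queue,), daemon=True)
               for _ in range(N_GENERATORS)]
    for w in workers:
        w.start()
    learner(queue)
```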
- Python 3.4/Python 3.5
- Keras
- Theano (TensorFlow would probably work too)
- OpenAI Gym (atari-py)
pip3 install scikit-image h5py scipy
Because I'm a newbie in Reinforcement Learning and Deep Learning, feedback is very welcome :)
- Weights were learned with the Theano backend, so loading them under TensorFlow may be a little problematic because the two backends handle convolutional kernels differently (a conversion sketch follows this list).
- If training halts after a few seconds, don't worry; it's probably because Keras lazily compiles the Theano functions, and training should resume quickly.
- Each process sets its own compilation directory for Theano, so compilation can take a very long time at the beginning (this can be disabled with `--th_comp_fix=False`); a sketch of the compile-directory trick also follows this list.
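On the weight-conversion note above, a rough illustration of what "problematic" means in practice: the Theano backend performs a true convolution (kernels are flipped) while the TensorFlow backend performs cross-correlation, so convolution kernels trained under Theano have to be spatially flipped before they behave the same under TensorFlow. The function below is my own sketch, assuming Keras 1.x-style `'th'` dim ordering with kernels of shape `(n_filters, n_channels, rows, cols)`; it is not code shipped with this repository, and newer Keras versions provide their own conversion utilities.

```python
import numpy as np

def convert_th_conv_kernel_to_tf(kernel):
    """Flip the spatial axes of a conv kernel trained with the Theano backend.

    Sketch only: assumes 'th' dim ordering, i.e. kernel.shape == (filters, channels, rows, cols).
    """
    return np.ascontiguousarray(kernel[:, :, ::-1, ::-1])
```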
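The per-process compilation directory mentioned in the last note is a standard Theano trick: set `THEANO_FLAGS` with a private `base_compiledir` before Theano is imported, so the worker processes do not contend for the same compile cache (each process therefore compiles its own copy of the functions, which is why the first run is slow). The sketch below shows the general technique, not the repository's exact implementation behind `--th_comp_fix`.

```python
import os

def set_private_theano_compiledir(process_id):
    """Give this process its own Theano compile cache; must run before `import theano`."""
    compiledir = os.path.join(os.path.expanduser('~'), '.theano_proc_%d' % process_id)
    existing = os.environ.get('THEANO_FLAGS', '')
    flag = 'base_compiledir=%s' % compiledir
    os.environ['THEANO_FLAGS'] = existing + ',' + flag if existing else flag
```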
- Asynchronous RL in Tensorflow + Keras + OpenAI's Gym
- Replicating "Asynchronous Methods for Deep Reinforcement Learning"
- David Silver's "Deep Reinforcement Learning" lecture
- Nervana's Demystifying Deep Reinforcement Learning blog post
- Asynchronous Methods for Deep Reinforcement Learning
- Playing Atari with Deep Reinforcement Learning