
# Project report

## Learning algorithm

The learning algorithm is vanilla Deep Q-Learning, as described in the original DQN paper. Since the input is a state vector rather than an image, the convolutional neural network is replaced with a fully connected deep neural network (a sketch follows the list below). The network has the following layers:

- Fully connected layer - input: 37 (state size), output: 128
- Fully connected layer - input: 128, output: 64
- Fully connected layer - input: 64, output: (action size)
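
A minimal PyTorch sketch of this network, assuming ReLU activations between the hidden layers (the report does not state the activation); the class name and the action-size default are illustrative, not taken from the repository:

```python
import torch.nn as nn
import torch.nn.functional as F

class QNetwork(nn.Module):
    """Maps a 37-dimensional state vector to one Q-value per action."""

    def __init__(self, state_size=37, action_size=4):  # action_size=4 is an assumption
        super().__init__()
        self.fc1 = nn.Linear(state_size, 128)  # 37 -> 128
        self.fc2 = nn.Linear(128, 64)          # 128 -> 64
        self.fc3 = nn.Linear(64, action_size)  # 64 -> action size

    def forward(self, state):
        x = F.relu(self.fc1(state))  # assumed activation
        x = F.relu(self.fc2(x))
        return self.fc3(x)           # raw Q-values, no output activation
```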

Parameters used in the DQN algorithm (the resulting exploration schedule is sketched after the list):

- Maximum steps per episode: 1000
- Starting epsilon: 1.0
- Ending epsilon: 0.01
- Epsilon decay rate: 0.95
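
These parameters imply a per-episode decay of the exploration rate, floored at the ending value. A minimal sketch of that schedule, assuming the common pattern of multiplicative decay with a floor (the function name is illustrative):

```python
def epsilon_schedule(n_episodes, eps_start=1.0, eps_end=0.01, eps_decay=0.95):
    """Yield the exploration rate for each episode:
    multiplicative decay, floored at eps_end."""
    eps = eps_start
    for _ in range(n_episodes):
        yield eps
        eps = max(eps_end, eps_decay * eps)

# With a decay rate of 0.95, epsilon hits the 0.01 floor after roughly
# 90 episodes (0.95 ** 90 ≈ 0.0099), so later episodes are almost greedy.
for episode, eps in enumerate(epsilon_schedule(5), start=1):
    print(f"Episode {episode}: epsilon = {eps:.3f}")
```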

## Results

![results](images/results.png)

    Episode 100    Average Score: 1.66
    Episode 200    Average Score: 8.17
    Episode 300    Average Score: 10.81
    Episode 364    Average Score: 13.00
    Environment solved in 264 episodes!    Average Score: 13.00

The environment counts as solved once the average score over the last 100 episodes reaches 13.0, which is why hitting that average at episode 364 corresponds to solving in 264 episodes.

### Trained agent

![trained](images/trained.gif)

## Future work

1. Extensive hyperparameter optimization
2. Double Deep Q-Networks
3. Prioritized Experience Replay
4. Dueling Deep Q-Networks
5. Rainbow (combining several DQN extensions, as in the Rainbow paper)
6. Learning from pixels