Introduction to Machine Learning Project 6

Reinforcement learning: Value Iteration, SARSA, and Q-learning on the racetrack problem

Train a policy

To train a Q-learning policy:

python exp_Q.py

To train a SARSA policy:

python exp_SARSA.py

To train a value iteration policy:

python exp_VI.py

Training produces an arrays directory, which contains the policy at various points during training.
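For orientation, the textbook Q-learning and SARSA updates differ only in the value they bootstrap from. The sketch below is illustrative and is not taken from exp_Q.py or exp_SARSA.py. The nested-list table layout, the function names, and the default alpha and gamma values are all assumptions.

```python
from typing import List

# Hypothetical tabular layout: Q[state][action] -> estimated return.
QTable = List[List[float]]


def q_learning_update(Q: QTable, s: int, a: int, r: float, s_next: int,
                      alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Off-policy update: bootstrap from the best action in the next state."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q: QTable, s: int, a: int, r: float, s_next: int, a_next: int,
                 alpha: float = 0.1, gamma: float = 0.9) -> None:
    """On-policy update: bootstrap from the action actually taken next."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])
```

Both functions modify the table in place. When the next action is the greedy one, the two updates coincide.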

Race with a policy

Once a policy is trained, it can be used in a race.

python race.py

This produces a race like so:

Results

Plot of learning curves for Q-learning on different tracks and with different crash behavior. With a "normal crash", a car that hits a wall returns to the last valid track square. With a "bad crash", it returns to the starting line.
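The two crash behaviors can be sketched as a single function. This is a minimal illustration, not the code in src. The function name, the (row, column) position encoding, picking the first starting-line square, and resetting the velocity to zero are all assumptions.

```python
from typing import List, Tuple

Position = Tuple[int, int]   # hypothetical (row, column) encoding
Velocity = Tuple[int, int]


def resolve_crash(last_valid: Position, start_line: List[Position],
                  bad_crash: bool) -> Tuple[Position, Velocity]:
    """Return the car's position and velocity after it hits a wall."""
    if bad_crash:
        # "Bad crash": send the car back to the starting line.
        # Assumption: use the first starting square; a random one would also work.
        position = start_line[0]
    else:
        # "Normal crash": put the car on the last valid track square.
        position = last_valid
    # Assumption: the car stops after either kind of crash.
    return position, (0, 0)
```

For example, `resolve_crash((3, 4), [(0, 1), (0, 2)], bad_crash=True)` places the car at `(0, 1)`, while `bad_crash=False` leaves it at `(3, 4)`.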

Example race:
