A reinforcement learning agent (AI) that solves randomized mazes.

Seth1Briney/maze-runner

Author: Seth Briney

Acknowledgments: Special thanks to Richard S. Sutton, Andrew G. Barto, Adam White, Martha White, and Richard Weiss

Please feel free to email any questions about this program or reinforcement learning in general to [email protected].

This is a slight revision of a maze-running reinforcement learning AI I wrote while taking some undergrad classes at The Evergreen State College, in preparation for my Master of Computer Science degree.

It produces a graphical output of the Q-values over the maze grid. Here's an example with the default parameters: [image]

Assignment: Write a policy-iteration-type algorithm using NumPy arrays.

The following program utilizes tabular Q-learning to train a maze-runner reinforcement learning control agent.
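As a rough illustration of what a tabular Q-learning update looks like with NumPy arrays, here is a minimal sketch. The table shape `(rows, cols, n_actions)`, the function name `q_learning_update`, and the default `alpha` and `gamma` values are assumptions made for this example, not details taken from the program itself.

```python
import numpy as np

def q_learning_update(q, state, action, reward, next_state,
                      alpha=0.1, gamma=0.9, terminal=False):
    """Apply one tabular Q-learning step in place.

    q is assumed to have shape (rows, cols, n_actions); this layout is
    hypothetical and may differ from the actual program.
    """
    y, x = state
    ny, nx = next_state
    # Bootstrapped target: no future value is added past a terminal step.
    if terminal:
        target = reward
    else:
        target = reward + gamma * np.max(q[ny, nx])
    # Move the current estimate a fraction alpha toward the target.
    q[y, x, action] += alpha * (target - q[y, x, action])
    return q
```

The update is off-policy: the target uses the maximum over next-state action values regardless of which action the agent actually takes next.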

prob_prop_to_value_episode chooses each move with probability proportional to the running value estimate of the state it would transition to.
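One way to sample a move with probability proportional to value is sketched below. The helper name `choose_prop_to_value` is hypothetical, and the shift that makes all weights positive is an assumption; the actual program may treat negative or zero values differently.

```python
import numpy as np

def choose_prop_to_value(values, rng):
    """Pick an index with probability proportional to (shifted) values."""
    values = np.asarray(values, dtype=float)
    # Assumption: shift so every weight is strictly positive, which keeps
    # the probabilities valid even when some value estimates are negative.
    weights = values - values.min() + 1e-6
    probs = weights / weights.sum()
    index = int(rng.choice(len(values), p=probs))
    return index, probs

rng = np.random.default_rng(0)
index, probs = choose_prop_to_value([1.0, 3.0], rng)
```

Compared with a greedy choice, this keeps exploring lower-valued neighbors, just less often than higher-valued ones.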

epsilon_greedy_episode is epsilon-greedy as its name describes, except for the following:

Both prob_prop_to_value_episode and epsilon_greedy_episode apply an offset that discourages backtracking. The offset falls off as exactly 1/(steps-last_at_states[y,x]), where y,x is the state under consideration.
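The backtracking offset can be sketched as follows. Only the expression `1/(steps-last_at_states[y,x])` comes from the description above; subtracting it from the candidate's value, initializing unvisited states to `-inf` (so their offset is zero), and the function names are assumptions made for illustration.

```python
import numpy as np

def backtrack_offset(steps, last_at_states, y, x):
    """Offset for state (y, x): 1 / (steps since it was last visited)."""
    # Assumption: unvisited states hold -inf, so the offset becomes 0.
    return 1.0 / (steps - last_at_states[y, x])

def epsilon_greedy_with_offset(values, candidates, steps,
                               last_at_states, epsilon, rng):
    """Return the index of the chosen candidate state."""
    if rng.random() < epsilon:
        # Explore: uniform choice among the candidates.
        return int(rng.integers(len(candidates)))
    # Exploit: assumed here to subtract the offset before taking the argmax.
    adjusted = [v - backtrack_offset(steps, last_at_states, y, x)
                for v, (y, x) in zip(values, candidates)]
    return int(np.argmax(adjusted))
```

A state visited one step ago gets the full offset of 1, and the offset shrinks toward 0 as more steps pass, so recently visited states are the least attractive.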

The idea here is that an agent (represented by 9) will move through this array (board) until it reaches the terminal state -1.

A state is a pair of indices n,m that represents a position on the board.

Rules:

The agent can only occupy states with value 0. The agent can jump to any 0 entry within the 3x3 sub-array centered at the current state. The terminal state is -1, and an episode ends once the terminal state is contained in the 3x3 sub-array centered at the current state.

I will use "the surroundings" to refer to "the 3x3 sub-array centered at the state".
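The rules above can be sketched with NumPy slicing. The values 0, 9, and -1 come from the description; encoding walls as 1, clipping the surroundings at the board edges, and the helper names are assumptions made for this example.

```python
import numpy as np

def surroundings(board, y, x):
    """3x3 sub-array centered at (y, x), clipped at the board edges."""
    return board[max(y - 1, 0):y + 2, max(x - 1, 0):x + 2]

def legal_moves(board, y, x):
    """All 0 entries in the surroundings, excluding the current state."""
    moves = []
    for ny in range(max(y - 1, 0), min(y + 2, board.shape[0])):
        for nx in range(max(x - 1, 0), min(x + 2, board.shape[1])):
            if (ny, nx) != (y, x) and board[ny, nx] == 0:
                moves.append((ny, nx))
    return moves

def episode_done(board, y, x):
    """True once the terminal state -1 lies within the surroundings."""
    return bool((surroundings(board, y, x) == -1).any())

# Hypothetical 3x3 board: 9 is the agent, 1 is assumed to mark a wall.
board = np.array([[9, 0, 1],
                  [1, 0, 1],
                  [1, 0, -1]])
print(legal_moves(board, 0, 0))   # [(0, 1), (1, 1)]
print(episode_done(board, 0, 0))  # False
print(episode_done(board, 1, 1))  # True
```

Because the episode ends as soon as -1 appears in the surroundings, the agent never has to step onto the terminal state itself.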
