-
-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Maze/AntMaze issues #156
Maze/AntMaze issues #156
Conversation
@alexdavey. We should be keeping previous versions of the environment such as in Gymnasium. You can create different file versions for each environment version |
@alexdavey I think the test is failing because the naming Actually, I would prefer if the test files have the same structure as the Gymnasium repo. Add two files |
The reward is now calculated before the goal is updated in compute_terminated(), as in point_maze. Important for e.g. sparse rewards.
Reset/goal distance threshold now scales with maze size. Prevents AntMaze resetting into a goal position due to added noise. Added test that now passes.
|
||
return obs_dict, info | ||
|
||
def step(self, action): | ||
ant_obs, _, _, _, info = self.ant_env.step(action) | ||
obs = self._get_obs(ant_obs) | ||
|
||
info["success"] = bool(np.linalg.norm(obs["achieved_goal"] - self.goal) <= 0.45) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
info
should be after truncated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the order doesn't matter. Correct me if I'm missing anything @Kallinteris-Andreas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not affect the behavior, but makes it more readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. @alexdavey if you can update it as per @Kallinteris-Andreas request that'll be great :)
gymnasium_robotics/envs/maze/maze.py
Outdated
# Generate another goal | ||
goal = self.generate_target_goal() | ||
# Add noise to goal position | ||
self.goal = self.add_xy_position_noise(goal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to test if the object I near the goal (0.45)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. @alexdavey can you add a while loop to keep generating a goal
until the distance to achieved_goal
is greater than the 0.45
threshold?
@rodrigodelazcano. I would have thought that the Edit: I have listed what seem like the most sensible version changes to me, below. Let me know if you are happy with these and I will update the PR accordingly:
|
My take on this is that the
Sounds good. My take on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the version changes. I also added some minor comments on @Kallinteris-Andreas review and other updates for the PR., Thank you @alexdavey
In AntMaze, the MazeEnv parent is initalised with Thanks to both of you for the reviews. |
You are totally right! Then I take my words back, bumping makes more sense |
* restored ant_maze.py from pre-PR state * restored maze.py from pre-PR state * add test for info['success'] = false on reset * add distance check when generating new goal in update_goal()
gymnasium_robotics/envs/maze/maze.py
Outdated
@@ -40,6 +40,9 @@ class Maze: | |||
The Maze class also presents a method to convert from cell indices to `(x,y)` coordinates in the MuJoCo simulation: | |||
- :meth:`cell_rowcol_to_xy` - Convert from `(i,j)` to `(x,y)` | |||
|
|||
### Version History |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexdavey Can you remove the docstrings for the versions in ant_maze.py
and maze.py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove the ones I added to maze.py
. I think ant_maze.py
has no changes currently from the existing version -- did you want me to remove them anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a standard we only keep docstrings in the latest file version. That'll be ant_maze_v4.py
and maze_v4.py
. So yes remove them for both maze,.py
and ant_maze.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, will do now
Thanks a lot @alexdavey . I'll merge this after the CI checks |
I just removed some final docstrings in |
Description
Fixes #155
refactored
compute_terminated
so it is now a pure function, addedupdate_goal
added info["success"] key to AntMaze
fixed sparse reward issue in AntMaze
fixed reset issue in Maze
added test for reset issue
bumped version to AntMaze-v4
Type of change
Checklist:
pre-commit
checks withpre-commit run --all-files
(seeCONTRIBUTING.md
instructions to set it up)