
Adroit hand dense reward fixes #220

Open: 7oponaut wants to merge 1 commit into main


7oponaut commented Jul 4, 2024

  • fix hand-to-object dense reward component sign in adroit hand door, hammer, and relocate scenarios

Description

This PR fixes the hand-to-object dense reward component signs in the adroit hand door, hammer, and relocate scenarios.

With the sign flipped, the hand-to-object term rewards distance instead of penalizing it, so the agent learns to maximize the distance between the hand and the object of interest.
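A minimal sketch of the sign issue, for illustration only. This is not the Gymnasium-Robotics code: the function name, the `weight` value, and the `buggy` flag are hypothetical, and the real environments combine this term with other reward components.

```python
import math


def hand_to_object_reward(hand_pos, obj_pos, weight=0.1, buggy=False):
    """Illustrative dense reward term based on hand-to-object distance.

    `weight` and `buggy` are hypothetical parameters for this sketch.
    """
    distance = math.dist(hand_pos, obj_pos)
    # Intended form: a negative sign penalizes distance, so the reward
    # increases as the hand approaches the object.
    # Buggy form: a positive sign rewards distance, so an agent that
    # maximizes reward moves the hand away from the object.
    sign = 1.0 if buggy else -1.0
    return sign * weight * distance


obj = (0.0, 0.0, 0.0)
near = (0.0, 0.0, 0.1)
far = (0.0, 0.0, 1.0)

# Intended sign: being near the object scores higher than being far.
print(hand_to_object_reward(near, obj) > hand_to_object_reward(far, obj))
# Flipped sign: being far from the object scores higher.
print(hand_to_object_reward(far, obj, buggy=True) > hand_to_object_reward(near, obj, buggy=True))
```

Under these assumptions, both comparisons hold, which matches the behavior described above: the flipped sign makes moving away from the object the reward-maximizing strategy.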

The bugs were introduced in 7b5aa90 with a refactor of the relevant code sections. I ran training runs before and after the fix to confirm that it works.

As far as I can tell the pen scenario is not affected, so no changes there.

I couldn't find the relevant CONTRIBUTING.md file, so I have not set up pre-commit or unit testing.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Member

@jjshoots jjshoots left a comment


Good catch. No clue how those sign changes got in, and not sure how they survived this long. LGTM!

@Kallinteris-Andreas
Collaborator

Kallinteris-Andreas commented Jul 8, 2024

@7oponaut We use the CONTRIBUTING.md from Gymnasium.

This change (a reward function adjustment) requires a new version to be made ("AdroitHand*-v2"). This can be done by creating new files with the fixes and registering them in the loop that begins with

for reward_type in ["sparse", "dense"]:

For reference, check how the versions are handled for other environments.
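A rough sketch of what generating the new version IDs inside such a loop could look like. Everything here is an assumption made for illustration: the helper name `build_adroit_v2_specs`, the "Sparse" suffix convention, and the `my_package...` entry points are hypothetical and do not reflect the project's actual registration code. Each resulting dict mirrors the `id`, `entry_point`, and `kwargs` arguments accepted by Gymnasium's `register` function.

```python
def build_adroit_v2_specs(tasks=("Door", "Hammer", "Relocate")):
    """Build hypothetical registration specs for bumped Adroit environments.

    The ID suffix scheme and entry-point paths are assumptions for this
    sketch, not the names used by Gymnasium-Robotics.
    """
    specs = []
    for reward_type in ["sparse", "dense"]:
        # Assumed convention: the sparse variant carries a "Sparse" suffix.
        suffix = "Sparse" if reward_type == "sparse" else ""
        for task in tasks:
            specs.append(
                {
                    "id": f"AdroitHand{task}{suffix}-v2",
                    # Hypothetical module path pointing at a new file with the fix.
                    "entry_point": f"my_package.adroit_{task.lower()}_v2:AdroitHand{task}Env",
                    "kwargs": {"reward_type": reward_type},
                }
            )
    return specs


for spec in build_adroit_v2_specs():
    print(spec["id"], spec["kwargs"])
```

Keeping the v1 registrations untouched and adding v2 entries alongside them means existing results stay reproducible, while new experiments pick up the corrected reward.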

@7oponaut
Author

7oponaut commented Jul 8, 2024

@Kallinteris-Andreas Taking a closer look, 7b5aa90 changed reward behavior in the sparse reward setting in addition to the accidental change that causes issues. It also added termination conditions, which affect both the dense and sparse cases. The version numbers weren't bumped to "AdroitHand*-v2", though.

Which should be considered v1: 7b5aa90 or the version before it?

I will take a look at CONTRIBUTING.md from gymnasium, thank you.

@Kallinteris-Andreas
Collaborator

@7oponaut You are correct: the reward function was changed then without a version bump, which was a mistake. This should at least be documented (we will likely remove those environment versions later because of that).
For reference, the Adroit*-v1 env was added in gymnasium_robotics==1.2.0 and the reward function was changed in gymnasium_robotics==1.2.1.

Regardless, the change proposed here requires a version bump.


3 participants