Adroit hand dense reward fixes #220

7oponaut · 2024-07-04T20:47:46Z

fix hand-to-object dense reward component sign in adroit hand door, hammer, and relocate scenarios

Description

This PR fixes the hand-to-object dense reward component signs in the adroit hand door, hammer, and relocate scenarios.

The bugs cause the agent to maximize distance between the hand and the object of interest.
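To illustrate why the sign matters, here is a minimal sketch of a distance-based shaping term. It is not the repository's code: the function name `hand_to_object_term`, the `scale` value of 0.1, and the `buggy` flag are all assumptions made for illustration. With the negative sign the term grows as the hand approaches the object; with the flipped sign it grows as the hand moves away.

```python
import math


def hand_to_object_term(hand_pos, obj_pos, scale=0.1, buggy=False):
    """Illustrative dense reward component based on hand-to-object distance.

    `scale` and `buggy` are hypothetical parameters for this sketch only.
    """
    dist = math.dist(hand_pos, obj_pos)
    # Intended behavior: penalize distance (negative sign).
    # Buggy behavior: reward distance (positive sign).
    sign = 1.0 if buggy else -1.0
    return sign * scale * dist


obj = (0.0, 0.0, 0.0)
near = (0.1, 0.0, 0.0)
far = (1.0, 0.0, 0.0)

fixed_near = hand_to_object_term(near, obj)
fixed_far = hand_to_object_term(far, obj)
buggy_near = hand_to_object_term(near, obj, buggy=True)
buggy_far = hand_to_object_term(far, obj, buggy=True)

print("fixed:", fixed_near, fixed_far)
print("buggy:", buggy_near, buggy_far)
```

Under the fixed sign the near position scores higher than the far one, so a reward-maximizing agent is pushed toward the object. Under the buggy sign the ordering is reversed.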

The bugs were introduced in 7b5aa90 during a refactor of the relevant code sections. I ran training before and after the fix to confirm that it works.

As far as I can tell the pen scenario is not affected, so no changes there.

I couldn't find the relevant CONTRIBUTING.md file needed to set up pre-commit and unit testing.

Type of change

Bug fix (non-breaking change which fixes an issue)

Checklist:

I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes


jjshoots

Good catch. No clue how those sign changes got in, and not sure how they survived this long. LGTM!

Kallinteris-Andreas · 2024-07-08T08:40:59Z

@7oponaut We use the CONTRIBUTING.md from Gymnasium.

This change (a reward function adjustment) requires a new environment version ("AdroitHand*-v2"). This can be done by creating new files with the fixes and registering them in

Gymnasium-Robotics/gymnasium_robotics/__init__.py

Line 1200 in 0a213bb

for reward_type in ["sparse", "dense"]:

For reference, check how versions are handled for other environments.
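As a rough sketch of what such a registration loop could look like, the helper below registers a v2 id for each task and reward type. Everything specific here is an assumption: the helper name `register_adroit_v2`, the `*_v2` module names in the entry points, and the "Sparse" id suffix are hypothetical and should be checked against how the existing versions are registered. The register function is passed in as an argument so the sketch runs without Gymnasium installed.

```python
def register_adroit_v2(register_fn):
    """Register hypothetical v2 Adroit variants via `register_fn`; return the ids."""
    # Hypothetical module:class targets for the new files containing the fixes.
    tasks = {
        "Door": "adroit_door_v2:AdroitHandDoorEnv",
        "Hammer": "adroit_hammer_v2:AdroitHandHammerEnv",
        "Relocate": "adroit_relocate_v2:AdroitHandRelocateEnv",
    }
    ids = []
    for task, target in tasks.items():
        for reward_type in ["sparse", "dense"]:
            # Assumed naming convention: sparse variants carry a "Sparse" suffix.
            suffix = "Sparse" if reward_type == "sparse" else ""
            env_id = f"AdroitHand{task}{suffix}-v2"
            register_fn(
                id=env_id,
                entry_point=f"gymnasium_robotics.envs.adroit_hand.{target}",
                kwargs={"reward_type": reward_type},
            )
            ids.append(env_id)
    return ids


# Stand-in for a real register function: record each spec instead of registering.
calls = []
ids = register_adroit_v2(lambda **spec: calls.append(spec))
print(ids)
```

With Gymnasium installed, `gymnasium.register` could be passed as `register_fn`, since it accepts the `id`, `entry_point`, and `kwargs` keywords used here.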

7oponaut · 2024-07-08T12:28:10Z

@Kallinteris-Andreas Taking a closer look, 7b5aa90 changed the reward behavior in the sparse reward setting, in addition to the accidental sign change that causes the issues here. It also added termination conditions, which affects both the dense and sparse cases. The version numbers weren't bumped to "AdroitHand*-v2", though.

Which is to be considered v1, 7b5aa90 or the version before that?

I will take a look at CONTRIBUTING.md from gymnasium, thank you.

Kallinteris-Andreas · 2024-07-08T14:54:27Z

@7oponaut You are correct: the reward function was changed at that point without a version bump, which was a mistake. This should at least be documented (we will likely remove those environment versions later because of that).
For reference, the Adroit*-v1 envs were added in gymnasium_robotics==1.2.0 and the reward function was changed in gymnasium_robotics==1.2.1.

Regardless, the change proposed here requires a version bump.
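The version history described above can be sketched as a small lookup, which may help when documenting which reward behavior a given install exposes under the v1 ids. The function name `adroit_v1_reward_variant` and its return labels are hypothetical, and the mapping only encodes the two releases named in this thread; later releases may remove or replace these environments.

```python
import re


def adroit_v1_reward_variant(version):
    """Classify which Adroit*-v1 reward behavior a gymnasium_robotics version has.

    Only encodes what this thread states: v1 added in 1.2.0, reward changed in 1.2.1.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parsed = tuple(int(part) for part in match.groups())
    if parsed < (1, 2, 0):
        return "unavailable"  # Adroit*-v1 not yet added
    if parsed == (1, 2, 0):
        return "original"  # reward function as first released
    return "changed"  # reward function changed without a version bump


for v in ["1.1.0", "1.2.0", "1.2.1"]:
    print(v, adroit_v1_reward_variant(v))
```

The installed version string could be obtained with `importlib.metadata.version("gymnasium-robotics")`, assuming that is the distribution name in the environment.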

Adroit hand dense reward fixes

4bcb801

* fix hand-to-object dense reward component sign in adroit hand door, hammer, and relocate scenarios

Kallinteris-Andreas requested a review from jjshoots July 6, 2024 07:40

jjshoots approved these changes Jul 6, 2024
