The action-mapping in DeepSea-bsuite does not behave like the original DeepSea environment #77

Pascal314 · 2024-05-30T15:37:46Z

In BSuite, DeepSea generates a random action mapping and keeps this action mapping fixed across resets. The main purpose of a random action mapping is to make sure that a DQN agent cannot trivially solve the environment just by having a bias towards the action "right".

Currently, Gymnax's DeepSea-bsuite implementation either:

- Uses a deterministic action mapping if deterministic is True.
- Randomly generates an action mapping for every reset if both sample_action_map is True and deterministic is False.
- Uses a default action map, which is set to be a deterministic one, when sample_action_map is False and deterministic is False.
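
To make the three cases above easier to compare, here is a minimal plain-Python sketch of the branching as I understand it. It is not Gymnax source code: the helper names (`default_action_map`, `random_action_map`, `action_map_on_reset`) are hypothetical, and the all-ones default map is an assumption standing in for whatever deterministic map Gymnax actually uses.

```python
import random


def default_action_map(size):
    # Assumed deterministic default: raw action 1 means "right" in every cell.
    return [[1] * size for _ in range(size)]


def random_action_map(size, rng):
    # One independent fair coin flip per grid cell.
    return [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]


def action_map_on_reset(size, deterministic, sample_action_map, rng):
    """Return the action map a reset would use under the described behaviour."""
    if deterministic:
        # Case 1: deterministic=True always yields the deterministic map.
        return default_action_map(size)
    if sample_action_map:
        # Case 2: a fresh random map is drawn on every reset.
        return random_action_map(size, rng)
    # Case 3: fall back to the (deterministic) default map.
    return default_action_map(size)
```

Note that no combination of the two flags yields a random map that stays the same from one reset to the next.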

This poses a few problems:

1. It is not possible to use a random mapping without making the transitions stochastic.
2. Getting the default behaviour of BSuite, i.e. a fixed random mapping, requires workarounds such as generating the mapping by hand and setting env.action_mapping, or resetting the environment with a fixed key, which is not ideal for general agent-environment loops.

I think problem 1 is just a bug: the deterministic environment parameter should probably just distinguish between BSuite's "DeepSea" and "DeepSea Stochastic" environments.

Problem 2 could perhaps be fixed by changing the default env.action_mapping, or adding the action_mapping to env_state.
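
As a rough illustration of the second option (carrying the mapping in the state), here is a plain-Python sketch. It assumes the mapping is sampled once at construction and copied unchanged into every state. `EnvState`, `FixedMappingDeepSea` and the `seed` argument are hypothetical names, not the Gymnax API, and rewards and stochastic transitions are left out for brevity.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvState:
    row: int
    column: int
    action_mapping: tuple  # carried in the state, never resampled


class FixedMappingDeepSea:
    def __init__(self, size, randomize_actions=True, seed=0):
        rng = random.Random(seed)
        if randomize_actions:
            # Sample the mapping exactly once, when the environment is built.
            mapping = tuple(
                tuple(rng.randint(0, 1) for _ in range(size)) for _ in range(size)
            )
        else:
            # Assumed non-random default: raw action 1 always means "right".
            mapping = tuple((1,) * size for _ in range(size))
        self.size = size
        self._action_mapping = mapping

    def reset(self):
        # Resets only rewind the position; the mapping is reused as-is.
        return EnvState(0, 0, self._action_mapping)

    def step(self, state, action):
        go_right = action == state.action_mapping[state.row][state.column]
        if go_right:
            column = min(state.column + 1, self.size - 1)
        else:
            column = max(state.column - 1, 0)
        return EnvState(state.row + 1, column, state.action_mapping)
```

With this layout, whether transitions are stochastic can be controlled by a separate flag, independent of how the mapping is chosen.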

Finally, the randomize_actions environment parameter is currently unused, and it is unclear to me why the option of sample_action_map exists. Surely randomly generating the action mapping at the start of every episode makes the problem completely impossible to solve?
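
One way to make that intuition concrete, under some simplifying assumptions: transitions are deterministic, the mapping is not observable, each cell's entry is an independent fair coin flip, and the treasure is reached only by moving right at every one of the N steps. Only the N cells on the diagonal path matter then, so all of their possible mappings can be enumerated. The function name `success_probability` is hypothetical.

```python
from itertools import product


def success_probability(policy):
    """Chance that a fixed policy reaches the treasure under a resampled mapping.

    policy[i] is the raw action chosen at the i-th cell of the diagonal path.
    """
    size = len(policy)
    wins = 0
    # Enumerate every possible mapping of the diagonal cells.
    for diagonal in product((0, 1), repeat=size):
        # The agent succeeds only if every chosen action maps to "right".
        wins += all(a == m for a, m in zip(policy, diagonal))
    return wins / 2**size
```

Under these assumptions every fixed policy succeeds with probability 2^-N regardless of which actions it picks, so nothing learned in one episode carries over to the next.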


Pascal314 mentioned this issue Jul 4, 2024

changed the action mappings to match bsuite's implementation #81

Open
