StickyAction wrapper can repeat the old action for more than 1 step #1240
Optional argument that allows repeating the old action for more than 1 step.
test for repeated action for n steps
@sparisi If I understand correctly, this will limit the number of repeat actions that can be taken.
I wouldn't say it "limits" the repeats. There is no limit to the repeats per episode. Example.
The original behavior (n = 1) is still possible. If n = 1 and the repeat is triggered, then |
Ahh, I understand better now, thanks for the description. ALE implements two options for frameskip: a deterministic one (v5, with 4 frames) and a random one (v0, randomly between 2 and 5). This PR adds frameskip for a random value up to X, which is equivalent to the v0 style, randomly between 1 and X.
@pseudo-rnd-thoughts "This PR adds frameskip for a random value up to X" Now I have added the possibility to have stochastic repeats within a range. When the agent starts a series of repeats, the duration is randomly determined within the range passed as an argument to the wrapper.
@sparisi I came back to the PR after a couple of days and hopefully better understand the difference between this and frameskip.
@pseudo-rnd-thoughts |
The StickyAction wrapper now takes an optional argument that allows the old action to be repeated for more than 1 step (default is 1). The original behavior is unchanged.
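For readers skimming the thread, the behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: the class `StickyActionSketch`, its `filter` method, and the parameter names `repeat_action_probability` and `repeat_duration_range` are hypothetical, and the sketch uses only the standard library rather than the Gymnasium wrapper API.

```python
import random


class StickyActionSketch:
    """Illustrative sticky-action logic with multi-step repeats.

    Hypothetical helper, not the wrapper implemented in this PR.
    """

    def __init__(self, repeat_action_probability, repeat_duration_range=(1, 1), seed=None):
        low, high = repeat_duration_range
        if not 0 <= repeat_action_probability < 1:
            raise ValueError("repeat_action_probability must be in [0, 1)")
        if not 1 <= low <= high:
            raise ValueError("repeat_duration_range must satisfy 1 <= low <= high")
        self.p = repeat_action_probability
        self.low, self.high = low, high
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Forget the previous action and any repeat series in progress.
        self.last_action = None
        self.remaining = 0

    def filter(self, action):
        """Return the action that would actually be executed this step."""
        if self.remaining > 0:
            # Inside a repeat series: keep executing the old action.
            self.remaining -= 1
            return self.last_action
        if self.last_action is not None and self.rng.random() < self.p:
            # A repeat is triggered: draw the series length from the range
            # (inclusive); this step counts as the first repeat.
            self.remaining = self.rng.randint(self.low, self.high) - 1
            return self.last_action
        # No repeat: the agent's action goes through and becomes the old action.
        self.last_action = action
        return action


sticky = StickyActionSketch(0.25, repeat_duration_range=(2, 4), seed=0)
executed = [sticky.filter(a) for a in range(20)]
print(executed)
```

With `repeat_duration_range=(1, 1)` the sketch reduces to the usual single-step sticky action, which matches the "original behavior is unchanged" statement above. There is no cap on how many series can be triggered per episode; only the length of each series is bounded by the range.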
Description
This is not a fix but an extra feature. It increases the difficulty of sticky actions and the "non-Markovianity" of the environment: the more steps the action is repeated, the further into the past the agent must look to predict the next state. It can be useful for RL in non-Markov decision processes.
Type of change
Please delete options that are not relevant.
Checklist:
pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)