Fix Determinism #492

Merged: 22 commits into master from deterministic-fix on Oct 11, 2019
Conversation

araffin (Collaborator) commented Sep 29, 2019

closes #145
closes #285
Related: #485

Important note: in order to fix determinism, I had to break the API of the learn() method: I moved the seed argument to the constructor.
I also had to separate the determinism test into its own file to avoid side effects from other tests.
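For reference, a minimal sketch of the new usage, assuming PPO2 and an environment id registered with Gym (illustrative, not copied from the PR diff):

```python
from stable_baselines import PPO2

# The seed is now passed to the constructor instead of learn().
model = PPO2('MlpPolicy', 'CartPole-v1', seed=0)
model.learn(total_timesteps=10000)
```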

araffin added this to the v2.9.0 milestone on Sep 29, 2019
araffin changed the title from "Deterministic fix [WIP]" to "Fix Determinism" on Oct 3, 2019
araffin marked this pull request as ready for review on October 3, 2019 at 16:34
Miffyli (Collaborator) left a comment

  • Seeds default to zero now. IMHO it should default to None, in which case we do not set the seed to anything fixed. This would match the previous behavior and the behavior of other deep learning frameworks (see the sketch after this list).
  • self.seed should be stored in the save file. Looks like just adding it to the list of stored variables should do the trick.
  • The TD3 test is failing for me with the current version of test_0deterministic.py (non-deterministic results; CPU only; Ubuntu 18.04, Python 3.6; ran with pytest tests/test_0deterministic.py).
  • test_0deterministic.py should skip its tests if a GPU is used (the GPU adds randomness that is not fixed yet; running the tests on a GPU makes many algorithms fail).
  • Nitpick: what does 0deterministic stand for in test_0deterministic.py? Deterministic with seed zero? I find this a bit confusing; it could just be test_deterministic.py as before. Edit: or is this to keep the test separate from other tests?
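A minimal sketch of the behaviour suggested in the first point, in the spirit of set_global_seeds (the helper name below is illustrative, not the actual stable-baselines code): only touch the RNGs when a seed is actually provided.

```python
import random

import numpy as np
import tensorflow as tf


def maybe_seed(seed=None):
    """Seed the Python, NumPy and TF (1.x) RNGs only when a seed is given."""
    if seed is None:
        return  # keep the previous, unseeded behaviour
    random.seed(seed)
    np.random.seed(seed)
    tf.set_random_seed(seed)  # graph-level seed (TF 1.x)
```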

Sidenote: other than TD3, the other algorithms pass the test (ran five separate runs, plus two with SEED= just in case).

Sorry for the trailing whitespace.

araffin (Collaborator, Author) commented Oct 5, 2019

> in which case we do not set the seed to anything fixed.

Done.

> self.seed should be stored in the save file. Looks like just adding it to the list of stored variables should do the trick.

Good point ;) (I had that thought afterward too)

> The TD3 test is failing for me with the current version
> Sidenote: other than TD3, the other algorithms pass the test

I think we need to investigate more; TD3 has been the one giving me the most problems. Weirdly enough, it seems to be affected by the previous execution of models (running the TD3 test alone gave different results from running it after the DQN test).
The thing I don't really understand is that the TD3 code is based on SAC, but SAC does not fail... I would appreciate some help ;)

The tests pass on Travis... What is your TF version? Did you try with the Docker image?

> should skip tests if GPU is used

Do you have a code snippet for checking that automatically?
And do you have some pointers for solving that issue properly? (I know it is possible with PyTorch, but I think for TF it won't be possible before TF 2.0.)

> Nitpick: what does 0deterministic stand for in test_0deterministic.py?

Yes, that's a hack to have a separate test that runs before the others (the 0 prefix makes pytest collect that file first); otherwise it would fail. I'm open to any better solution, as I don't like that one.

Miffyli (Collaborator) commented Oct 5, 2019

> I think we need to investigate more; TD3 has been the one giving me the most problems. Weirdly enough, it seems to be affected by the previous execution of models (running the TD3 test alone gave different results from running it after the DQN test).

Now that you mention it, one of my runs randomly failed, with both DQN and TD3 failing out of the blue :o. I can try digging further into this, but I am currently connecting over a mobile connection and SSHing is no fun, so things will be a tad slow ^^.

> The tests pass on Travis... What is your TF version? Did you try with the Docker image?

TensorFlow 1.14.0, without the Docker image (I test things without Docker, as that is how I would personally run them). This is probably the part that causes issues, since stable-baselines is based on an older TF version.

> Do you have a code snippet for checking that automatically?

I do not know about detection, but you could possibly run the tests inside a with tf.device("cpu") block to force everything onto the CPU. As for fixing the issue, I think we should move that to the TF2/next-backend work, given that TF2 is now officially released.
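A minimal sketch of the automatic check asked about above, assuming TF 1.x and pytest (the marker name is illustrative):

```python
import pytest
import tensorflow as tf

# Skip determinism tests when a GPU is visible: GPU ops add
# non-determinism that is not handled yet (TF 1.x API).
skip_if_gpu = pytest.mark.skipif(
    tf.test.is_gpu_available(),
    reason="GPU introduces non-determinism that is not fixed yet",
)


@skip_if_gpu
def test_deterministic_training():
    ...
```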

> Yes, that's a hack to have a separate test that runs before the others; otherwise it would fail. I'm open to any better solution, as I don't like that one.

Aight :). I do not know enough about these automatic systems to be of any help.

Review thread on stable_baselines/a2c/a2c.py (outdated, resolved)
araffin (Collaborator, Author) commented Oct 6, 2019

I did additional tests and it seems the problem comes from TensorFlow: I could not completely reset its state, and reloading it is not supported... so I'm a bit stuck (see the linked issue).
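For context, this is roughly the kind of reset one would attempt between runs in TF 1.x (a sketch under that assumption, not the code from the PR); even with this, some global TensorFlow state does not appear to be fully reset:

```python
import random

import numpy as np
import tensorflow as tf


def reset_state(seed):
    """Best-effort reset of global RNG and graph state between runs (TF 1.x)."""
    tf.reset_default_graph()   # drop ops created by the previous model
    tf.set_random_seed(seed)   # graph-level seed for the fresh default graph
    np.random.seed(seed)
    random.seed(seed)
    return tf.Session()        # new session on the fresh default graph
```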

Miffyli (Collaborator) commented Oct 7, 2019

Sounds like something that can also be better fixed along with TF2 support? At this rate we need a detailed list of things to consider when updating to the new backend :) (I know of a few other suggestions apart from determinism, etc.)

araffin (Collaborator, Author) commented Oct 9, 2019

> Sounds like something that can also be better fixed along with TF2 support?

I hope so. Referencing #366 then.

But I think it still makes sense to merge this partial fix, no? Or do you prefer to wait for a full patch?

Miffyli (Collaborator) commented Oct 9, 2019

Oh yes, we should definitely merge this one too :). It is a clear step in the right direction. I would only add a note that TD3 might not work as expected, while the other algorithms work as intended.

Miffyli (Collaborator) left a comment

LGTM with the new part in the documentation :)

araffin merged commit 8a8baf1 into master on Oct 11, 2019
araffin deleted the deterministic-fix branch on October 11, 2019 at 13:59
bycn commented Mar 6, 2020

Any updates on the TD3 failures? I finally found this thread and am experiencing exactly the same issue: it seems to fail randomly AFTER a certain sequence of runs.

Also, is DDPG affected by this?

araffin mentioned this pull request on May 2, 2020
Successfully merging this pull request may close these issues.

  • Does set_global seed work? [question]
  • Outputs from runs with same random seed are not identical