Hi all,
I can't get the same results twice with TD3, although it works with PPO2.
    import gym

    from stable_baselines import TD3
    from stable_baselines.td3.policies import MlpPolicy

    env = gym.make('Pendulum-v0')
    # env.seed(12345) not working
    # env.reset()

    model = TD3(MlpPolicy, env, verbose=1, seed=12345, n_cpu_tf_sess=1)
    model.learn(total_timesteps=50000, log_interval=10)
    model.save("td3_pendulum")
System Info
Conda env, Python 3.6.9, TF 1.14 CPU from conda-forge, SB 2.10.0 from pip
Additional context
I need to study the effect of parameters on my custom env, but so far I haven't been able to remove the randomness. Any ideas?
Thibaut
Hello,
As mentioned in the documentation (and in PR #492), TD3 sometimes fails to be deterministic for obscure reasons.
Apparently, the TensorFlow version affects the results. I could get deterministic results with TF 1.8.0 on CPU (but not with higher versions...).
PS: I could not find the other duplicated issue yet, but I'm pretty sure there was one...
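Since the TensorFlow version seems to matter here, a small guard at the top of an experiment script can flag an untested version before a long run starts. This is only a sketch: `parse_version`, `is_known_good`, `warn_if_untested_tf` and `KNOWN_GOOD` are hypothetical names, not part of Stable Baselines, and the only version treated as known good is the 1.8.0 CPU setup mentioned above.

```python
import warnings

# Version reported in this thread as giving deterministic TD3 results on CPU.
KNOWN_GOOD = (1, 8, 0)


def parse_version(version):
    """Turn '1.14.0' into (1, 14, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def is_known_good(version):
    """True only for the exact version reported as deterministic."""
    return parse_version(version) == KNOWN_GOOD


def warn_if_untested_tf():
    """Warn when the installed TensorFlow is not the known-good version."""
    # Imported here so the helpers above can be used without TensorFlow.
    import tensorflow as tf

    if not is_known_good(tf.__version__):
        warnings.warn(
            "TD3 determinism was only reported with TF 1.8.0 on CPU; "
            "found TF {}".format(tf.__version__)
        )
```

Calling `warn_if_untested_tf()` before creating the model is enough; it does not make training deterministic, it only reports the mismatch.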
I couldn't find that in the TD3 doc, but OK, I'll try with TF 1.8! Thanks.
See the note here
OK, thanks again! Indeed, it works with TF 1.8.0. I'm closing the issue then!
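For anyone who wants to check this on their own setup, one way is to train twice with the same seed and compare the per-episode rewards. The sketch below is an assumption about how to do that, not the method used in this thread: `first_divergence`, `train_once` and `check_determinism` are hypothetical helpers, the environment is wrapped in the `Monitor` wrapper from `stable_baselines.bench` to record episode rewards, and the timestep count is shortened to keep the check quick.

```python
def first_divergence(rewards_a, rewards_b):
    """Return the index of the first differing entry, or None if identical."""
    for i, (a, b) in enumerate(zip(rewards_a, rewards_b)):
        if a != b:
            return i
    # One run logged more episodes than the other.
    if len(rewards_a) != len(rewards_b):
        return min(len(rewards_a), len(rewards_b))
    return None


def train_once(seed=12345, total_timesteps=5000):
    """Train TD3 on Pendulum-v0 and return the recorded episode rewards."""
    # Imported here so first_divergence can be used without TF installed.
    import gym
    from stable_baselines import TD3
    from stable_baselines.bench import Monitor
    from stable_baselines.td3.policies import MlpPolicy

    # filename=None keeps the episode statistics in memory only.
    env = Monitor(gym.make('Pendulum-v0'), filename=None)
    model = TD3(MlpPolicy, env, seed=seed, n_cpu_tf_sess=1)
    model.learn(total_timesteps=total_timesteps)
    return env.get_episode_rewards()


def check_determinism():
    """Train twice with the same seed and report where the runs diverge."""
    idx = first_divergence(train_once(), train_once())
    if idx is None:
        print("runs are identical")
    else:
        print("runs diverge at episode {}".format(idx))
```

Running `check_determinism()` trains both models in the same process; launching each run in a separate process and comparing the saved rewards would be a stricter test.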