Prepare tf2 [WIP] #580

Closed
wants to merge 28 commits into from

@araffin araffin commented Nov 23, 2019

Description

Prepare the migration to tf2 by deactivating all tests that will fail with tf2.
I also changed the docker image.

Motivation and Context

  • I have raised an issue to propose this change (required for new features and bug fixes)

Related to #366

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)

Checklist:

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have ensured pytest and pytype both pass.

@araffin araffin added the v3 Discussion about V3 label Nov 23, 2019
@araffin araffin changed the base branch from tf2 to master November 24, 2019 21:00
@araffin araffin changed the base branch from master to tf2 November 24, 2019 21:00
Antymon and others added 19 commits November 26, 2019 22:02
* Adding action scaling to and from tanh co-domain as a generic utility.

* Formatting

* Adding action squashing to tanh co-domain for DDPG, TD3 and SAC whenever sampled at random from action_space.

* Unifying other instances of action scaling within SAC, TD3 and DDPG. Adding a test.

* Adding info on fix to changelog.

* Flipping action scaling/unscaling due to confusion by parameter naming. Adding test checking involved algorithms.

* Changing names of local variables for actions, in order to follow naming of used action scaling methods.

* Adding check on scaling inferred actions as well

* Considering learning_starts parameter of SAC and TD3 when checking action scaling.

* Removing misclick addition

* Adding to changelog

* Adding nick to bugfix.

* Removing asserts enforcing symmetric action space (DDPG, TD3, SAC).

* Changelog: non-symmetric action spaces info.

* Test Action Scaling: remove unnecessary wrapping of environment, make action space asymmetric.

* Adding comments

* Missing line break

* Removing unused import.
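
The action scaling described in the commits above can be sketched as follows. This is only an illustration under stated assumptions, not the code merged in this PR: the function names `scale_action` and `unscale_action` and the list-based signatures are hypothetical, and plain Python lists stand in for the action-space bounds. The idea is a linear map between the environment's `[low, high]` range and `[-1, 1]`, the co-domain of tanh, which also works for asymmetric bounds.

```python
from typing import List, Sequence


def scale_action(action: Sequence[float],
                 low: Sequence[float],
                 high: Sequence[float]) -> List[float]:
    """Rescale an action from [low, high] to [-1, 1] (hypothetical helper)."""
    return [2.0 * (a - lo) / (hi - lo) - 1.0
            for a, lo, hi in zip(action, low, high)]


def unscale_action(scaled: Sequence[float],
                   low: Sequence[float],
                   high: Sequence[float]) -> List[float]:
    """Rescale an action from [-1, 1] back to [low, high] (hypothetical helper)."""
    return [lo + 0.5 * (s + 1.0) * (hi - lo)
            for s, lo, hi in zip(scaled, low, high)]


# Asymmetric bounds: nothing here requires low == -high.
low, high = [-2.0, 0.0], [6.0, 1.0]
scaled = scale_action([0.0, 1.0], low, high)
restored = unscale_action(scaled, low, high)
```

Under this sketch, an action sampled at random from the action space would be scaled before being stored, and an action produced by the policy would be unscaled before being sent to the environment.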
* Refactor and clarify doc for load_results (#582)

* Updates from PR feedback

Added unit test for load_results and get_monitor_files
Add @jbulow in changelog

* Update tests/test_monitor.py

Co-Authored-By: Antonin RAFFIN <[email protected]>

* Update tests/test_monitor.py

Co-Authored-By: Antonin RAFFIN <[email protected]>

* Updates from PR feedback

* Updates from PR feedback

* Updates from PR feedback

Convert path object to string to pass pytype's type check.
* Adding PPO_CPP project description.

* Changing PPO_CPP project description.

* Changelog addition on new dependent project.

* Update changelog.rst

* Added section on C++ portability of Tensorflow models
…tribution (#588)

* Fix - sample type inconsistency in CategoricalProbabilityDistribution

* Adding info on fix to changelog.

* Fix - sample type inconsistency (change sample type of CategoricalProbabilityDistribution, MultiCategoricalProbabilityDistribution to tf.int64)

* Change dtype of actions to int64 of ACER

* Update changelog.rst
* Update to use new close API

* Update custom env documentation to reflect new gym close API

* Update changelog.rst

* Clarifies what reset returns

* Update changelog.rst
* PEP8 fixes

* Update changelog.rst
* Update notebooks links + start rl tips

* Update draft

* Add general advice

* Add limitations

* Add which algo to use

* Correct typos and change colab link

* Polish RL evaluation

* Minor edits

* Update changelog

* Update docs/guide/rl_tips.rst

Co-Authored-By: Adam Gleave <[email protected]>

* Update docs/guide/rl_tips.rst

Co-Authored-By: Adam Gleave <[email protected]>

* Update docs/guide/rl_tips.rst

Co-Authored-By: Adam Gleave <[email protected]>

* Update docs/guide/rl_tips.rst

Co-Authored-By: Adam Gleave <[email protected]>

* Update docs/guide/rl_tips.rst

Co-Authored-By: Adam Gleave <[email protected]>

* Add DeepRL course
* Correct typos

* Add spell check when available

* Update changelog

* Fix space

* Fix HER link
* Add Gym Env checker

* Test common failures

* Declare param as unused

* Update tests/test_envs.py

Co-Authored-By: Adam Gleave <[email protected]>

* Update docs/guide/rl_tips.rst

Co-Authored-By: Adam Gleave <[email protected]>

* Update stable_baselines/common/env_checker.py

Co-Authored-By: Adam Gleave <[email protected]>

* Update docs/guide/rl_tips.rst

Co-Authored-By: Adam Gleave <[email protected]>

* Split checks

* Split tests

* Update stable_baselines/common/env_checker.py

Co-Authored-By: Adam Gleave <[email protected]>

* Update stable_baselines/common/env_checker.py

Co-Authored-By: Adam Gleave <[email protected]>

* Update stable_baselines/common/env_checker.py

Co-Authored-By: Adam Gleave <[email protected]>

* Update stable_baselines/common/env_checker.py

Co-Authored-By: Adam Gleave <[email protected]>

* Update stable_baselines/common/env_checker.py

Co-Authored-By: Adam Gleave <[email protected]>

* Update stable_baselines/common/env_checker.py

Co-Authored-By: Adam Gleave <[email protected]>

* Update stable_baselines/common/env_checker.py

Co-Authored-By: Adam Gleave <[email protected]>

* Update stable_baselines/common/env_checker.py

Co-Authored-By: Adam Gleave <[email protected]>

* Update stable_baselines/common/env_checker.py

Co-Authored-By: Adam Gleave <[email protected]>

* Update stable_baselines/common/env_checker.py

Co-Authored-By: Adam Gleave <[email protected]>

* Update stable_baselines/common/env_checker.py

Co-Authored-By: Adam Gleave <[email protected]>

* Reformat files
* VecNormalize: Add public normalize_{obs..,rew} methods

* Update changelog

* VecNormalize: get_original_{obs,rews}

* VecNormalize: Update rewards in reset()

Note that after the _update_rews() refactor, self.ret doesn't
update anymore if `not self.training`.

* update changelog

* renames

* changelog: fix indent

* changelog: nested list needs blank lines

* Add tests

* Address review, fix tests

* update tests

* More annotations

* Update stable_baselines/common/vec_env/vec_normalize.py

Co-Authored-By: Adam Gleave <[email protected]>

* Address review comments

* Defensive copy
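
A rough sketch of what the public normalization methods and the "defensive copy" above might look like. This is not the `VecNormalize` implementation: the class `TinyNormalizer`, its fixed `mean` and `var`, and the `observe` method are assumptions made for illustration. Only the names `normalize_obs` and `get_original_obs` come from the commit messages.

```python
import math
from typing import List, Sequence


class TinyNormalizer:
    """Hypothetical stand-in for a normalizing wrapper with fixed statistics."""

    def __init__(self, mean: Sequence[float], var: Sequence[float],
                 clip_obs: float = 10.0, epsilon: float = 1e-8) -> None:
        self.mean = list(mean)
        self.var = list(var)
        self.clip_obs = clip_obs
        self.epsilon = epsilon
        self.old_obs: List[float] = []

    def normalize_obs(self, obs: Sequence[float]) -> List[float]:
        """Standardize with the stored statistics, then clip to +/- clip_obs."""
        return [
            max(-self.clip_obs,
                min(self.clip_obs, (o - m) / math.sqrt(v + self.epsilon)))
            for o, m, v in zip(obs, self.mean, self.var)
        ]

    def observe(self, obs: Sequence[float]) -> List[float]:
        """Remember the raw observation and return its normalized version."""
        self.old_obs = list(obs)
        return self.normalize_obs(obs)

    def get_original_obs(self) -> List[float]:
        """Return a copy so callers cannot mutate the stored observation."""
        return list(self.old_obs)


norm = TinyNormalizer(mean=[1.0], var=[4.0], epsilon=0.0)
normalized = norm.observe([3.0])   # (3 - 1) / sqrt(4)
clipped = norm.observe([1000.0])   # far outside the clip range
raw = norm.get_original_obs()
raw[0] = 0.0                       # mutating the copy leaves the wrapper intact
```

Returning a copy costs one allocation per call, but it prevents a caller from silently corrupting the wrapper's stored observation.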
* Bump version

* Add a message to PPO2 assert (closes #625)

* Update replay buffer docstring (closes #610)

* Don't specify a version for pytype

* Fix `VecEnv` docstrings (closes #577)

* Typo

* Re-add python version for pytype

araffin commented Jan 11, 2020

@araffin araffin closed this Jan 11, 2020
@araffin araffin deleted the prepare-tf2 branch February 28, 2020 19:29

6 participants