feature(zjow): add new pipeline agent sac/ddpg/a2c #637

Merged 274 commits on Sep 14, 2023
274 commits
bb0df37
polish code
zjowowen Feb 28, 2023
d01558d
polish code
zjowowen Feb 28, 2023
60f47b6
polish code
zjowowen Feb 28, 2023
511d71e
polish code
zjowowen Feb 28, 2023
6a9fd45
polish code
zjowowen Feb 28, 2023
d5573e9
polish code
zjowowen Feb 28, 2023
b7c2011
Merge branch 'main' of https://github.com/zjowowen/DI-engine into ben…
zjowowen Mar 1, 2023
0a167f1
Merge branch 'opendilab:main' into benchmark-3
zjowowen Mar 2, 2023
3906543
fix data type error for mujoco
zjowowen Mar 2, 2023
e665493
polish code
zjowowen Mar 2, 2023
88f5181
polish code
zjowowen Mar 2, 2023
693a4cb
Add features
zjowowen Mar 2, 2023
e6bd0c5
fix base env manager readyimage
zjowowen Mar 3, 2023
cdb9928
polish code
zjowowen Mar 3, 2023
3015a92
remove NoReturn
zjowowen Mar 3, 2023
6e7041b
remove NoReturn
zjowowen Mar 3, 2023
c97a8d4
Merge branch 'main' of https://github.com/zjowowen/DI-engine into ben…
zjowowen Mar 6, 2023
fe415b2
format code
zjowowen Mar 7, 2023
8f808b2
merge from main
zjowowen Mar 7, 2023
3432754
format code
zjowowen Mar 7, 2023
3f6ef3d
polish code
zjowowen Mar 7, 2023
535fd77
polish code
zjowowen Mar 7, 2023
4271610
fix logger
zjowowen Mar 7, 2023
ba0979b
format code
zjowowen Mar 7, 2023
3c19c2c
Merge branch 'main' of https://github.com/zjowowen/DI-engine into ben…
zjowowen Mar 7, 2023
82826e2
format code
zjowowen Mar 7, 2023
da0dd12
Merge branch 'main' of https://github.com/zjowowen/DI-engine into ben…
zjowowen Mar 7, 2023
bb35f90
Merge branch 'main' of https://github.com/zjowowen/DI-engine into ben…
zjowowen Mar 10, 2023
5340658
change api for ckpt; polish code
zjowowen Mar 10, 2023
2d3f6c8
polish code
zjowowen Mar 13, 2023
2e8292c
merge from main
zjowowen Mar 13, 2023
2f883d7
format code
zjowowen Mar 13, 2023
3c15c84
polish code
zjowowen Mar 13, 2023
6ce1421
fix load bug
zjowowen Mar 13, 2023
eac9434
fix bug
zjowowen Mar 13, 2023
6fda31b
fix dtype error
zjowowen Mar 14, 2023
6b9def4
polish code
zjowowen Mar 15, 2023
6f49d0a
polish code
zjowowen Mar 15, 2023
cdafb55
Add dqn agent
zjowowen Mar 16, 2023
b5788e2
merge from main
zjowowen Mar 17, 2023
b181597
add config
zjowowen Mar 17, 2023
b534448
merge from main
zjowowen Mar 20, 2023
1d91b6d
add bonus/c51.py
zhangpaipai Mar 20, 2023
ef5f1d5
add c51 logit monitor
zhangpaipai Mar 27, 2023
87822ba
add sac dqn agent
zjowowen Mar 28, 2023
c86d897
add sac dqn agent demo in dizoo
zjowowen Mar 28, 2023
0f06aa6
merge from main
zjowowen Mar 28, 2023
1973d01
polish format
zjowowen Mar 28, 2023
7a832ee
pull zjow new-pipeline-agent
zhangpaipai Mar 28, 2023
1c111c2
polish code
zjowowen Mar 28, 2023
3a05437
polish code
zjowowen Mar 28, 2023
54b1a09
fix ddpg bug
zjowowen Mar 28, 2023
1caefff
merge nyz c51/dqn config and policy
zhangpaipai Mar 28, 2023
646b005
merge from main
zjowowen Mar 29, 2023
6a8d535
fix config
zjowowen Mar 29, 2023
6fb3534
remove mutistep_trainer
zhangpaipai Mar 29, 2023
c54f220
fix bug
zjowowen Mar 29, 2023
557102e
polish code
zjowowen Mar 29, 2023
95f995c
polish code
zjowowen Mar 29, 2023
c5e9a52
polish code
zjowowen Mar 30, 2023
01b82c7
add Hopper demo
zjowowen Mar 31, 2023
0d60070
polish code
zjowowen Mar 31, 2023
3f3fb68
add property best
zjowowen Apr 3, 2023
49cab88
merge from main
zjowowen Apr 3, 2023
dc5aa8c
add a2c pipeline
zjowowen Apr 3, 2023
ccb2fcf
add sac halfcheetah+walker2d
zhangpaipai Apr 5, 2023
84bef89
pull zjow new-pipeline-agent
zhangpaipai Apr 5, 2023
c937f3b
fix a2c pipeline bug
zjowowen Apr 6, 2023
27ff425
fix pipeline bug
zjowowen Apr 6, 2023
02bc7f0
fix bug
zjowowen Apr 6, 2023
6fb854f
change config
zjowowen Apr 6, 2023
bbf7e2d
merge from main
zjowowen Apr 7, 2023
a76408c
remove IMPALA pipeline
zjowowen Apr 7, 2023
fd7f922
format code
zjowowen Apr 7, 2023
70009ae
polish code
zjowowen Apr 7, 2023
772c354
polish c51 and add ddpg halfcheetah walker2d
zhangpaipai Apr 10, 2023
a7513d8
pull zjow new-pipeline-agent again
zhangpaipai Apr 10, 2023
12d6291
add dizoo/common for zjow to review
zhangpaipai Apr 11, 2023
fec830a
fix agent best method
zjowowen Apr 11, 2023
471aff4
Merge branch 'new-pipeline-agent' of https://github.com/zjowowen/DI-e…
zhangpaipai Apr 11, 2023
8f523e7
reset dizoo
zjowowen Apr 11, 2023
0f5015e
delete common
zhangpaipai Apr 11, 2023
151079c
Merge branch 'new-pipeline-agent' of https://github.com/zjowowen/DI-e…
zhangpaipai Apr 11, 2023
d5cdb1e
polish for zjow to review
zhangpaipai Apr 12, 2023
511dfad
merge from main
zjowowen Apr 13, 2023
d69b165
polish code
zjowowen Apr 13, 2023
b95f340
polish code
zjowowen Apr 13, 2023
b6be677
fix bug
zjowowen Apr 13, 2023
516780b
fix bug
zjowowen Apr 13, 2023
98e4d46
Merge branch 'new-pipeline-agent' of https://github.com/zjowowen/DI-e…
zhangpaipai Apr 14, 2023
93008aa
polish c51
zhangpaipai Apr 14, 2023
83861f8
merge from main
zjowowen Apr 25, 2023
883ce54
Merge branch 'new-pipeline-agent' of https://github.com/zjowowen/DI-e…
zhangpaipai Apr 26, 2023
c7f5ad6
add pg agent
zjowowen Apr 27, 2023
36a7dfa
merge from main
zjowowen Apr 27, 2023
1941951
add pendulum config
zhangpaipai Apr 28, 2023
af7272a
add c51_atari td3_pendulum,bipedalwalker ddpg_pendulum
zhangpaipai Apr 28, 2023
eafeada
polish code
zjowowen Apr 28, 2023
d21839c
merge from main
zjowowen Apr 28, 2023
68a738e
polish code
zjowowen Apr 28, 2023
fdc6408
polish code
zjowowen Apr 28, 2023
c67622d
merge zjow
zhangpaipai Apr 29, 2023
5cf69d6
add bipedalwalker_ddpg_config
zhangpaipai May 5, 2023
995e39c
merge from main
zjowowen May 9, 2023
7e03fc1
merge zjow
zhangpaipai May 9, 2023
f222d42
feature(zp): add c51
zjowowen May 9, 2023
ca63569
change config
zjowowen May 9, 2023
2e8978b
change bipedalwalker config and noframeskip
zhangpaipai May 11, 2023
aa3367d
polish c51-atari name
zhangpaipai May 15, 2023
4b7aa50
add pong spaceinvaders and qbert for dqn
ruoyuGao May 15, 2023
134e3e5
merge from main
zjowowen May 16, 2023
4c08017
git fetch
zjowowen May 16, 2023
efc807e
polish code
zjowowen May 16, 2023
eed925f
polish code; add env mode
zjowowen May 16, 2023
f37f65b
add rew_clip in ding_env_wrapper
zhangpaipai May 16, 2023
59cc61b
polish dqn atari
zhangpaipai May 19, 2023
4b2ffcd
merge from main
zjowowen May 23, 2023
8b04a11
merge from new-pipeline-agent
zjowowen May 23, 2023
b1aab8d
add a2c continuous action space
zjowowen May 23, 2023
0584404
add a2c continuous action space
zjowowen May 23, 2023
f651f68
add a2c continuous for mujoco
zjowowen May 23, 2023
92bfff3
add a2c continuous for mujoco
zjowowen May 23, 2023
a72de14
add a2c continuous for mujoco
zjowowen May 23, 2023
4e59519
add a2c mujoco config; add ppo atari config
zjowowen May 24, 2023
f104d81
add a2c mujoco config; add ppo atari config
zjowowen May 24, 2023
308e25a
fix a2c deploy bug
zjowowen May 24, 2023
522b0ff
Add bipedalwalker a2c
zjowowen May 25, 2023
1e87b1d
polish code
zjowowen May 25, 2023
06f4046
polish code
zjowowen May 25, 2023
7fc7032
polish code
zjowowen May 25, 2023
bb74395
polish code
zjowowen May 25, 2023
5ea9233
polish code
zjowowen May 29, 2023
59b7080
add pendulum a2c+pg
zhangpaipai May 30, 2023
d96ce90
Merge branch 'new-pipeline-agent' of https://github.com/zjowowen/DI-e…
zhangpaipai May 30, 2023
e6e100b
add pg bipedalwalker+mujoco
zhangpaipai May 30, 2023
f69c448
polish code for wandb sweep
zjowowen May 30, 2023
98877de
polish code for wandb sweep
zjowowen May 30, 2023
dbec6a7
polish code for wandb sweep
zjowowen May 30, 2023
d2d7e8e
polish code for a2c mujoco
zjowowen May 30, 2023
168fd41
add pg pendulum new pipeline
zhangpaipai May 31, 2023
de2d180
fix scalar action bug in random collect
zjowowen May 31, 2023
1a2d4dd
polish pg algorithm
zhangpaipai Jun 1, 2023
498c094
Merge branch 'new-pipeline-agent' of https://github.com/zjowowen/DI-e…
zhangpaipai Jun 1, 2023
1221565
add bonus pg config
zhangpaipai Jun 1, 2023
1018dda
polish pg config
zhangpaipai Jun 1, 2023
f197844
polish config
zjowowen Jun 1, 2023
2e578ad
merge from main
zjowowen Jun 1, 2023
0bc6923
polish code
zjowowen Jun 1, 2023
5473706
change pendulum pg config
zhangpaipai Jun 1, 2023
db8176b
fix continuous action dim=1 bug
zjowowen Jun 1, 2023
43b0c3e
merge from main
zjowowen Jun 1, 2023
44a3047
merge from origin main
zjowowen Jun 1, 2023
d16fa86
Add ppof lr scheduler
zjowowen Jun 5, 2023
eb86c63
polish config
zjowowen Jun 6, 2023
eab7912
fix random collect bug for dqn
zjowowen Jun 6, 2023
98a9017
polish ppo qbert spaceinvader config
zjowowen Jun 7, 2023
b52d8f1
remove mujoco wrapper
zjowowen Jun 9, 2023
8b15b52
polish a2c mujoco config; add ppo offpolicy agent pipeline
zjowowen Jun 9, 2023
c915f33
merge from main
zjowowen Jun 9, 2023
dc61317
Add wandb monitor evaluate return std
zjowowen Jun 9, 2023
ea5f1e7
polish deploy method
zjowowen Jun 9, 2023
35a21b4
format code
zjowowen Jun 9, 2023
f95e8eb
polish code
zjowowen Jun 13, 2023
603fa5e
polish pg pendulum+hopper config
zhangpaipai Jun 13, 2023
6b874dc
Merge branch 'new-pipeline-agent' of https://github.com/zjowowen/DI-e…
zhangpaipai Jun 13, 2023
ddd6550
fix data shape bug
zjowowen Jun 13, 2023
d25228e
merge from remote
zjowowen Jun 13, 2023
df7963d
merge from main
zjowowen Jun 13, 2023
a7c3cf4
fix ppo offpolicy deploy bug
zjowowen Jun 13, 2023
ea979e8
fix mujoco reward action env clip bug
zjowowen Jun 13, 2023
ff7f639
fix mujoco reward action env clip bug
zjowowen Jun 13, 2023
ed5b1a3
fix deploy env mode bug
zjowowen Jun 14, 2023
05f8c47
fix env reset bug for deployment and evaluation
zjowowen Jun 14, 2023
df20033
Add ppo offpolicy atari config
zjowowen Jun 25, 2023
4cc8eac
merge from main
zjowowen Jun 27, 2023
5ecc9dc
polish config
zjowowen Jun 29, 2023
1e5ec1a
merge from main
zjowowen Jul 9, 2023
c621c35
polish config code
zjowowen Jul 10, 2023
41786e3
polish code; add SQL
zjowowen Jul 10, 2023
ebcefb4
polish code
zjowowen Jul 10, 2023
420ef72
polish code
zjowowen Jul 10, 2023
d9d93dd
polish code
zjowowen Jul 10, 2023
aa1f39d
polish code
zjowowen Jul 10, 2023
653a00b
change config path
zjowowen Jul 11, 2023
57e7325
add compatibility fix for nstep
zjowowen Jul 11, 2023
0754dd9
polish code
zjowowen Jul 11, 2023
0919f06
Add ppo offpolicy continuous policy
zjowowen Jul 12, 2023
d958f49
polish config
zjowowen Jul 14, 2023
ab0fdda
add ppo offpolicy general action modeling
zjowowen Jul 17, 2023
0c1f2b6
add dependencies
zjowowen Jul 17, 2023
9336a0a
polish config
zjowowen Jul 18, 2023
ced06f8
polish deploy
zjowowen Jul 18, 2023
a8822fd
Add array video helper
zjowowen Jul 18, 2023
8d152e0
polish deploy
zjowowen Jul 18, 2023
2e2db04
merge from main
zjowowen Jul 19, 2023
e063d77
polish config
zjowowen Jul 19, 2023
afb6355
polish setup
zjowowen Jul 20, 2023
0863b0b
fix config bug
zjowowen Jul 22, 2023
c934ef6
polish code
zjowowen Jul 25, 2023
a1f3e94
polish code
zjowowen Jul 25, 2023
da9d2c1
polish code
zjowowen Jul 25, 2023
af3d101
merge from main
zjowowen Jul 26, 2023
92d9504
fix bug in evaluator
zjowowen Jul 26, 2023
1f0704c
polish code
zjowowen Jul 27, 2023
5a08ec7
polish code
zjowowen Jul 27, 2023
1774224
merge from main
zjowowen Aug 3, 2023
02f90cf
merge from main
zjowowen Aug 8, 2023
c9e736a
polish code
zjowowen Aug 8, 2023
b87eabc
merge from main
zjowowen Aug 15, 2023
3addb8b
merge from main
zjowowen Aug 21, 2023
7ee7553
merge from main
zjowowen Aug 22, 2023
26b733d
merge from main
zjowowen Aug 28, 2023
e318d34
merge from main
zjowowen Aug 29, 2023
9e1a17c
fix bug in ckpt_saver order
zjowowen Aug 29, 2023
dc1af4d
fix format
zjowowen Aug 30, 2023
d857516
fix format
zjowowen Aug 30, 2023
489819f
fix format
zjowowen Aug 30, 2023
927f367
fix bug in reward shape
zjowowen Aug 30, 2023
fc3350f
format type
zjowowen Aug 30, 2023
f450936
merge from main
zjowowen Sep 4, 2023
0d9b73e
polish code
zjowowen Sep 4, 2023
e5a7ac6
polish code
zjowowen Sep 4, 2023
6ba49c4
fix nstep error for ppo offpolicy
zjowowen Sep 5, 2023
d36d395
fix bug in action shape of cql when dim is 1
zjowowen Sep 6, 2023
338b23f
Merge branch 'main' of https://github.com/zjowowen/DI-engine into new…
zjowowen Sep 8, 2023
f0263b0
Merge branch 'main' of https://github.com/zjowowen/DI-engine into new…
zjowowen Sep 11, 2023
c0e0d31
polish code
zjowowen Sep 11, 2023
7e61686
delete config not work
zjowowen Sep 11, 2023
fdf7738
polish code; remove ppof general datatype
zjowowen Sep 11, 2023
eec560b
remove useless code
zjowowen Sep 11, 2023
b53cd84
polish code
zjowowen Sep 11, 2023
e8495ea
polish code
zjowowen Sep 11, 2023
a4fc129
fix a2c unittest
zjowowen Sep 11, 2023
9985ade
fix advantages_estimator unittest
zjowowen Sep 11, 2023
016a340
fix combination_argmax_sample unittest
zjowowen Sep 11, 2023
16c5365
fix unittest bug
zjowowen Sep 11, 2023
9746afd
fix unittest bug
zjowowen Sep 11, 2023
01d5399
fix wandb logger unittest bug
zjowowen Sep 11, 2023
1a46c04
polish code
zjowowen Sep 12, 2023
2b742b8
move config position
zjowowen Sep 12, 2023
e1edb6a
remove useless config
zjowowen Sep 12, 2023
90b6d27
polish code
zjowowen Sep 12, 2023
a28c3f0
add unittest for montecarlo_return_estimator
zjowowen Sep 12, 2023
b538f37
fix bug in termination checker
zjowowen Sep 12, 2023
e283e0c
polish code
zjowowen Sep 12, 2023
46b728e
Merge branch 'main' into new-pipeline-agent
PaParaZz1 Sep 13, 2023
c6a34a6
polish code
zjowowen Sep 13, 2023
626089a
polish code
zjowowen Sep 13, 2023
d4b0c5a
polish code
zjowowen Sep 13, 2023
2ed3036
polish code
zjowowen Sep 13, 2023
132 changes: 131 additions & 1 deletion ding/bonus/__init__.py
@@ -1,2 +1,132 @@
import ding.config
from .a2c import A2CAgent
from .c51 import C51Agent
from .ddpg import DDPGAgent
from .dqn import DQNAgent
from .pg import PGAgent
from .ppof import PPOF
-from .td3 import TD3OffPolicyAgent
from .ppo_offpolicy import PPOOffPolicyAgent
from .sac import SACAgent
from .sql import SQLAgent
from .td3 import TD3Agent

supported_algo = dict(
    A2C=A2CAgent,
    C51=C51Agent,
    DDPG=DDPGAgent,
    DQN=DQNAgent,
    PG=PGAgent,
    PPOF=PPOF,
    PPOOffPolicy=PPOOffPolicyAgent,
    SAC=SACAgent,
    SQL=SQLAgent,
    TD3=TD3Agent,
)

supported_algo_list = list(supported_algo.keys())


def env_supported(algo: str = None) -> list:
    """
    Return the list of envs supported by DI-engine.
    """

    if algo is not None:
        if algo.upper() == "A2C":
            return list(ding.config.example.A2C.supported_env.keys())
        elif algo.upper() == "C51":
            return list(ding.config.example.C51.supported_env.keys())
        elif algo.upper() == "DDPG":
            return list(ding.config.example.DDPG.supported_env.keys())
        elif algo.upper() == "DQN":
            return list(ding.config.example.DQN.supported_env.keys())
        elif algo.upper() == "PG":
            return list(ding.config.example.PG.supported_env.keys())
        elif algo.upper() == "PPOF":
            return list(ding.config.example.PPOF.supported_env.keys())
        elif algo.upper() == "PPOOFFPOLICY":
            return list(ding.config.example.PPOOffPolicy.supported_env.keys())
        elif algo.upper() == "SAC":
            return list(ding.config.example.SAC.supported_env.keys())
        elif algo.upper() == "SQL":
            return list(ding.config.example.SQL.supported_env.keys())
        elif algo.upper() == "TD3":
            return list(ding.config.example.TD3.supported_env.keys())
        else:
            raise ValueError("The algo {} is not supported by di-engine.".format(algo))
    else:
        supported_env = set()
        supported_env.update(ding.config.example.A2C.supported_env.keys())
        supported_env.update(ding.config.example.C51.supported_env.keys())
        supported_env.update(ding.config.example.DDPG.supported_env.keys())
        supported_env.update(ding.config.example.DQN.supported_env.keys())
        supported_env.update(ding.config.example.PG.supported_env.keys())
        supported_env.update(ding.config.example.PPOF.supported_env.keys())
        supported_env.update(ding.config.example.PPOOffPolicy.supported_env.keys())
        supported_env.update(ding.config.example.SAC.supported_env.keys())
        supported_env.update(ding.config.example.SQL.supported_env.keys())
        supported_env.update(ding.config.example.TD3.supported_env.keys())
        # return the list of the envs
        return list(supported_env)


supported_env = env_supported()


def algo_supported(env_id: str = None) -> list:
    """
    Return the list of algos supported by DI-engine.
    """
    if env_id is not None:
        algo = []
        if env_id.upper() in [item.upper() for item in ding.config.example.A2C.supported_env.keys()]:
            algo.append("A2C")
        if env_id.upper() in [item.upper() for item in ding.config.example.C51.supported_env.keys()]:
            algo.append("C51")
        if env_id.upper() in [item.upper() for item in ding.config.example.DDPG.supported_env.keys()]:
            algo.append("DDPG")
        if env_id.upper() in [item.upper() for item in ding.config.example.DQN.supported_env.keys()]:
            algo.append("DQN")
        if env_id.upper() in [item.upper() for item in ding.config.example.PG.supported_env.keys()]:
            algo.append("PG")
        if env_id.upper() in [item.upper() for item in ding.config.example.PPOF.supported_env.keys()]:
            algo.append("PPOF")
        if env_id.upper() in [item.upper() for item in ding.config.example.PPOOffPolicy.supported_env.keys()]:
            algo.append("PPOOffPolicy")
        if env_id.upper() in [item.upper() for item in ding.config.example.SAC.supported_env.keys()]:
            algo.append("SAC")
        if env_id.upper() in [item.upper() for item in ding.config.example.SQL.supported_env.keys()]:
            algo.append("SQL")
        if env_id.upper() in [item.upper() for item in ding.config.example.TD3.supported_env.keys()]:
            algo.append("TD3")

        if len(algo) == 0:
            raise ValueError("The env {} is not supported by di-engine.".format(env_id))
        return algo
    else:
        return supported_algo_list


def is_supported(env_id: str = None, algo: str = None) -> bool:
    """
    Check if the env-algo pair is supported by DI-engine.
    """
    # supported_env is a list (no .keys()), and names may be mixed-case,
    # so every membership test upper-cases both sides.
    if env_id is not None and env_id.upper() in [item.upper() for item in supported_env]:
        if algo is not None and algo.upper() in [item.upper() for item in supported_algo_list]:
            if env_id.upper() in [item.upper() for item in env_supported(algo)]:
                return True
            else:
                return False
        elif algo is None:
            return True
        else:
            return False
    elif env_id is None:
        if algo is not None and algo.upper() in [item.upper() for item in supported_algo_list]:
            return True
        elif algo is None:
            raise ValueError("Please specify the env or algo.")
        else:
            return False
    else:
        return False
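
The three helpers in this file repeat one branch per algorithm, so each new agent needs the same line added in several places. Below is a minimal sketch of a table-driven alternative, not the code merged in this PR. `EXAMPLE_ENVS` and `canonical_algo` are hypothetical names, and the env ids are illustrative stand-ins for `ding.config.example.<ALGO>.supported_env`, so the snippet runs without DI-engine installed.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for ding.config.example.<ALGO>.supported_env;
# the env ids are illustrative, not the lists shipped with DI-engine.
EXAMPLE_ENVS: Dict[str, List[str]] = {
    "A2C": ["LunarLander-v2", "Hopper-v3"],
    "DQN": ["LunarLander-v2", "PongNoFrameskip-v4"],
    "PPOOffPolicy": ["LunarLander-v2"],
}


def canonical_algo(algo: str) -> Optional[str]:
    """Map a case-insensitive algorithm name to its registry key, or None."""
    for name in EXAMPLE_ENVS:
        if name.upper() == algo.upper():
            return name
    return None


def env_supported(algo: Optional[str] = None) -> List[str]:
    """Return the envs of one algorithm, or the sorted union over all of them."""
    if algo is None:
        return sorted({env for envs in EXAMPLE_ENVS.values() for env in envs})
    name = canonical_algo(algo)
    if name is None:
        raise ValueError("The algo {} is not supported.".format(algo))
    return list(EXAMPLE_ENVS[name])


def algo_supported(env_id: Optional[str] = None) -> List[str]:
    """Return the algorithms that list env_id, or every algorithm."""
    if env_id is None:
        return list(EXAMPLE_ENVS)
    algos = [
        name for name, envs in EXAMPLE_ENVS.items()
        if env_id.upper() in (env.upper() for env in envs)
    ]
    if not algos:
        raise ValueError("The env {} is not supported.".format(env_id))
    return algos


def is_supported(env_id: Optional[str] = None, algo: Optional[str] = None) -> bool:
    """Check an env, an algorithm, or an env-algorithm pair, ignoring case."""
    if env_id is None and algo is None:
        raise ValueError("Please specify the env or algo.")
    if algo is not None and canonical_algo(algo) is None:
        return False
    if env_id is None:
        return True
    # With algo=None this checks against the union of all envs.
    return env_id.upper() in (env.upper() for env in env_supported(algo))
```

With this layout, adding an algorithm means adding one entry to the table, and mixed-case names such as `PPOOffPolicy` are matched the same way in all three helpers.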