Commit

Update environment creation tutorials (#1082)
Co-authored-by: ggsavin <[email protected]>
elliottower and ggsavin committed Sep 1, 2023
1 parent f0c94c6 commit c3dc056
Showing 33 changed files with 114 additions and 27 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/linux-tutorials-test.yml
@@ -19,7 +19,7 @@ jobs:
       fail-fast: false
       matrix:
         python-version: ['3.8', '3.9', '3.10', '3.11']
-        tutorial: ['Tianshou', 'EnvironmentCreation', 'CleanRL', 'SB3/kaz', 'SB3/waterworld', 'SB3/connect_four', 'SB3/test'] # TODO: add back Ray once next release after 2.6.2
+        tutorial: ['Tianshou', 'CustomEnvironment', 'CleanRL', 'SB3/kaz', 'SB3/waterworld', 'SB3/connect_four', 'SB3/test'] # TODO: add back Ray once next release after 2.6.2
     steps:
       - uses: actions/checkout@v3
       - name: Set up Python ${{ matrix.python-version }}
4 changes: 4 additions & 0 deletions .pre-commit-config.yaml
@@ -75,3 +75,7 @@ repos:
         additional_dependencies: ["pyright"]
         args:
           - --project=pyproject.toml
+  - repo: https://github.com/python-jsonschema/check-jsonschema
+    rev: 0.26.3
+    hooks:
+      - id: check-github-workflows
16 changes: 16 additions & 0 deletions docs/code_examples/aec_rps_usage.py
@@ -0,0 +1,16 @@
+import aec_rps
+
+env = aec_rps.env(render_mode="human")
+env.reset(seed=42)
+
+for agent in env.agent_iter():
+    observation, reward, termination, truncation, info = env.last()
+
+    if termination or truncation:
+        action = None
+    else:
+        # this is where you would insert your policy
+        action = env.action_space(agent).sample()
+
+    env.step(action)
+env.close()
2 changes: 2 additions & 0 deletions docs/code_examples/parallel_rps.py
@@ -130,6 +130,7 @@ def reset(self, seed=None, options=None):
         self.num_moves = 0
         observations = {agent: NONE for agent in self.agents}
         infos = {agent: {} for agent in self.agents}
+        self.state = observations

         return observations, infos

@@ -165,6 +166,7 @@ def step(self, actions):
             self.agents[i]: int(actions[self.agents[1 - i]])
             for i in range(len(self.agents))
         }
+        self.state = observations

         # typically there won't be any information in the infos, but there must
         # still be an entry for each agent
11 changes: 11 additions & 0 deletions docs/code_examples/parallel_rps_usage.py
@@ -0,0 +1,11 @@
+import parallel_rps
+
+env = parallel_rps.parallel_env(render_mode="human")
+observations, infos = env.reset()
+
+while env.agents:
+    # this is where you would insert your policy
+    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
+
+    observations, rewards, terminations, truncations, infos = env.step(actions)
+env.close()
19 changes: 19 additions & 0 deletions docs/content/environment_creation.md
@@ -5,6 +5,11 @@ title: Environment Creation

 This documentation overviews creating new environments and relevant useful wrappers, utilities and tests included in PettingZoo designed for the creation of new environments.

+
+We will walk through the creation of a simple Rock-Paper-Scissors environment, with example code for both [AEC](/api/aec/) and [Parallel](/api/parallel/) environments.
+
+See our [Custom Environment Tutorial](/tutorials/custom_environment/index) for a full walkthrough on creating custom environments, including complex environment logic and illegal action masking.
+
 ## Example Custom Environment

 This is a carefully commented version of the PettingZoo rock paper scissors environment.
@@ -14,13 +19,27 @@ This is a carefully commented version of the PettingZoo rock paper scissors envi
    :language: python
 ```

+To interact with your custom AEC environment, use the following code:
+
+```{eval-rst}
+.. literalinclude:: ../code_examples/aec_rps_usage.py
+   :language: python
+```
+
 ## Example Custom Parallel Environment

 ```{eval-rst}
 .. literalinclude:: ../code_examples/parallel_rps.py
    :language: python
 ```

+To interact with your custom parallel environment, use the following code:
+
+```{eval-rst}
+.. literalinclude:: ../code_examples/parallel_rps_usage.py
+   :language: python
+```
+
 ## Using Wrappers

 A wrapper is an environment transformation that takes in an environment as input, and outputs a new environment that is similar to the input environment, but with some transformation or validation applied. PettingZoo provides [wrappers to convert environments](/api/pz_wrappers) back and forth between the AEC API and the Parallel API and a set of simple [utility wrappers](/api/pz_wrappers) which provide input validation and other convenient reusable logic. PettingZoo also includes [wrappers](/api/supersuit_wrappers) via the SuperSuit companion package (`pip install supersuit`).
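
The wrapper idea described in this paragraph can be sketched in plain Python. This is only an illustration: `ToyParallelEnv` and `ActionValidationWrapper` are hypothetical names made up for this note, not PettingZoo classes, and the toy environment only mimics the dict-per-agent shape of a parallel environment.

```python
class ToyParallelEnv:
    """Hypothetical stand-in for a parallel environment (not a PettingZoo class)."""

    def __init__(self):
        self.agents = ["player_0", "player_1"]
        self.valid_actions = {0, 1, 2}  # rock, paper, scissors

    def step(self, actions):
        # Each agent observes the move made by its opponent.
        return {
            self.agents[i]: actions[self.agents[1 - i]]
            for i in range(len(self.agents))
        }


class ActionValidationWrapper:
    """Hypothetical wrapper: same interface as the wrapped env, plus input checks."""

    def __init__(self, env):
        self.env = env

    def __getattr__(self, name):
        # Anything not overridden here is delegated to the wrapped environment.
        return getattr(self.env, name)

    def step(self, actions):
        missing = set(self.env.agents) - set(actions)
        if missing:
            raise ValueError(f"missing actions for agents: {sorted(missing)}")
        for agent, action in actions.items():
            if action not in self.env.valid_actions:
                raise ValueError(f"invalid action {action!r} for {agent}")
        return self.env.step(actions)


wrapped = ActionValidationWrapper(ToyParallelEnv())
obs = wrapped.step({"player_0": 0, "player_1": 2})
```

The wrapped object is used exactly like the original environment; only `step` gains validation, which is the "similar environment with some validation applied" behaviour the paragraph describes.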
2 changes: 1 addition & 1 deletion docs/index.md
@@ -39,7 +39,7 @@ environments/third_party_envs
 :hidden:
 :caption: Tutorials
-tutorials/environmentcreation/index
+tutorials/custom_environment/index
 tutorials/cleanrl/index
 tutorials/tianshou/index
 tutorials/rllib/index

@@ -1,10 +1,10 @@
 ---
-title: "Environment Creation"
+title: "Custom Environment Tutorial"
 ---

-# Environment Creation Tutorial
+# Custom Environment Tutorial

-These tutorials walk you though creating a custom environment from scratch, and are recommended as a starting point for anyone new to PettingZoo.
+These tutorials walk you through the full process of creating a custom environment from scratch, and are recommended as a starting point for anyone new to PettingZoo.

 1. [Project Structure](/tutorials/environmentcreation/1-project-structure.md)

@@ -14,6 +14,8 @@ These tutorials walk you though creating a custom environment from scratch, and

 4. [Testing Your Environment](/tutorials/environmentcreation/4-testing-your-environment.md)

+For a simpler example environment, including both [AEC](/api/aec/) and [Parallel](/api/parallel/) implementations, see our [Environment Creation](/content/environment_creation/) documentation.
+

 ```{toctree}
 :hidden:
2 changes: 1 addition & 1 deletion pettingzoo/__init__.py
@@ -12,7 +12,7 @@

 os.environ["PYGAME_HIDE_SUPPORT_PROMPT"] = "hide"

-__version__ = "1.24.0"
+__version__ = "1.24.1"

 try:
     import sys
4 changes: 4 additions & 0 deletions pettingzoo/test/parallel_test.py
@@ -46,8 +46,11 @@ def parallel_api_test(par_env: ParallelEnv, num_cycles=1000):
     MAX_RESETS = 2
     for _ in range(MAX_RESETS):
         obs, infos = par_env.reset()
+
         assert isinstance(obs, dict)
+        assert isinstance(infos, dict)
         assert set(obs.keys()) == (set(par_env.agents))
+        assert set(infos.keys()) == (set(par_env.agents))
         terminated = {agent: False for agent in par_env.agents}
         truncated = {agent: False for agent in par_env.agents}
         live_agents = set(par_env.agents[:])
@@ -127,3 +130,4 @@ def parallel_api_test(par_env: ParallelEnv, num_cycles=1000):

             if len(live_agents) == 0:
                 break
+    print("Passed Parallel API test")
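
The reset-time assertions in this file can be illustrated in isolation. The sketch below assumes nothing beyond the four checks visible in the hunk; `check_reset_output` is a hypothetical helper written for this note and is not part of `pettingzoo.test`.

```python
def check_reset_output(obs, infos, agents):
    """Hypothetical helper mirroring the reset-time assertions in the hunk above."""
    assert isinstance(obs, dict), "observations must be a dict"
    assert isinstance(infos, dict), "infos must be a dict"
    assert set(obs.keys()) == set(agents), "observations need one key per agent"
    assert set(infos.keys()) == set(agents), "infos need one key per agent"


agents = ["prisoner", "guard"]
observations = {a: 0 for a in agents}

# Passes: one (possibly empty) info dict per agent.
check_reset_output(observations, {a: {} for a in agents}, agents)

# Fails: a bare empty dict has no per-agent keys.
try:
    check_reset_output(observations, {}, agents)
    empty_infos_rejected = False
except AssertionError:
    empty_infos_rejected = True
```

An environment whose `reset` returns `observations, {}` would therefore trip the new `infos` key check, which is why the tutorial environments in this commit switch to a per-agent dict.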
14 changes: 14 additions & 0 deletions pettingzoo/utils/conversions.py
@@ -1,3 +1,4 @@
+# pyright: reportGeneralTypeIssues=false
 import copy
 import warnings
 from collections import defaultdict
@@ -304,6 +305,19 @@ def reset(self, seed=None, options=None):
         self.terminations = {agent: False for agent in self.agents}
         self.truncations = {agent: False for agent in self.agents}
         self.rewards = {agent: 0 for agent in self.agents}
+
+        # Every environment needs to return infos that contain self.agents as their keys
+        if not self.infos:
+            warnings.warn(
+                "The `infos` dictionary returned by `env.reset` was empty. Overwriting: agent IDs will be used as keys"
+            )
+            self.infos = {agent: {} for agent in self.agents}
+        elif set(self.infos.keys()) != set(self.agents):
+            self.infos = {agent: self.infos.copy() for agent in self.agents}
+            warnings.warn(
+                f"The `infos` dictionary returned by `env.reset()` is not valid: must contain keys for each agent defined in self.agents: {self.agents}. Overwriting with current info duplicated for each agent: {self.infos}"
+            )
+
         self._cumulative_rewards = {agent: 0 for agent in self.agents}
         self.new_agents = []
         self.new_values = {}
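
The fallback logic in this hunk can be sketched as a standalone function. `normalize_infos` is a hypothetical name used only for this note, and duplicating the shared dict for every agent is one reading of the intent stated in the warning text, not a statement of the library's final behaviour.

```python
import warnings


def normalize_infos(infos, agents):
    """Return an infos dict keyed by agent (hypothetical helper, not a PettingZoo API)."""
    if not infos:
        warnings.warn("`infos` was empty; using agent IDs as keys.")
        return {agent: {} for agent in agents}
    if set(infos.keys()) != set(agents):
        warnings.warn("`infos` keys do not match the agents; duplicating per agent.")
        # Wrapping the copy in extra braces, {infos.copy()}, would build a set
        # containing a dict and raise TypeError, because dicts are unhashable.
        return {agent: infos.copy() for agent in agents}
    return infos


agents = ["player_0", "player_1"]
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    from_empty = normalize_infos({}, agents)
    from_global = normalize_infos({"seed": 42}, agents)
valid = normalize_infos({a: {} for a in agents}, agents)
```

Each agent receives its own copy of the shared dict, so later per-agent mutations do not leak between agents.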
2 changes: 1 addition & 1 deletion tutorials/CleanRL/requirements.txt
@@ -1,4 +1,4 @@
-pettingzoo[butterfly,atari,testing]>=1.23.1
+pettingzoo[butterfly,atari,testing]>=1.24.0
 SuperSuit>=3.9.0
 tensorboard>=2.11.2
 torch>=1.13.1
1 change: 1 addition & 0 deletions tutorials/CustomEnvironment/requirements.txt
@@ -0,0 +1 @@
+pettingzoo==1.24.0

@@ -44,7 +44,11 @@ def reset(self, seed=None, options=None):
             )
             for a in self.agents
         }
-        return observations, {}
+
+        # Get dummy infos. Necessary for proper parallel_to_aec conversion
+        infos = {a: {} for a in self.agents}
+
+        return observations, infos

     def step(self, actions):
         # Execute actions
@@ -85,7 +89,6 @@ def step(self, actions):
         if self.timestep > 100:
             rewards = {"prisoner": 0, "guard": 0}
             truncations = {"prisoner": True, "guard": True}
-            self.agents = []
         self.timestep += 1

         # Get observations
@@ -101,6 +104,9 @@ def step(self, actions):
         # Get dummy infos (not used in this example)
         infos = {a: {} for a in self.agents}

+        if any(terminations.values()) or all(truncations.values()):
+            self.agents = []
+
         return observations, rewards, terminations, truncations, infos

     def render(self):
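
The position of `self.agents = []` matters here because `infos` is built by iterating over `self.agents`. The following is a simplified sketch of that ordering effect, not the tutorial's actual code: `finish_step` and its `clear_first` flag are hypothetical and exist only to compare the two orderings side by side.

```python
def finish_step(agents, terminations, truncations, clear_first):
    """Hypothetical sketch of the end of `step`; returns (infos, remaining agents)."""
    done = any(terminations.values()) or all(truncations.values())
    if clear_first and done:
        agents = []  # agent list emptied before infos are built
    infos = {a: {} for a in agents}
    if not clear_first and done:
        agents = []  # agent list emptied after infos are built
    return infos, agents


agents = ["prisoner", "guard"]
terminations = {"prisoner": False, "guard": False}
truncations = {"prisoner": True, "guard": True}

early_infos, _ = finish_step(agents, terminations, truncations, clear_first=True)
late_infos, remaining = finish_step(agents, terminations, truncations, clear_first=False)
```

Clearing the list first leaves `infos` empty on the final step, while clearing it last keeps one entry per agent and still ends a `while env.agents:` loop.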

@@ -8,7 +8,7 @@
 from pettingzoo import ParallelEnv


-class CustomEnvironment(ParallelEnv):
+class CustomActionMaskedEnvironment(ParallelEnv):
     metadata = {
         "name": "custom_environment_v0",
     }
@@ -45,7 +45,11 @@ def reset(self, seed=None, options=None):
             "prisoner": {"observation": observation, "action_mask": [0, 1, 1, 0]},
             "guard": {"observation": observation, "action_mask": [1, 0, 0, 1]},
         }
-        return observations, {}
+
+        # Get dummy infos. Necessary for proper parallel_to_aec conversion
+        infos = {a: {} for a in self.agents}
+
+        return observations, infos

     def step(self, actions):
         # Execute actions
11 changes: 11 additions & 0 deletions tutorials/CustomEnvironment/tutorial4_testing_the_environment.py
@@ -0,0 +1,11 @@
+from tutorial2_adding_game_logic import CustomEnvironment
+from tutorial3_action_masking import CustomActionMaskedEnvironment
+
+from pettingzoo.test import parallel_api_test
+
+if __name__ == "__main__":
+    env = CustomEnvironment()
+    parallel_api_test(env, num_cycles=1_000_000)
+
+    env = CustomActionMaskedEnvironment()
+    parallel_api_test(env, num_cycles=1_000_000)
Empty file.
Empty file.
Empty file.
1 change: 0 additions & 1 deletion tutorials/EnvironmentCreation/requirements.txt

This file was deleted.

This file was deleted.

7 changes: 4 additions & 3 deletions tutorials/Ray/requirements.txt
@@ -1,6 +1,7 @@
-PettingZoo[classic, butterfly]==1.23.1
+PettingZoo[classic,butterfly]>=1.24.0
 Pillow>=9.4.0
-ray[rllib]>2.6.2
-SuperSuit==3.8.0
+# note: currently requires nightly release, see https://docs.ray.io/en/latest/ray-overview/installation.html#daily-releases-nightlies
+ray[rllib]>2.6.3
+SuperSuit>=3.9.0
 torch>=1.13.1
 tensorflow-probability>=0.19.0
2 changes: 1 addition & 1 deletion tutorials/SB3/connect_four/requirements.txt
@@ -1,3 +1,3 @@
-pettingzoo[classic]>=1.23.1
+pettingzoo[classic]>=1.24.0
 stable-baselines3>=2.0.0
 sb3-contrib>=2.0.0
2 changes: 1 addition & 1 deletion tutorials/SB3/kaz/requirements.txt
@@ -1,3 +1,3 @@
-pettingzoo[butterfly]>=1.23.1
+pettingzoo[butterfly]>=1.24.0
 stable-baselines3>=2.0.0
 supersuit>=3.9.0
2 changes: 1 addition & 1 deletion tutorials/SB3/pistonball/requirements.txt
@@ -1,3 +1,3 @@
-pettingzoo[butterfly]>=1.23.1
+pettingzoo[butterfly]>=1.24.0
 stable-baselines3>=2.0.0
 supersuit>=3.9.0
2 changes: 1 addition & 1 deletion tutorials/SB3/test/requirements.txt
@@ -1,4 +1,4 @@
-pettingzoo[classic]>=1.23.1
+pettingzoo[classic]>=1.24.0
 stable-baselines3>=2.0.0
 sb3-contrib>=2.0.0
 pytest
2 changes: 1 addition & 1 deletion tutorials/SB3/waterworld/requirements.txt
@@ -1,4 +1,4 @@
-pettingzoo[sisl]>=1.23.1
+pettingzoo[sisl]>=1.24.0
 stable-baselines3>=2.0.0
 supersuit>=3.9.0
 pymunk
