[Feature] Added OpenEnv environments#3470
[Feature] Added OpenEnv environments#3470ParamThakkar123 wants to merge 28 commits intopytorch:mainfrom
Conversation
This reverts commit 1f6f327.
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3470
Note: Links to docs will display an error until the docs builds have been completed. ❌ 15 New Failures, 3 Cancelled Jobs, 4 Unrelated FailuresAs of commit f3eb268 with merge base 05212b8 ( NEW FAILURES - The following jobs have failed:
CANCELLED JOBS - The following jobs were cancelled. Please retry:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
BROKEN TRUNK - The following jobs failed but was present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
vmoens
left a comment
There was a problem hiding this comment.
Awesome that you're working on this!
Would it make sense to use chatenv as a base class for consistency? No idea if that makes sense
|
@vmoens I updated my code to use ChatEnv as a base class for OpenEnv |
torchrl/envs/libs/openenv.py
Outdated
| EnvBase.__init__( | ||
| self, | ||
| device=device, | ||
| batch_size=batch_size, | ||
| allow_done_after_reset=allow_done_after_reset, | ||
| spec_locked=spec_locked, | ||
| ) |
There was a problem hiding this comment.
This is basically saying that we have nominally have a ChatEnv but we're not using it.
Either (a) genuinely subclass ChatEnv by calling super().init() and only overriding the specific env-interaction methods (_reset_history, _step_history), or (b) inherit from EnvBase directly and be honest that this is a standalone env that happens to produce ChatHistory-shaped data. Option (a) is clearly preferable for consistency.
If it's not possible, please motivate why and let's move away from ChatEnv entirely.
torchrl/envs/libs/openenv.py
Outdated
| if len(self.batch_size): | ||
| raise ValueError( | ||
| "OpenEnvWrapper does not support batched environments. " | ||
| "Use ParallelEnv to create multiple OpenEnv instances." | ||
| ) |
There was a problem hiding this comment.
ChatEnv enforces batch_size >= (1,) (defaults to (1,) and its spec shapes expect at least one batch dim). OpenEnvWrapper defaults to torch.Size(()) and explicitly rejects non-empty batch sizes.
Use batch_size=(1,) to match the ChatEnv contract, or document why OpenEnv needs () and how that interacts with the existing LLM policy stack. (that will likely break down the line!)
There was a problem hiding this comment.
Almost every interaction with the underlying OpenEnv client is wrapped in bare try/except. This is very bad practice. In general we make everything we can in torchrl to avoid try/except entirely because it can quickly deteriorate UX and performance.
.github/unittest/linux_libs/scripts_openenv/run-clang-format.py
Outdated
Show resolved
Hide resolved
| from tensordict.utils import _zip_strict | ||
| from torch.utils.data import DataLoader | ||
| from torchrl.data import TensorSpec | ||
| from torchrl.envs import StepCounter, Transform |
There was a problem hiding this comment.
are the import changes necessary? Circular imports I imagine
There was a problem hiding this comment.
yeah. Fixed that
There was a problem hiding this comment.
I am getting this error while importing it using from torchrl.envs import StepCounter, Transform:
ImportError while importing test module 'E:\rl\test\test_libs.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
E:\Miniconda3\Lib\importlib\__init__.py:88: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
test\test_libs.py:15: in <module>
import torchrl.testing.env_helper
torchrl\testing\__init__.py:17: in <module>
from torchrl.testing.env_creators import (
torchrl\testing\env_creators.py:12: in <module>
from torchrl.envs import MultiThreadedEnv, ObservationNorm
torchrl\envs\__init__.py:7: in <module>
from .batched_envs import ParallelEnv, SerialEnv
torchrl\envs\batched_envs.py:3156: in <module>
from torchrl.envs.libs.envpool import ( # noqa: F401, E402
torchrl\envs\libs\__init__.py:23: in <module>
from .openenv import OpenEnvEnv, OpenEnvWrapper
torchrl\envs\libs\openenv.py:14: in <module>
from torchrl.envs.llm.chat import ChatEnv
torchrl\envs\llm\__init__.py:7: in <module>
from .datasets import (
torchrl\envs\llm\datasets\__init__.py:7: in <module>
from .gsm8k import GSM8KEnv, GSM8KPrepareQuestion, make_gsm8k_env
torchrl\envs\llm\datasets\gsm8k.py:17: in <module>
from torchrl.envs import StepCounter, Transform
E ImportError: cannot import name 'StepCounter' from partially initialized module 'torchrl.envs' (most likely due to a circular import) (E:\rl\torchrl\envs\__init__.py)
So I changed the imports
| # This source code is licensed under the MIT license found in the | ||
| # LICENSE file in the root directory of this source tree. | ||
|
|
||
| from .history import add_chat_template, ContentBase, History |
Description
Describe your changes in detail.
This PR adds OpenEnv environments to torchrl
Motivation and Context
Why is this change required? What problem does it solve?
If it fixes an open issue, please link to the issue here.
You can use the syntax
close #15213if this solves the issue #15213Types of changes
What types of changes does your code introduce? Remove all that do not apply:
Checklist
Go over all the following points, and put an
xin all the boxes that apply.If you are unsure about any of these, don't hesitate to ask. We are here to help!