Cogment Lab is a toolkit for doing HILL RL, that is, human-in-the-loop learning with an emphasis on reinforcement learning. It is based on Cogment, a low-level framework for exchanging messages between environments, AI agents, and humans. It's the perfect tool for when you want to interact with your environment yourself, and maybe even with trained AI agents.
- Activate your venv, conda env, or whatever you use to keep your Python environment clean.
- Install cogment_lab with `pip install cogment_lab`
- Install cogment in the `COGMENT_LAB_HOME` folder with `cogmentlab install` (this environment variable defaults to `~/.cogment_lab`)
- In a separate terminal, run `cogmentlab launch base` to start the orchestrator and datastore. Keep it open.
- Run the tutorials, examples, or whatever you want to do.
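As a rough illustration of how the install location resolves, the sketch below reads `COGMENT_LAB_HOME` and falls back to `~/.cogment_lab` when it is unset. The helper name `resolve_cogment_lab_home` is hypothetical and not part of the cogment_lab package; the tool's actual lookup may differ in detail.

```python
import os
from pathlib import Path


def resolve_cogment_lab_home(environ=None) -> Path:
    """Return the folder named by COGMENT_LAB_HOME, or ~/.cogment_lab if unset.

    Hypothetical helper for illustration only; not part of cogment_lab.
    """
    env = os.environ if environ is None else environ
    raw = env.get("COGMENT_LAB_HOME") or "~/.cogment_lab"
    return Path(raw).expanduser()


print(resolve_cogment_lab_home())
```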
While it typically isn't necessary to interact with Cogment directly to use Cogment Lab, it is useful to understand the principles on which it operates.
Cogment exchanges messages between environments and actors. These messages contain the observations, actions, rewards, and anything else that you might want to keep track of.
Interactions are split into Trials, which correspond to the typical notion of an episode in RL. Each trial has a unique ID.
Cogment Lab (as well as Cogment in general) follows a microservice-based architecture. Each environment, agent, and human interface (collectively: service) is launched as a subprocess, and exchanges messages with the orchestrator, which in turn ensures synchronization and correct routing of messages.
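To make the message flow concrete, here is a toy, single-process sketch of the idea: something playing the orchestrator's role routes observations from an environment to an actor and actions back, and each trial carries a unique ID. Every name in it (`Trial`, `CountdownEnv`, `decrement_actor`, `run_trial`) is invented for illustration. It does not use or mirror Cogment's actual API, where services run as separate processes.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Trial:
    """One episode: a unique ID plus the messages exchanged during it."""
    trial_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    log: list = field(default_factory=list)


class CountdownEnv:
    """Toy environment: the observation is a counter that actions decrement."""

    def __init__(self, start: int = 3):
        self.counter = start

    def reset(self) -> int:
        return self.counter

    def step(self, action: int):
        self.counter -= action
        reward = 1.0 if self.counter == 0 else 0.0
        done = self.counter <= 0
        return self.counter, reward, done


def decrement_actor(observation: int) -> int:
    """Toy actor: always answers with the action 1."""
    return 1


def run_trial(env, actor) -> Trial:
    """Play the orchestrator's role: route messages until the trial ends."""
    trial = Trial()
    observation, done = env.reset(), False
    while not done:
        action = actor(observation)  # observation goes to the actor
        next_observation, reward, done = env.step(action)  # action goes to the environment
        trial.log.append(
            {"observation": observation, "action": action, "reward": reward}
        )
        observation = next_observation
    return trial


trial = run_trial(CountdownEnv(start=3), decrement_actor)
print(trial.trial_id, len(trial.log))
```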
Generally speaking, you don't need to worry about any of that - Cogment Lab conveniently covers up all the rough edges, allowing you to do your research without worries.
Cogment Lab is inherently asynchronous - but if you're not familiar with async python, don't worry about it. The only things you need to remember are:
- Wrap your code in `async def main()`
- Run it with `asyncio.run(main())`
- When calling certain functions use the `await` keyword, e.g. `data = await cog.get_episode_data(...)`
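Put together, the pattern looks roughly like the sketch below. `fake_get_episode_data` is a hypothetical stand-in coroutine so that the example runs on its own; in real code you would await a Cogment Lab call such as `cog.get_episode_data(...)` instead.

```python
import asyncio


async def fake_get_episode_data(trial_id: str) -> dict:
    """Hypothetical stand-in for an awaitable call like cog.get_episode_data(...)."""
    await asyncio.sleep(0.01)  # pretend to wait on a remote service
    return {"trial_id": trial_id, "rewards": [0.0, 1.0]}


async def main() -> dict:
    # Your code lives inside an async function, and awaitable calls
    # are prefixed with the `await` keyword.
    data = await fake_get_episode_data("demo-trial")
    return data


if __name__ == "__main__":
    # asyncio.run starts the event loop and runs main() to completion.
    print(asyncio.run(main()))
```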
If you are familiar with async programming, there are a lot of interesting things you can do with it - go crazy.
- A `service` is anything that interacts with the Cogment orchestrator. It can be an environment or an actor, including human actors.
- An `actor` in particular is the service that interacts with an environment, and often wraps an `agent`. The internal structure of an actor is entirely up to the user.
- An `agent` is what we typically think of as an agent in RL - something that perceives its environment and acts upon it. We do not attempt to solve the agent foundation problem in this documentation.
- An `agent` is simultaneously the part of the environment that's taking an action - multiagent environments may have several agents, so we need to assign an actor to each agent.
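The last point can be pictured as a mapping from the agents an environment exposes to the actors that control them. The sketch below is plain Python with made-up names (`assign_actors`, `player_0`, `trained_policy`, `web_ui`); it is not how Cogment Lab expresses this configuration, only an illustration of the one-actor-per-agent rule.

```python
def assign_actors(env_agents, actor_for_agent):
    """Check that every environment agent has an actor assigned to it.

    Both arguments are hypothetical illustrations, not Cogment Lab structures.
    """
    missing = [agent for agent in env_agents if agent not in actor_for_agent]
    unknown = [agent for agent in actor_for_agent if agent not in env_agents]
    if missing or unknown:
        raise ValueError(f"unassigned agents: {missing}, unknown agents: {unknown}")
    return {agent: actor_for_agent[agent] for agent in env_agents}


# A two-agent environment: one seat played by a trained policy, one by a human.
assignment = assign_actors(
    env_agents=["player_0", "player_1"],
    actor_for_agent={"player_0": "trained_policy", "player_1": "web_ui"},
)
print(assignment)
```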
- When running the web UI, you can open the tab only once per launched process. So if you open the UI, you can run however many trials you want, as long as you don't close it. If you do close it, you should kill the process and start a new one.
- Requires Python 3.10
- Install requirements in a virtual env with something similar to the following:

      $ python -m venv .venv
      $ source .venv/bin/activate
      $ pip install -r requirements.txt
      $ pip install -e .

- For the examples you'll need to install the additional `examples_requirements.txt`.
To run on M1/M2/M3 Macs, you'll need to perform these additional steps:
pip uninstall grpcio grpcio-tools
export GRPC_PYTHON_LDFLAGS=" -framework CoreFoundation"
pip install grpcio==1.60.0 grpcio-tools==1.60.0 --no-binary :all:
Adjust the version (here 1.60.0) to whatever you have installed.
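If you are unsure which version is installed, one way to look it up is Python's standard `importlib.metadata`. This is only a convenience sketch: `installed_version` and `grpc_reinstall_command` are hypothetical helper names, and the sketch assumes `grpcio` and `grpcio-tools` share the same version. Run it before the `pip uninstall` step, since it finds nothing once the package is gone.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def grpc_reinstall_command(grpc_version: str) -> str:
    """Build the from-source install command for one grpcio version."""
    return (
        f"pip install grpcio=={grpc_version} grpcio-tools=={grpc_version} "
        "--no-binary :all:"
    )


found = installed_version("grpcio")
if found is None:
    print("grpcio is not installed in this environment")
else:
    print(grpc_reinstall_command(found))
```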
Run `cogmentlab launch base`.
Then, run whatever scripts or notebooks you want.
Terminology:
- Model: a relatively raw PyTorch (or other?) model, inheriting from `nn.Module`
- Agent: a model wrapped in some utility class to interact with np arrays
- Actor: a cogment service that may involve models and/or agents
People with maintainer rights on the repository can follow these steps to release a version MAJOR.MINOR.PATCH. The versioning scheme follows Semantic Versioning.
- Run `./scripts/create_release_branch.sh MAJOR.MINOR.PATCH`, this will automatically:
  - update the version of the package, in `cogment_lab/version.py`,
  - create a release branch with the changes at `release/vMAJOR.MINOR.PATCH` and push it.
- On the release branch:
  - Make sure the changelog, at `CHANGELOG.md`, reflects the changes since the last release,
  - Fix any issue, making sure that the build passes on CI,
  - Commit and push any changes.
- Run `./scripts/tag_release.sh MAJOR.MINOR.PATCH`, this will automatically:
  - create the specific version section in the changelog and push it to the release branch,
  - merge the release branch in `main`,
  - create the release tag and,
  - update the `develop` branch to match the latest release.
- The CI will automatically publish the package to PyPI.
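For reference, the version argument and the branch name derived from it can be sketched as follows. The regular expression covers only the plain MAJOR.MINOR.PATCH form used here (Semantic Versioning also allows pre-release and build suffixes), and `parse_version` and `release_branch_name` are hypothetical helpers, not something the release scripts are known to expose.

```python
import re

# Plain MAJOR.MINOR.PATCH: non-negative integers without leading zeros.
VERSION_RE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse_version(text: str):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    match = VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return tuple(int(part) for part in match.groups())


def release_branch_name(text: str) -> str:
    """Return the release branch name, following release/vMAJOR.MINOR.PATCH."""
    major, minor, patch = parse_version(text)
    return f"release/v{major}.{minor}.{patch}"


print(release_branch_name("1.2.3"))
```

Comparing the parsed tuples, rather than the raw strings, orders versions correctly: `(0, 10, 0)` sorts after `(0, 9, 0)`, whereas the string `"0.10.0"` sorts before `"0.9.0"`.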