Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds infos to EpisodeData, tests for infos, and adds standards for infos to documentation #132

Merged
merged 29 commits into from
Jan 25, 2024

Conversation

balisujohn
Copy link
Collaborator

@balisujohn balisujohn commented Aug 21, 2023

Description

Additional Information Formatting

(Taken from new documentation section)

When creating a dataset with DataCollectorV0, the additional information from each timestep to be stored in the infos group of the hdf5 file must be provided to Minari as a dict, which can only contain strings as keys and other dictionaries or np.ndarray as values. An info dict must be provided for every observation(including the one from the initial reset), and the shape of each np.ndarray must stay the same across timesteps, and the keys must remain the same in all dicts across timesteps.

Here is an example of what a valid info might look like:

info = {'value_1':np.array([1]), 'value_2': np.asarray([[2.3],[4.5]])}

Given that it is not guaranteed that all Gymnasium environments provide infos at every timestep, we provide the StepDataCallback which can modify the infos from a non-compliant environment so they have the same structure at every timestep. An example of this pattern is available in our test test_data_collector_step_data_callback_info_correction in test_step_data_callback.py.

For np.ndarray typed arrays including in an info, we support the list of data data types supported by h5py. We provide tests to guarantee support for the following numpy data types: np.int8,np.int16,np.int32,np.int64, np.uint8,np.uint16,np.uint32,np.iunt64,np.float16,np.float32,np.float64.

Additional Comments

This PR also adds tests for DataCollectorV0 record_info=True behavior, and ensures that dtypes are preserved when saving to and loading from disk.

Related Issues

Fixes #125, #131

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have run pytest -v and no errors are present.
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I solved any possible warnings that pytest -v has generated that are related to my code to the best of my knowledge.
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@balisujohn balisujohn marked this pull request as draft August 21, 2023 15:33
@balisujohn balisujohn marked this pull request as ready for review August 25, 2023 05:05
@balisujohn
Copy link
Collaborator Author

Infos PR is ready for review :^).

docs/content/dataset_standards.md Outdated Show resolved Hide resolved
docs/content/dataset_standards.md Outdated Show resolved Hide resolved
docs/content/dataset_standards.md Outdated Show resolved Hide resolved
minari/data_collector/data_collector.py Outdated Show resolved Hide resolved
minari/dataset/minari_storage.py Show resolved Hide resolved
tests/data_collector/test_data_collector.py Outdated Show resolved Hide resolved
tests/utils/test_dataset_creation.py Outdated Show resolved Hide resolved
tests/utils/test_dataset_creation.py Outdated Show resolved Hide resolved
tests/common.py Outdated Show resolved Hide resolved
tests/utils/test_dataset_creation.py Outdated Show resolved Hide resolved
@younik younik self-requested a review January 20, 2024 14:52
@younik younik requested review from alexdavey and removed request for rodrigodelazcano January 22, 2024 17:09
Copy link
Collaborator

@alexdavey alexdavey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Omar! It looks great, I just have some minor comments.

Also: At the moment test_truncation_without_reset only checks that the observation of the previous timestep is carried over, not the infos. It might be useful to check that infos is carried over as well?

docs/content/basic_usage.md Show resolved Hide resolved
minari/dataset/episode_data.py Outdated Show resolved Hide resolved
tests/common.py Outdated Show resolved Hide resolved
tests/common.py Outdated Show resolved Hide resolved
tests/utils/test_dataset_creation.py Outdated Show resolved Hide resolved
tests/common.py Show resolved Hide resolved
@younik
Copy link
Member

younik commented Jan 25, 2024

Thank you @alexdavey for reviewing it!
I believe I addressed everything, lmk if there is something else

@alexdavey
Copy link
Collaborator

alexdavey commented Jan 25, 2024

Thank you @alexdavey for reviewing it! I believe I addressed everything, lmk if there is something else

LGTM. Thank you!

@younik younik merged commit 3e2dcc4 into Farama-Foundation:main Jan 25, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing info property in EpisodeData
4 participants