Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache flag-related objects and refactor extraction of flags #520

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

Descanonge
Copy link
Contributor

The dictionary of flag values/masks is cached in the CFAccessor, moving the code of create_flag_dict into a property of the accessor.

The dataset of booleans masks is cached in the CFDataArrayAccessor. I don't know if this is a welcomed change. I would argue that if it is presented as a property (.flags), I would expect it to be the same instance on different calls, contrary to a method like .flags() or .get_flags(). But I'm not very familiar with the rest of the cf-xarray API. Anyway it can be reverted easily.

I have also refactored the construction of that flag dataset to make it (hopefully) more readable and understable. It makes use of the FlagParams named-tuple. The operations on the data are unchanged.

If both: Mix of independent and mutually exclusive flags.
"""
if self._flag_dict is not None:
return self._flag_dict
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that you can change the underlying values in the DataArray and then the flags will be wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot that if for DataArray.data most operations are not done in place (you get a new DataArray), changing DataArray.attrs is easily done in place. I only see either checking if any of the attributes have changed, or urging to use DataArray.assign_attrs() in the documentation...

@dcherian
Copy link
Contributor

I have also refactored the construction of that flag dataset to make it (hopefully) more readable and understable.

Nice!

Re: API if it is useful to get flag_dict then perhaps we should create a Flags dataclass with as_dict and as_dataset methods?

@Descanonge
Copy link
Contributor Author

Re: API if it is useful to get flag_dict then perhaps we should create a Flags dataclass with as_dict and as_dataset methods?

I am not sure I'm following. Something like datarray.cf.flags.as_dict ? It would regroup the code for flags, which is nice. But as_dataset needs to check that the object is an array and not a Dataset (or maybe select a valid flag variable in the Dataset if there is only one). It could essentially do create_flag_dict on init, so that would be cached, and making the flag dataset available from a method seems reasonable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants