
Support guid for zfs/zpool get and zdb #18029

@dodexahedron


Describe the feature you would like to see added to OpenZFS

It would be handy for the zfs, zpool, and zdb utilities to support getting objects by guid. This would allow automated processes (snapshot managers especially) to cache guids and reference objects directly, without having to worry about whether they have been renamed since they were first retrieved.

objsetid comes semi-close to that and can be used with zdb for certain operations. However, objsetid is not stable over time, so it is unsuitable for durable references to a dataset: the original dataset can be destroyed and a new one created that ends up with the same objsetid at some point in the future.

How will this feature improve OpenZFS?

Currently, the command line utilities offer no stable, name-independent way to refer to a dataset between operations, short of getting the entire tree with names and guids and then post-filtering. On a large pool, this can be extremely time-consuming, and it is a step that would be unnecessary if datasets could be retrieved by guid instead.
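
For illustration, the list-everything-then-filter workaround might look roughly like the sketch below. It assumes `zfs list -H -p -t all -o name,guid` prints one tab-separated `name`/`guid` pair per line; the helper names (`parse_guid_listing`, `list_all_guids`, `names_for_guid`) are made up for this example and are not part of any OpenZFS tooling.

```python
import subprocess
from collections import defaultdict


def parse_guid_listing(text):
    """Build a guid -> [names] index from tab-separated 'name<TAB>guid' lines."""
    index = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        name, guid = line.split("\t")
        index[int(guid)].append(name)
    return dict(index)


def list_all_guids():
    """Walk the whole tree; this is the expensive step on a large pool."""
    out = subprocess.run(
        ["zfs", "list", "-H", "-p", "-t", "all", "-o", "name,guid"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_guid_listing(out)


def names_for_guid(index, guid):
    """Post-filter: return every current name carrying this guid (maybe none)."""
    return index.get(guid, [])
```

A caller that cached a guid earlier would run `names_for_guid(list_all_guids(), cached_guid)` on every operation, which is exactly the full-tree scan this request would like to avoid.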

This feature would significantly reduce the resource consumption and number of operations required for automated processes, scripts, etc. to act safely at different points in time, regardless of the names of the datasets.

Considerations

  • The guid property isn't actually globally unique ("GU") for datasets (though it is for pools, at least on the same system). Duplicate guids can exist on different datasets, even within the same pool.
    • Sending/receiving datasets across pools makes this even more likely.
    • A possible workaround for operations within the scope of a single system: the utilities simply return all matching datasets for the guid, and the user resolves any collisions using the combination of guid and objsetid.
  • Dataset names beginning with numbers are valid
    • A couple of simple solutions could be:
      • Give the utilities a switch (such as -g) to indicate that the dataset names are to be interpreted as guids instead.
      • Give the utilities a separate verb for getting objects by guid, instead of including it in the get verb.
  • libzfs does a lot of things by name to begin with.
    • At least the last time I dove into the source, it looked like the initial operations to retrieve individual datasets do things by name and then pass the pointer around for the rest of the operation.
    • That leads me to assume that guid is in no way used or treated as a key for datasets. If that assumption is correct, certain functions would need to be at least partially duplicated to make this possible in the first place. Still, scanning the tree for an object containing a 64-bit numeric value seems like it should be a lighter-weight operation for the system than a string match anyway, especially as the length of the dataset name increases (i.e., Θ(1) for guid comparison vs. non-constant for name comparison).
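
The collision handling suggested in the first consideration (return every match for a guid, let the caller narrow by objsetid) could be sketched as follows. This is only an illustration of the idea, not how libzfs works: `Dataset`, `parse_listing`, and `resolve` are hypothetical names, and the input is assumed to be tab-separated `name`, `guid`, `objsetid` lines such as `zfs list -H -p -t filesystem,volume,snapshot -o name,guid,objsetid` would be expected to print.

```python
from typing import List, NamedTuple, Optional


class Dataset(NamedTuple):
    name: str
    guid: int
    objsetid: Optional[int]  # None when the listing shows "-" for this column


def parse_listing(text: str) -> List[Dataset]:
    """Parse 'name<TAB>guid<TAB>objsetid' lines into Dataset records."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, guid, objsetid = line.split("\t")
        rows.append(
            Dataset(name, int(guid), None if objsetid == "-" else int(objsetid))
        )
    return rows


def resolve(rows: List[Dataset], guid: int,
            objsetid: Optional[int] = None) -> List[Dataset]:
    """Return all datasets with this guid; narrow by objsetid when supplied."""
    matches = [r for r in rows if r.guid == guid]
    if objsetid is not None:
        matches = [r for r in matches if r.objsetid == objsetid]
    return matches
```

Returning a list rather than a single dataset keeps the lookup honest about duplicate guids: a caller that gets more than one match can retry with the objsetid it cached alongside the guid.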


Labels: Type: Feature (feature request or new feature)
