Come up with strategy for upgrading Is Part Of field values #43
Comments
The metadata converter at https://kgjenkins.github.io/gbl2aardvark/ will now automatically create new "Collections" records, using information from all the existing child records. Some of the fields (subject, keyword, etc.) aggregate all the unique values found in the child records, and the bbox (dcat_bbox, locn_geometry) is automatically expanded to include all the child record bboxes. I've documented the process a bit in the README.
I think this could be a viable approach, although one would certainly want to review the new collection records: the descriptions will need editing to better reflect the whole collection, and you may not want every placename from all the child records to be listed in the collection record. Date values may also require clean-up -- the script keeps every unique value (which works well for single years in
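To make the aggregation described above concrete, here is a minimal Ruby sketch. It is not the converter's actual code: the helper names (`parse_envelope`, `union_envelope`, `aggregate_collection`) are hypothetical, the field list is an assumption, and it assumes bboxes are stored as `ENVELOPE(W,E,N,S)` strings and never cross the antimeridian.

```ruby
# Hypothetical sketch: build a collection record from its child records.
# Assumes Aardvark field names and ENVELOPE(W,E,N,S) bbox strings.
ENVELOPE = /ENVELOPE\(\s*([-\d.]+)\s*,\s*([-\d.]+)\s*,\s*([-\d.]+)\s*,\s*([-\d.]+)\s*\)/

# Returns [west, east, north, south] as floats, or nil if the value is not an envelope.
def parse_envelope(value)
  match = ENVELOPE.match(value.to_s)
  match && match.captures.map(&:to_f)
end

# Expands a single envelope to cover every parseable child envelope.
def union_envelope(values)
  boxes = values.map { |v| parse_envelope(v) }.compact
  return nil if boxes.empty?

  west  = boxes.map { |b| b[0] }.min
  east  = boxes.map { |b| b[1] }.max
  north = boxes.map { |b| b[2] }.max
  south = boxes.map { |b| b[3] }.min
  "ENVELOPE(#{west},#{east},#{north},#{south})"
end

def aggregate_collection(title, children)
  record = { 'dct_title_s' => title, 'gbl_resourceClass_sm' => ['Collections'] }

  # Multi-valued fields keep every unique value found in the children.
  %w[dct_subject_sm dcat_keyword_sm dct_spatial_sm].each do |field|
    values = children.flat_map { |c| Array(c[field]) }.uniq
    record[field] = values unless values.empty?
  end

  bbox = union_envelope(children.map { |c| c['dcat_bbox'] })
  if bbox
    record['dcat_bbox'] = bbox
    record['locn_geometry'] = bbox
  end
  record
end
```

Because `uniq` is case-sensitive, values such as "Census" and "census" would both survive in the collection record, which is one way inconsistencies in the child records become visible.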
The collection records may also reveal spelling or capitalization inconsistencies in the child records. For example:
Of course, it could be nice to retain a "simple" collection field that just contains a string (similar to subject or keyword), but also have the option of the new relations-based
In this case, dct_isPartOf_sm probably maps better to pcdm_memberOf_sm. From the OGM documentation:
Another possible strategy that is supported by OpenGeoMetadata/GeoCombine#143 is to assume that it's possible to get a list of all collection records (in v1 format) before attempting the conversion from v1 to Aardvark. In Earthworks, we apparently use a […]. Once you have a list of collection records and their IDs, you can pass a map of collection names to IDs to the migrator:

```ruby
# Map each v1 collection name (the old Is Part Of string) to its collection record ID
id_map = {
  'My Collection 1' => 'institution:my-collection-1',
  'My Collection 2' => 'institution:my-collection-2'
}

# `record` is a single v1 metadata hash
GeoCombine::Migrators::V1AardvarkMigrator.new(v1_hash: record, collection_id_map: id_map).run
```

This way, you can convert all records (including collections) at the same time.
An interesting and debatably useful side-effect of this is that it collapses collections with the same name into a single collection. While testing out this strategy, I discovered that several collections in Earthworks are duplicated, probably accidentally. The "2010 China province population census data with GIS maps" collection has this version, with only one member, and this version with several members. While it's possible to have collections with the same name, it doesn't seem desirable from a user standpoint, so using this strategy is an easy way to consolidate duplicate collections at the same time you convert to Aardvark.
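A rough way to check for such duplicates before converting might look like the sketch below. The helper names are hypothetical, and it assumes v1 collection records carry their title in `dc_title_s` and the identifier you want to keep in `layer_slug_s`.

```ruby
# Hypothetical sketch: report collection titles shared by more than one v1 record.
# Returns { title => [slug, slug, ...] } for duplicated titles only.
def duplicate_collection_titles(collection_records)
  collection_records
    .group_by { |r| r['dc_title_s'] }
    .select { |_title, group| group.size > 1 }
    .transform_values { |group| group.map { |r| r['layer_slug_s'] } }
end

# A map keyed by name can hold only one ID per title, so same-named
# collections collapse into one; here the last record seen wins.
def name_keyed_id_map(collection_records)
  collection_records.to_h { |r| [r['dc_title_s'], r['layer_slug_s']] }
end
```

Reviewing the output of `duplicate_collection_titles` first lets you decide which ID should survive, rather than leaving it to record order.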
Do some research on a new field for this that would be a plain text value.
One of the main incompatibilities between Metadata 1.0 and Aardvark is the Is Part Of field. In Metadata 1.0, this was a plain string value. In Aardvark, it is an ID that the GeoBlacklight application reads to link records together.
To upgrade, users would need to create a new collection record for each unique value and replace the strings with the new records' IDs.
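As a rough illustration of that upgrade path, the Ruby sketch below derives one ID per unique Is Part Of string, builds a stub collection record for each, and rewrites the child records. All helper names, the `institution` prefix, and the slug rule are assumptions for illustration, not part of any existing tool.

```ruby
# Hypothetical helper: turn a collection name into a URL-safe slug.
def slugify(name)
  name.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-|-\z/, '')
end

# One ID per unique Is Part Of string found across all records.
def build_collection_id_map(records, prefix: 'institution')
  names = records.flat_map { |r| Array(r['dct_isPartOf_sm']) }.uniq
  names.to_h { |name| [name, "#{prefix}:#{slugify(name)}"] }
end

# Minimal stub records for the new collections; descriptions and other
# fields would still need to be written by hand.
def collection_stubs(id_map)
  id_map.map do |name, id|
    { 'id' => id, 'dct_title_s' => name, 'gbl_resourceClass_sm' => ['Collections'] }
  end
end

# Replace the name strings in one record with the matching collection IDs.
# `field` is a parameter because the target Aardvark field is still under discussion.
def link_to_collections(record, id_map, field: 'dct_isPartOf_sm')
  names = Array(record['dct_isPartOf_sm'])
  return record if names.empty?

  ids = names.map { |name| id_map.fetch(name) }
  record.reject { |key, _| key == 'dct_isPartOf_sm' }.merge(field => ids)
end
```

Note that two different names could slugify to the same ID (for example, names differing only in punctuation), so the generated map is worth reviewing before use.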
Pros:
Cons: