@jbrown-xentity
Collaborator

This is to evaluate and provide context and testing for GSA/data.gov#5543.

This PR does the following:

  • Provides more detail about documents that fail validation
  • Updates the JSON schema definitions in various places to make them more backward compatible and to clean up bad references, including:
    • [fix]: Force the keyword list to be an array of strings
    • [fix]: Require identifier, and require it to be a string rather than an array or null
    • [backward compatible]: Allow contactPoint to be either an array or a single object
    • [backward compatible]: Force keywords to not be blank, and the keyword array to not be empty
    • [backward compatible]: Allow the modified field to contain a duration value compliant with the ISO time interval definition (e.g., annually)
    • [backward compatible]: Allow spatial to be either an array of locations or a direct reference to a location
    • [backward compatible]: Remove the catalog requirement for description, publisher, and title (as this information isn't really used)
    • [backward compatible]: Remove the prefLabel requirement on organization (the dataset publisher field), as in practice it seems to duplicate name, which is already required
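The shape of several of these changes can be sketched as JSON Schema fragments written as Python dictionaries. This is only an illustration: the fragments below are assumptions, not the actual DCAT-US 3 schema files, and treating a location or contact as a plain object is a simplification. The `check` helper is a hypothetical, stdlib-only evaluator for a small subset of JSON Schema keywords, included so the example runs on its own and reports where each failure occurred.

```python
# Hypothetical fragments; the real schema files in the PR may differ.
OBJECT_OR_ARRAY = {
    "anyOf": [
        {"type": "object"},
        {"type": "array", "items": {"type": "object"}},
    ]
}

DATASET_SCHEMA = {
    "type": "object",
    "required": ["identifier"],
    "properties": {
        # A string only: arrays and null are rejected.
        "identifier": {"type": "string"},
        # A non-empty array of non-blank strings.
        "keyword": {
            "type": "array",
            "minItems": 1,
            "items": {"type": "string", "minLength": 1},
        },
        # Either a single object or an array of objects.
        "contactPoint": OBJECT_OR_ARRAY,
        "spatial": OBJECT_OR_ARRAY,
    },
}

TYPES = {"object": dict, "array": list, "string": str}


def check(value, schema, path="$"):
    """Return 'path: message' strings for a small subset of JSON Schema."""
    if "anyOf" in schema:
        # Valid if at least one alternative produces no errors.
        if any(not check(value, sub, path) for sub in schema["anyOf"]):
            return []
        return [f"{path}: does not match any allowed shape"]
    expected = schema.get("type")
    if expected and not isinstance(value, TYPES[expected]):
        return [f"{path}: expected {expected}, got {type(value).__name__}"]
    errors = []
    if isinstance(value, str) and len(value) < schema.get("minLength", 0):
        errors.append(f"{path}: string is shorter than {schema['minLength']}")
    if isinstance(value, list):
        if len(value) < schema.get("minItems", 0):
            errors.append(f"{path}: array has fewer than {schema['minItems']} items")
        if "items" in schema:
            for i, item in enumerate(value):
                errors += check(item, schema["items"], f"{path}[{i}]")
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}: missing required property '{name}'")
        for name, sub in schema.get("properties", {}).items():
            if name in value:
                errors += check(value[name], sub, f"{path}.{name}")
    return errors


bad = {"identifier": None, "keyword": ["water", ""], "contactPoint": {"fn": "Jane"}}
for message in check(bad, DATASET_SCHEMA):
    print(message)
```

A full JSON Schema validator covers far more keywords than this helper; the point here is only how `anyOf` lets a field accept both the older and the newer shape while each error still names the failing path.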

To evaluate this, run python validate_jsonschema.py from the dcat-us3 folder. Changing fields to invalid values, or removing required fields, should cause the job to fail and report the errors.
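The "remove a required field and watch it fail" check can also be sketched in a few lines. This is not the PR's validate_jsonschema.py: `drop_each`, `missing_required`, the `REQUIRED` list, and the sample document are all hypothetical and exist only to show the idea of mutating a known-good document one field at a time.

```python
import copy

# Assumed list of required fields, for illustration only.
REQUIRED = ["identifier", "title"]


def drop_each(document, required):
    """Yield (field, mutated copy) with one required field removed at a time."""
    for field in required:
        mutated = copy.deepcopy(document)
        mutated.pop(field, None)
        yield field, mutated


def missing_required(document, required):
    """Return the required fields that are absent from the document."""
    return [field for field in required if field not in document]


good = {"identifier": "abc-123", "title": "Sample dataset"}

# The untouched document should pass; every mutated copy should fail.
print("original ->", missing_required(good, REQUIRED))
for field, mutated in drop_each(good, REQUIRED):
    print(f"without {field} ->", missing_required(mutated, REQUIRED))
```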

Will move this into the "good" folder and iterate on messages.
First will move through schema fixes.
Second will apply necessary metadata fixes for compliance.
For backwards compatibility, and supporting currently in-use and ISO compliant information