Update json-schema version to include date format #283
Comments
One other way that we could approach this would be to add to the additional checks - we've already got some on the backlog about checking that dates are the right way round, so we could add "check that dates make sense". We could include:
Technically, we could also write a regex that allowed fewer invalid dates, except that it wouldn't know about leap years. That way lies madness, however, so let's not do that. I am very much up for us moving on from draft-04. I don't immediately know, but I suspect that the
Does this mean that moving to draft-06 or draft-07 would mean the id field (just the grant id, or all ids?) would become $id in the JSON schema? As we provide a Title for spreadsheet users, would this only impact JSON users directly? For example, would we be able to leave the Title as Identifier, with it then mapping to $id instead of id?
Coming back to the date validation issue: if a change to a new JSON schema version does introduce a breaking change, it would need a thorough MAJOR upgrade change management process (as per our governance rules). So incorporating additional checks validation sooner would help ensure that we catch actual occurrences of invalid dates early, so that no data needs to be broken in reality when the time comes.
I understand there are potentially two interpretations of what a JSON schema upgrade could mean, based on whether the schema is the Standard because it sets out the rules, or the schema is the way that the rules of the Standard are enforced, in which case a change to the date formatting could be interpreted as bringing the schema in line with the Standard. My instinct is that the first interpretation is correct, because changes to the schema can have the practical impact of breaking data (the definition of a backward-incompatible change), even if no one's data happens to get broken at the time of the change.
We'll be discussing this issue at our upcoming Stewardship committee meeting, so I welcome any further thoughts, including perspectives on how other Standards might approach this question.
My initial take on this is that the According to the list of changes between Draft 04 and Draft 06 (when this change occurred), the keyword
The We don't use this in this way, and only create properties called Therefore if we upgraded to JSON Schema Draft 2020-12, we would not break backwards compatibility on this basis. There are other factors, though. The way We'd need to rename
Technically these are all changes to the schema, which is versioned, so it should trigger some version upgrade. However, I'd argue that, from the perspective of the validation rules and the semantics of the fields defined, simply upgrading the JSON Schema version and making the required changes could be conceptualised as a PATCH level change.
As for the date validation, I think we can approach this in two ways. We can be very strict and say "well, this invalidates data that was valid before, therefore it is a MAJOR upgrade at minimum." Another approach would be to treat the problem of invalid dates as a bug and therefore say that this is being addressed as a PATCH level fix.
We could investigate this by spinning up a branch of the schema (ideally after 1.4 is merged) and experimenting with upgrading the JSON schema version, to see what existing data breaks.
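As a rough illustration of the "see what existing data breaks" experiment, here is a minimal sketch in Python that lists date values which pass the regex quoted in the issue but are not real calendar dates. It uses only the standard library rather than a JSON Schema validator, so it approximates what a stricter `date` check would reject. The `newly_invalid` function, the `DATE_FIELDS` tuple and the sample records are hypothetical and only for illustration.

```python
import re
from datetime import date

# Pattern quoted in the issue for date-only values.
DATE_PATTERN = re.compile(r"^[0-9]{4}-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])$")

# Hypothetical list of date fields to inspect; extend as needed.
DATE_FIELDS = ("awardDate",)


def newly_invalid(grants, fields=DATE_FIELDS):
    """Yield (grant id, field, value) for values that match the current
    regex but do not correspond to a real calendar date."""
    for grant in grants:
        for field in fields:
            value = grant.get(field)
            # Skip missing values and anything that is not a date-only
            # string (for example date-time values).
            if not isinstance(value, str) or not DATE_PATTERN.match(value):
                continue
            try:
                date.fromisoformat(value)
            except ValueError:
                yield grant.get("id"), field, value


# Made-up records, not real published data.
sample = [
    {"id": "example-1", "awardDate": "2020-02-28"},
    {"id": "example-2", "awardDate": "2020-02-31"},
]
print(list(newly_invalid(sample)))
```

Running something like this over published data before any schema change would give a sense of whether stricter validation breaks anything in practice, which could inform the MAJOR versus PATCH discussion.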
We've noticed that it's currently possible for a user to include an invalid date in their file without it being picked up by cove or other checks.
This is because the standard currently allows date fields like `awardDate` to be either a `date-time` value or to match the regular expression `"^[0-9]{4}-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])$"`. The regex does some validation, but invalid dates are still allowed: for example, `2020-02-31` would match the regex even though it's not a valid date.
We currently use JSON schema version `draft-04`. Later versions of the schema allow for a `date` format, which I think would allow us to specify that the value could be either `date-time` or `date`.
This in itself would not strictly be backwards-compatible, as there could be data that validates against the current schema version but does not validate if the `date` format is included. But it could be argued that it would better match the current intention of the standard, as it clearly expects a valid date.
Upgrading json-schema itself would potentially introduce other breaking changes. The `date` format first appears in draft-07. There are guides for upgrading from `draft-04` to `draft-06` and from `draft-06` to `draft-07`; the former mentions backwards-incompatible changes.
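To make the gap concrete, the following small Python sketch compares the regex from the issue with a real calendar check. It uses `datetime.date.fromisoformat` from the standard library as a stand-in for a stricter date check; this is an assumption for illustration, not how cove or any JSON Schema validator necessarily implements the `date` format. The helper name `is_calendar_date` is made up.

```python
import re
from datetime import date

# Pattern quoted in the issue for date-only values.
DATE_PATTERN = re.compile(r"^[0-9]{4}-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])$")


def is_calendar_date(value: str) -> bool:
    """Return True if value matches the pattern and is a real calendar date."""
    if not DATE_PATTERN.match(value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


for value in ["2020-02-29", "2020-02-31", "2019-02-29"]:
    # Columns: value, accepted by the regex, accepted by the calendar check.
    print(value, bool(DATE_PATTERN.match(value)), is_calendar_date(value))
```

All three values match the regex, but only `2020-02-29` is a real date: February never has 31 days, and 2019 was not a leap year. This is the kind of case a regex alone cannot reasonably cover.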