Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Improve validation error messages #8

Closed
wants to merge 2 commits into from

Conversation

Bjwebb
Copy link
Member

@Bjwebb Bjwebb commented Feb 5, 2019

openownership/cove-bods#16

Return specific validation errors for oneOf, by using statementType to pick the correct sub-schema. Only works for BODS.

Because the change is BODS specific, tests are currently only in the lib-cove-bods repo - openownership/lib-cove-bods#5

Here's what this looks like in the web interface openownership/cove-bods#16 (comment)

Bjwebb added a commit to openownership/lib-cove-bods that referenced this pull request Feb 5, 2019
Bjwebb added a commit to openownership/lib-cove-bods that referenced this pull request Feb 5, 2019
@Bjwebb Bjwebb changed the title [WIP] Improve validation error messages for oneOf Improve validation error messages for oneOf Feb 5, 2019
rhiaro
rhiaro previously approved these changes Feb 6, 2019
@Bjwebb Bjwebb changed the title Improve validation error messages for oneOf Improve validation error messages Feb 18, 2019
@Bjwebb Bjwebb force-pushed the cove-bods-16-oneof-validation branch from bf659ff to f17b94f Compare February 19, 2019 07:27
@Bjwebb Bjwebb force-pushed the cove-bods-16-oneof-validation branch from f17b94f to cf824e8 Compare February 19, 2019 07:52
@Bjwebb
Copy link
Member Author

Bjwebb commented Feb 19, 2019

This PR has had the work on themeing and rewording validation errors added to it. See openownership/cove-bods#16 (comment) for more info.

There's a fixture file to display all the errors in the lib-cove-bods repo: https://github.com/openownership/lib-cove-bods/pull/5/files#diff-8a78820a9db4bf7876c2083f107362f9
We probably want to add a test to lib-cove-bods that the output from that is what we expect. We will also need to update the OCDS/360Giving tests in the cove repo.

rhiaro
rhiaro previously approved these changes Feb 20, 2019
@Bjwebb Bjwebb force-pushed the cove-bods-16-oneof-validation branch from fe0ec37 to cf824e8 Compare February 25, 2019 09:50
Bjwebb added a commit to OpenDataServices/cove that referenced this pull request Feb 27, 2019
@Bjwebb
Copy link
Member Author

Bjwebb commented Mar 13, 2019

Currently this PR is held up by considering what affect these changes have on 360Giving and OCDS.

@Bjwebb
Copy link
Member Author

Bjwebb commented Mar 21, 2019

@jpmckinney We have slightly reworded validation error messages, for the Beneficial Ownership Data Standard data review tool. This is mostly to say "should be" rather than "is not". By default these changes will apply to OCDS too, but it's possible for each CoVE instance to have a separate set of messages.

Here's the changes in the code https://github.com/OpenDataServices/lib-cove/pull/8/files#diff-6a1b32728accbec582a1c06a7bdddda7L32

For OCDS this is what the new messages look like:
http://cove-bods-16-reword-validation.dev.cove.opendataservices.coop/review/?source_url=https://raw.githubusercontent.com/OpenDataServices/cove/master/cove_ocds/fixtures/badfile_all_validation_errors.json

Here's the same file in the current CoVE deploy:
http://standard.open-contracting.org/review/?source_url=https://raw.githubusercontent.com/OpenDataServices/cove/master/cove_ocds/fixtures/badfile_all_validation_errors.json

What do you think?

Copy link
Contributor

@jpmckinney jpmckinney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"is not" and "should be" are not interchangeable semantically. For example, it is correct to say "123 is not a string". But "123 should be a string" is, at least, awkward. 123 is a number, and nothing can ever make it a string. What is meant is "the value of field X should be a string (current value is 123)". The value of a field can have different types, but 123 will always have the number type…

So, I prefer "is not" if the subject of the sentence is a value. If the subject is a field, then "should be" works.

Update: Looking at the web UI, I can't tell which of these format strings are substituted with field names and which are substituted with values. The same CSS style is applied to both in the UI, which is confusing. So, some comments may or may not be relevant depending on what's being substituted.

That said, we should ideally have different styles for values versus fields, as this is an important distinction. Perhaps fields can have no background color, and values can have a background color (with both otherwise using the same monotype font)?

'date': 'Date is not in the correct format. The correct format is YYYY-MM-DD.',
'date-time': 'Date is not in the correct format. The correct format is YYYY-MM-DDThh:mm:ssZ.',
'uri': 'Invalid uri found',
'string': '\'{}\' should be a string. Check that the value {} has quotes at the start and end. Escape any quotes in the value with \'\\\'', # noqa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quoting a non-string makes it look like a string, thus making it harder to see what's wrong… The quote characters should be removed.

One alternative: If the value is a string, first wrap it in double quotes. Then, for the web UI, wrap that value in <code> tags before performing the string substitution. For other UIs, perhaps wrap that value in backticks instead of single quotes.

This same comment applies to all other similar messages. Quoting a non-integer makes it look like a string, so a user might think they have a string where an integer should be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, looking at the web UI I see fewer single-quote characters than in these messages.

That said, I notice that string values are not wrapped in double-quotes, which makes them harder to identify as strings – so at least that part of my comment is relevant.

validation_error_template_lookup = {
'date': 'Date is not in the correct format. The correct format is YYYY-MM-DD.',
'date-time': 'Date is not in the correct format. The correct format is YYYY-MM-DDThh:mm:ssZ.',
'uri': 'Invalid uri found',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'URI is not in the correct format.' would be more consistent with the date messages.

Also, some messages (like this one) don't end with a period, while most do. They should all end in period, in case a UI e.g. joins multiple error messages together with spaces.

'string': '\'{}\' should be a string. Check that the value {} has quotes at the start and end. Escape any quotes in the value with \'\\\'', # noqa
'integer': '\'{}\' should be an integer. Check that the value {} doesn’t contain decimal points or any characters other than 0-9. Integer values should not be in quotes. ', # noqa
'number': '\'{}\' should be a number. Check that the value {} doesn’t contain any characters other than 0-9 and dot (\'.\'). Number values should not be in quotes. ', # noqa
'boolean': '\'{}\' should be a JSON boolean, \'true\' or \'false\'.', # noqa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true and false should not be wrapped in single quotes, which implies a string value. See my other comment for an alternative.

'date': 'Date is not in the correct format. The correct format is YYYY-MM-DD.',
'date-time': 'Date is not in the correct format. The correct format is YYYY-MM-DDT00:00:00Z.',
'uri': 'Invalid uri found',
'string': '<code>{}</code> should be a string. Check that the value {} has quotes at the start and end. Escape any quotes in the value with <code>\</code>', # noqa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second instance of {} should also have <code> tags, to avoid ambiguity or confusion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second {} is for a clause about nulls:

null_clause = ''
if isinstance(e.validator_value, list):
validator_type = e.validator_value[0]
if 'null' not in e.validator_value:
null_clause = 'is not null, and'
else:
null_clause = 'is not null, and'
message_template = validation_error_template_lookup.get(validator_type, message)
message_safe_template = validation_error_template_lookup_safe.get(validator_type)
if message_template:
message = message_template.format(header, null_clause)
if message_safe_template:
message_safe = format_html(message_safe_template, header, null_clause)
but, this could be clearer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks – I didn't read the formatting code, just the message templates. A short comment above the message templates about what values are substituted would improve clarity.

@jpmckinney
Copy link
Contributor

Also, please review the UI for ungrammatical sentences and awkward letter casing. This sentence is an example of both:

The schema version specified in the file is Should be integer.integer

@Bjwebb
Copy link
Member Author

Bjwebb commented Mar 22, 2019

The schema version specified in the file is Should be integer.integer

This is a sample file with bad data in it. I've tried to use the value of the bad data to describe why it is bad. The version in this bad file is literally "Should be integer.integer".

Bjwebb added a commit to openownership/lib-cove-bods that referenced this pull request Mar 22, 2019
This makes the tests suitable for this PR:
OpenDataServices/lib-cove#12
(instead of the larger
OpenDataServices/lib-cove#8)
@Bjwebb Bjwebb changed the title Improve validation error messages [WIP] Improve validation error messages Mar 22, 2019
@jpmckinney
Copy link
Contributor

This is a sample file with bad data in it. I've tried to use the value of the bad data to describe why it is bad. The version in this bad file is literally "Should be integer.integer".

Thanks – but you see how such a file is very confusing in the context of changing wording to "should be" ;)

@jpmckinney
Copy link
Contributor

jpmckinney commented Mar 22, 2019

Before:

"" is too short. Strings must be at least one character. This error typically indicates a missing value.

After:

"id" is too short. Strings must be at least one character. This error typically indicates a missing value.

There seems to be either incorrect string substitution or an incorrect message template here. The first message makes sense in that a string value (indicated by double quotes) is empty. If the second message intends to make the subject a field, then the double quotes should be omitted, and ideally fields should be styled differently from values as I mentioned above.

Same difference across before and after here, with before:

[] is too short. You must supply at least one value, or remove the item entirely (unless it’s required).

After:

tag is too short. You must supply at least one value, or remove the item entirely (unless it’s required).

…ges""

This reverts commit 9f654d0,
effectively reapplying commit cf824e8.
@Bjwebb Bjwebb force-pushed the cove-bods-16-oneof-validation branch from 2c35429 to 0a51727 Compare March 25, 2019 05:39
@Bjwebb Bjwebb force-pushed the cove-bods-16-oneof-validation branch from 0a51727 to 4a32d4a Compare March 25, 2019 05:41
Bjwebb added a commit to openownership/lib-cove-bods that referenced this pull request Mar 28, 2019
@odscjames odscjames changed the base branch from master to main November 11, 2020 13:43
@BibianaC
Copy link
Member

BibianaC commented Feb 4, 2021

@Bjwebb is this PR still relevant?

@Bjwebb
Copy link
Member Author

Bjwebb commented Feb 11, 2021

I don't think we have any plans to take this forward. The new messages exist in lib-cove-bods if we want them.

@Bjwebb Bjwebb closed this Feb 11, 2021
@Bjwebb Bjwebb deleted the cove-bods-16-oneof-validation branch February 11, 2021 13:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants