Support Unquoted Values #723
Comments
Yes, this would require a new option like One thing to note is that the definition of where an unquoted value ends will probably require "interesting" heuristics -- given that such values are not legal JSON (and shame on org.json / Gson accepting them by default...), there is no definition of how such values should be decoded; choice of end markers is rather large. |
Hello @cowtowncoder, Kind regards, |
@micw No, I don't think I will have time to work on this anytime soon. The person who has worked on similar things lately is @pjfanning (not saying he'd have time or interest, but he is probably quite familiar with the code by now). |
Honestly, I simply don't understand the need to support cases that are disallowed by the JSON spec - this change would be low priority for me. |
@pjfanning Fair enough. I think the majority of these requests are for various JSON-like variants such as "JSON5" (see #734), where they may make slightly more sense. But I agree that it's a bit of a slippery slope. |
unquoted keys are one thing, unquoted values seem to border on the ridiculous - You would need to restrict all values to not have the "," character, or set the end of key-value pair value to anything.... which many, MANY, will - |
I'm not sure this is allowed by JSON5 - see https://json5.org/ - Strings can be single or double quoted but need some sort of quotes. As you said above, without quotes it is hard to spot the end of the string. In the example from the OP, you would need to trim the string, so you would probably need 2 deserialization features: one to support having no quotes and the other to support trimming the results. |
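To make the end-of-value problem concrete, here is a minimal sketch of the two steps mentioned above (scan to an end marker, then trim). The class and method names are hypothetical and nothing here is Jackson code; the choice of `,`, `}` and `]` as end markers is just one assumption among many possible ones.

```java
/** Hypothetical helper, not part of Jackson: finds where an unquoted value ends. */
final class UnquotedValueScanner {
    // Assumed end markers: the structural characters that may follow a value.
    private static final String END_MARKERS = ",}]";

    /**
     * Returns the text from {@code start} up to (not including) the next end
     * marker or the end of input, with surrounding whitespace trimmed.
     */
    static String readUnquotedValue(String input, int start) {
        int end = start;
        while (end < input.length() && END_MARKERS.indexOf(input.charAt(end)) < 0) {
            end++;
        }
        // Step 1 (no quotes) gives the raw span; step 2 (trimming) cleans it up.
        return input.substring(start, end).trim();
    }
}
```

With the input `{"foo": bar , "baz": qux}` and `start` pointing just past the first colon, this returns `bar`. It also shows the limitation raised earlier: a value that itself contains a comma is cut off at that comma.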
My use case is to process a lot of JSON from sources not under my control. In particular, I read statistics from "freifunk" nodes (free, community-driven wifi mesh nodes) to combine them into a map and load them into Grafana. It used to happen that the script-generated JSON had some missing quotes, especially when a script that is expected to return a number returns an error message instead ;-) I agree that this is a rare use case and the better option would be to fix the script. But on the other hand, there is an implementation with a different JSON parser (need to figure out which one) which does not fail on this issue ;-) |
@GedMarc Yeah. YAML is pretty problematic when "no" means @pjfanning Ok. I wasn't sure if this was specifically related to JSON5; good to know it is not. |
@micw Thank you for sharing more details of your use case. Yes, such "JSON" should be fixed if and when it is a flaw in generation and not following some JSON-like alternative format. But just to be clear: any JSON parser that accepts such values is technically non-compliant. Jackson strictly enforces the JSON specification unless instructed to allow certain deviations. So even if GSON did allow this (does it really? :-o ), or
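Yes, GSON and org.json really do this. Here are a couple of tests. (BTW, I personally don't really care; I think it's silly that they actually do this. I only opened the issue originally to kind of document the discrepancy.)

// Imports assume JUnit 5, org.json and Gson on the classpath.
import static org.junit.jupiter.api.Assertions.assertEquals;
import com.google.gson.Gson;
import com.google.gson.JsonObject;
import org.json.JSONObject;
import org.junit.jupiter.api.Test;

@Test
void orgJsonUnquotedValues() {
    // org.json reads the unquoted token bar as the string "bar"
    JSONObject o = new JSONObject("{\"foo\": bar, \"baz\": qux}");
    assertEquals("bar", o.getString("foo"));
}

@Test
void gsonUnquotedValues() {
    // Gson also reads the unquoted token as a string
    Gson gson = new Gson();
    JsonObject e = gson.fromJson("{\"foo\": bar, \"baz\": qux}", JsonObject.class);
    assertEquals("bar", e.get("foo").getAsString());
}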
|
Ok thank you for confirming @ryber. Interesting. Filing an issue (be it for documentation or having option for strict handling) makes sense. |
Interestingly, org.json will let you have spaces in the value, but GSON will not. |
Yes. Once you leave the spec path, results might be surprising ;-) |
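As an illustration of how such surprises arise, the sketch below models the two whitespace behaviors reported above as two policies. It is not the actual logic of org.json or GSON; the names are hypothetical, and modelling the "no spaces" case as a rejection (rather than, say, truncation) is an assumption.

```java
/** Hypothetical sketch of two whitespace policies for unquoted values. */
final class WhitespacePolicyDemo {
    enum Policy {
        SPACES_ALLOWED,   // embedded spaces stay part of the value
        SPACES_END_VALUE  // whitespace ends the value; leftover text is an error
    }

    /** Reads an unquoted value from {@code raw}, the text between ':' and the next ',' or '}'. */
    static String read(String raw, Policy policy) {
        String trimmed = raw.trim();
        if (policy == Policy.SPACES_END_VALUE) {
            for (int i = 0; i < trimmed.length(); i++) {
                if (Character.isWhitespace(trimmed.charAt(i))) {
                    // Something follows the first token, so this policy rejects the input.
                    throw new IllegalArgumentException(
                            "Unexpected text after value: '" + trimmed.substring(i).trim() + "'");
                }
            }
        }
        return trimmed;
    }
}
```

Under `SPACES_ALLOWED`, the raw text ` hello world ` yields `hello world`; under `SPACES_END_VALUE`, the same input throws. Two parsers that both "accept unquoted values" can therefore still disagree on the same document.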
Hi, guys. This is an essential feature for us!
{
  type: namespace.Identifier,
  subtype: Enum/Type,
  value: "...",
}
Can I somehow extend the existing Jackson (something like a hook)? Or can it only be implemented in the core? |
Oh, I see, it is hardcoded here: jackson-core/src/main/java/com/fasterxml/jackson/core/json/ReaderBasedJsonParser.java, line 929 in 8093f43.
|
Yes, there are no hooks, as this really is a low-level tokenization aspect that, for performance reasons, is not modular. Having said that, there are a few other opt-in settings, so if someone were to tackle this, it would be considered for inclusion. |
A pluggable tokenizer with implementation for "strict" and "lax" could do it and should not affect performance. |
It might be faster to use https://github.com/marhali/json5-java and if you need to integrate with other Jackson modules, you could write a class that implements the Jackson JsonParser API and that delegates to the json5-java parser. This could be done in a 'dataformat' module - a bit like how https://github.com/FasterXML/jackson-dataformat-xml delegates the parser/generator work to woodstox. |
I don't think so: adding true pluggability itself would almost certainly add measurable overhead if done within jackson-core. Implementing alternate handling with switches (as is done for other aspects) is a different story and does not need to add significant overhead. As per @pjfanning's suggestion, yes, implementing an alternate format backend would make sense for anyone wanting to tackle this: this is the kind of pluggability Jackson already supports quite well. |
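A rough sketch of what "handling with switches" can look like: features are stored as bits, and the check happens only once the input has already left the normal path. Everything here is hypothetical and heavily simplified (string values only, no escape handling); `ALLOW_UNQUOTED_VALUES` is the name requested in this thread, not an existing Jackson feature, and this is not how jackson-core is actually structured.

```java
/** Hypothetical mini reader; names are illustrative, not Jackson's API. */
final class LaxValueReader {
    /** Opt-in switches stored as bits, so testing one is a single mask check. */
    enum Feature {
        ALLOW_UNQUOTED_VALUES; // assumed name, taken from the request in this thread

        int mask() {
            return 1 << ordinal();
        }
    }

    private final int features;

    LaxValueReader(Feature... enabled) {
        int bits = 0;
        for (Feature f : enabled) {
            bits |= f.mask();
        }
        this.features = bits;
    }

    /** Reads a single string value; escape sequences are not handled in this sketch. */
    String readStringValue(String text) {
        String s = text.trim();
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            // Normal path: a quoted string never consults the feature bits.
            return s.substring(1, s.length() - 1);
        }
        // Unusual path: only unexpected input pays for the feature check.
        if ((features & Feature.ALLOW_UNQUOTED_VALUES.mask()) != 0) {
            return s;
        }
        throw new IllegalArgumentException("Unrecognized token '" + s + "'");
    }
}
```

A reader created with no features rejects `bar`, while `new LaxValueReader(LaxValueReader.Feature.ALLOW_UNQUOTED_VALUES)` returns it as the string `bar`. Quoted input takes the same path in both cases, which is why a switch of this kind need not slow down compliant documents.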
Relevant for predibase/lorax#392, I would be interested in a Jackson option that allowed parsing this invalid JSON:
as if it were this valid JSON:
by treating the first character after My first preference is to fix the code creating this invalid JSON to create valid JSON instead of using a Jackson option, but if a Jackson option did exist I would be using it in the interim until the JSON generation is fixed. Sharing as a data point. |
In both Gson and org.json an unquoted value will be interpreted as a string. So given this array (note that apple is not quoted):
The resulting ArrayNode equivalent in those frameworks (JsonArray and JSONArray respectively) will interpret it as
I have an OSS utility (Unirest-Java) that allows users to bring their own favorite JSON parsing engine. So I have an abstraction for them, and I am trying to make them operate with the same rules. This one I can't deal with unless Jackson were to handle it. I noticed you have a JsonReadFeature for ALLOW_UNQUOTED_FIELD_NAMES; it would be super cool if there was an ALLOW_UNQUOTED_VALUES.
Thanks for all the hard work. Jackson is a fantastic library and a real backbone of the Java world; you should be really proud of it!