support user-defined mapping for Inf and NaN via keyword arg #294
base: main
@quinnj, we're aware that you're super busy and don't want to add any pressure. Just let us know if there's anything we can do to help you expedite this PR (it's really small) and get it out of your work queue. Thanks!
If you review this, please consider my question from #292 (comment)
@quinnj We're about to release a new version of the GenieFramework, and a solution to the Inf issue would be super welcome.
The API seems reasonable. Performance of writing:
julia> @b rand([Inf, NaN, -Inf], 100) JSON3.write(_, allow_inf=true)
244.709 ns (3 allocs: 2.047 KiB) # release
496.429 ns (3 allocs: 2.047 KiB) # pr
julia> @b rand(100) JSON3.write(_, allow_inf=true)
2.488 μs (7 allocs: 6.594 KiB) # release
2.486 μs (7 allocs: 6.594 KiB) # pr
In my tests I measure 388 ns (#main) vs. 501 ns (#hh-infinity2), which I would consider acceptable.
…_inf_mapping` and `quoted_inf_mapping` plus docstrings and tests
Fixed the performance issue for default mappings and added
julia> JSON3.write([Inf, -Inf, NaN], inf_mapping = JSON3.quoted_inf_mapping) |> println
["Infinity","-Infinity","NaN"]
julia> JSON3.write([Inf, -Inf, NaN], inf_mapping = JSON3.underscore_inf_mapping) |> println
["__inf__","__neginf__","__nan__"]
EDIT: updated code example
We could also provide a hacky_inf_mapping(x) = x == Inf ? "1e1000" : x == -Inf ? "-1e1000" : "\"__nan__\"" which would generate Infinity and -Infinity correctly for typical browser implementations of deserialisation.
Adding
Definitely in the docs; and at that point it doesn't really matter much which.
All done
@@ -46,6 +46,11 @@ end
@test JSON3.read("Inf"; allow_inf=true) === Inf
@test JSON3.read("Infinity"; allow_inf=true) === Inf
@test JSON3.read("-Infinity"; allow_inf=true) === -Inf
Noting that it is not possible to support reading with a custom mapping defined with a function that maps Float64 to String.
However, it is possible to support reading and writing with a @NamedTuple{positive_inf::AbstractString, negative_inf::AbstractString, nan::AbstractString} API.
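To make the NamedTuple idea concrete, here is a minimal sketch in plain Julia of how a single mapping value could drive both directions. The field names follow the comment above; `underscore`, `encode_special` and `decode_special` are hypothetical names and not part of JSON3, and the tokens are assumed to contain no characters that need JSON escaping.

```julia
# Hypothetical mapping value; the field names follow the proposed NamedTuple API.
underscore = (positive_inf = "__inf__", negative_inf = "__neginf__", nan = "__nan__")

# Writing: pick the token for a non-finite float and emit it as a JSON string.
# Assumes the tokens contain no characters that require JSON escaping.
function encode_special(x::AbstractFloat, m::NamedTuple)
    isfinite(x) && throw(ArgumentError("expected Inf, -Inf or NaN"))
    token = isnan(x) ? m.nan : x > 0 ? m.positive_inf : m.negative_inf
    return string('"', token, '"')
end

# Reading: map the contents of a parsed JSON string back to a float,
# or return nothing when the string is not one of the tokens.
function decode_special(s::AbstractString, m::NamedTuple)
    s == m.positive_inf && return Inf
    s == m.negative_inf && return -Inf
    s == m.nan && return NaN
    return nothing
end
```

Because the tokens are stored as data rather than computed by a function, the reader can compare against them directly, which is what a Float64-to-String function alone does not offer without being called on each special value.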
Yeah, I agree that not being able to read the JSON back in when a custom mapping is used is a bummer, but also it's something that is/will be better solved in JSONBase.jl, where it's easier to override reading things.
Actually, we do have the RawJson construct if you really needed to parse something back in; it's a bit of a heavy-handed escape hatch here, but technically would work.
@quinnj, I think this PR is ready for final review.
The biggest drawbacks are
- Unable to parse back the emitted JSON
- Sub-optimal performance when emitting many NaNs or Infs using a custom mapping (does not affect allow_inf=true usage)
But I think it is likely still worth merging despite them because of the value this feature provides.
Sorry to be so slow on the review here; I've indeed been busy and am woefully behind on GitHub notifications. Always feel free to give me a ping on Slack, which is the only platform I stay up to date on these days. I'm going to start looking at this now. (Also thanks to @LilithHafner for jumping in to help review here.)
If anyone has time to look at the CI failures, I'd appreciate it. I'm a bit busy, but could probably look in the next few days.
CI failures look to be the same as the failures on main.
I found a way to read inf_mappings, so please hold off on merging.
I added support for inf_mapping for the case where the mapping uses string values rather than other types, e.g.:
julia> inf_mapping(x) = x == Inf ? "\"__inf__\"" : x == -Inf ? "\"__neginf__\"" : "\"__nan__\"";
julia> JSON3.read("""["__nan__", {"a": "__inf__"},["__neginf__"]]"""; inf_mapping)
3-element JSON3.Array{Any, Base.CodeUnits{UInt8, String}, Vector{UInt64}}:
NaN
{
"a": Inf
}
[-Inf]
If this meets your expectations, I'll add docs and tests.
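The read path described above can be illustrated without JSON3: call the writer mapping on each special value, strip the surrounding quotes, and compare the result with the parsed string. This is only a sketch under the assumption that the mapping always returns a quoted JSON string; `decode_with_mapping` is a hypothetical helper, not a JSON3 function.

```julia
# Writer-style mapping as in the comment above: returns a quoted JSON string.
inf_mapping(x) = x == Inf ? "\"__inf__\"" : x == -Inf ? "\"__neginf__\"" : "\"__nan__\""

# Hypothetical helper (not part of JSON3): map the contents of a parsed JSON
# string back to a Float64 when it matches one of the mapping's tokens.
function decode_with_mapping(s::AbstractString, mapping)
    for v in (Inf, -Inf, NaN)
        # chop drops the leading and trailing quote of the token
        s == chop(mapping(v); head = 1, tail = 1) && return v
    end
    return s  # ordinary string, left unchanged
end
```

Each lookup calls the mapping up to three times, so a real implementation would probably compute the three tokens once per read call rather than once per string.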
One more thought:
inf_mapping(x) = x == Inf ? "__inf__" : x == -Inf ? "__neginf__" : "__nan__"
EDIT: Just realised that there is no JSONText in JSON3, so probably not a good idea.
src/read.jl
Outdated
float = if val == codeunits(inf_mapping(Inf))[2:end-1]
    Inf
elseif val == codeunits(inf_mapping(-Inf))[2:end-1]
    -Inf
elseif val == codeunits(inf_mapping(NaN))[2:end-1]
This usage makes me think that inf_mapping should be a Tuple or NamedTuple rather than a function.
I thought so, too. But the function version was much faster.
I tried arrays, tuples and functions, at least concerning writing. I didn't check read performance.
I also checked the RawType approach but I couldn't find out how to change the type to Float. The current approach looks more natural to me and has less code.
It is somewhat of a restriction that I only support the case of string mappings, but I think it is very unusual for people to want to cover values other than Infinity and NaN if they have a process that allows sending non-standard JSON.
EDIT: it's easy to include the quotes just by expanding the view and leaving out the [2:end-1], so my previous comment is no longer valid.
The performance difference between the function and the tuple comes from specialization: methods are specialized on a function but not on a tuple's value. You could get similar performance with a tuple by lifting it to the type domain with Val. I do see the advantage, in terms of runtime performance, of having the serialization format of inf and nan passed into write/read at the type level.
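As an illustration of the Val suggestion, the sketch below carries a tuple of tokens in a type parameter so that the method is compiled separately for each mapping. Strings cannot be used as type parameters in Julia, so this sketch stores the tokens as Symbols; `underscore_val` and `special_token` are hypothetical names, and none of this is taken from the PR's code.

```julia
# Hypothetical mapping lifted into the type domain. Strings cannot be type
# parameters, so the tokens are stored as Symbols.
underscore_val = Val((:__inf__, :__neginf__, :__nan__))

# M is known at compile time inside each specialized method instance,
# much like a function argument's identity is known through its type.
# The caller is assumed to pass only Inf, -Inf or NaN.
function special_token(x::Float64, ::Val{M}) where {M}
    name = x == Inf ? M[1] : x == -Inf ? M[2] : M[3]
    return string('"', name, '"')
end
```

The trade-off is that every distinct mapping triggers a fresh compilation of the methods it passes through, which is the same cost a function-valued mapping already incurs.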
How do we proceed? Change to a tuple or named tuple? If so, could you paste a piece of code here? I'm not sure whether I understood how you did it.
I don't have a piece of code to paste. I'm also not that invested in the API here; happy to hear @quinnj's thoughts.
I retried a tuple solution, which performs only slightly worse than the function mapping. With
fn_mapping(x::Real) = x == Inf ? "\"__inf__\"" : x == -Inf ? "\"__neginf__\"" : "\"__nan__\""
tuple_mapping = ("\"__inf__\"", "\"__neginf__\"", "\"__nan__\"")
x = rand([Inf, NaN, -Inf], 1000)
y = JSON3.write.(x, inf_mapping=fn_mapping)
jy = join(y, "\", \"")
I obtain
My tuple implementation is available as branch hh-infinity-tuple. EDIT: updated table with values for
If performance should be the criterion, we could define a macro
This is a keyword-argument approach to implement #292 as an alternative to #293.