Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle edge cases when converting double from JSON to Trino types #24030

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Pluies
Copy link
Contributor

@Pluies Pluies commented Nov 4, 2024

Description

Currently, we are unable to optimise tables when they contain infinity/NaN values in fields of type "DOUBLE" because the conversion fails when we try to cast a string containing "Infinity", "-Infinity" or "NaN" into a double.

Instead, we try to check if the JSON value contains a string, and convert it appropriately.

Additional context and related issues

Fixes #24029

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Delta Lake
* Fix silent failure to write checkpoint files for tables containing double values Infinity or NaN. ({issue}`24029`)

This comment was marked as resolved.

@github-actions github-actions bot added the delta-lake Delta Lake connector label Nov 4, 2024
Currently, we are unable to optimise tables when they contain infinity/NaN
values in fields of type "DOUBLE" because the conversion fails when we try
to cast a string containing "Infinity", "-Infinity" or "NaN" into a double.

Instead, we try to check if the JSON value contains a string, and convert it
appropriately.
@Pluies Pluies force-pushed the delta-checkpoint-support-infinity branch from 11f405a to bf16c90 Compare November 4, 2024 19:38
@cla-bot cla-bot bot added the cla-signed label Nov 4, 2024
@Pluies Pluies requested review from ebyhr and findepi November 4, 2024 19:38
@Pluies
Copy link
Contributor Author

Pluies commented Nov 4, 2024

Tagging @ebyhr as the original author and @findepi as the original commentor of this prescient review comment 😄

@Pluies
Copy link
Contributor Author

Pluies commented Nov 4, 2024

Note that this is mostly a first draft for inputs & review, we'll need to add tests, but I wanted to first check that you agree with the approach and check if you had a better way to do it.

@ebyhr
Copy link
Member

ebyhr commented Nov 4, 2024

@Pluies This approach looks good to me. Could you please start adding tests?

Comment on lines +130 to +131
if (jsonValue instanceof String) {
switch ((String) jsonValue) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (jsonValue instanceof String) {
switch ((String) jsonValue) {
if (jsonValue instanceof String stringValue) {
switch (stringValue) {

case "-Infinity" -> {
return Double.NEGATIVE_INFINITY;
}
case "NaN" -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can it be nan or NAN?

@@ -127,6 +127,20 @@ public static Object jsonValueToTrinoValue(Type type, @Nullable Object jsonValue
return (long) floatToRawIntBits((float) (double) jsonValue);
}
if (type == DOUBLE) {
if (jsonValue instanceof String) {
switch ((String) jsonValue) {
case "Infinity" -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can it be inf (besides case differences)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed delta-lake Delta Lake connector
Development

Successfully merging this pull request may close these issues.

Delta Lake: Trino fails to write checkpoint on tables containing Infinity
3 participants