[FLINK-37175][table] Support JSON built-in function for JSON_OBJECT #26022

Open
wants to merge 1 commit into
base: master

gustavodemorais
Contributor

What is the purpose of the change

It is currently not possible to mark a SQL string that already contains JSON as valid JSON for JSON_OBJECT. Something like JSON_OBJECT(KEY 'K' VALUE '{"value": 42}') returns {"K": "{\"value\": 42}"}, where the value is a string and not a JSON object.

This PR adds support for the JSON() function. This is initial support, which applies until the function can return a JSON data type (which Flink does not have yet).

Example:

jsonObject(JsonOnNull.NULL, "nested", json('{"value": 42}'))
// {"nested":{"value":42}}

JSON_OBJECT(KEY 'K' VALUE JSON('{"value": 42}'))
// {"K":{"value":42}}
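The difference between passing a plain string and passing a JSON(...) value can be sketched in plain Java. This is only an illustration of the two output shapes described above, not the code in this PR: the class and method names are hypothetical, and the raw variant simply assumes its input is already valid JSON.

```java
// Illustrative sketch only: plain string building, not Flink code.
class JsonValueEmbeddingSketch {

    // Escapes backslashes and double quotes so the text can sit inside a JSON string.
    // (Control characters are ignored here to keep the sketch short.)
    static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Plain string argument: the value is emitted as an escaped JSON string.
    static String objectWithStringValue(String key, String value) {
        return "{" + quote(key) + ":" + quote(value) + "}";
    }

    // JSON(...) argument: the value is embedded as-is (assumed to be valid JSON).
    static String objectWithRawJsonValue(String key, String rawJson) {
        return "{" + quote(key) + ":" + rawJson + "}";
    }

    public static void main(String[] args) {
        String payload = "{\"value\":42}";
        // prints {"K":"{\"value\":42}"}
        System.out.println(objectWithStringValue("K", payload));
        // prints {"K":{"value":42}}
        System.out.println(objectWithRawJsonValue("K", payload));
    }
}
```

The first call reproduces the current behavior (a string value), the second the nested object that JSON() is meant to produce.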

Brief change log

  • Add JSON BuiltInFunction definition
  • Update JsonObjectCallGen to support JSON function logic
  • Update expressions
  • Add tests for json function

Verifying this change

This change adds tests for multiple use cases of the function. It also adds tests to make sure the function is only called within JSON_OBJECT.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): yes
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? docs, JavaDocs

@flinkbot
Collaborator

flinkbot commented Jan 20, 2025

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

docs/data/sql_functions.yml (outdated, resolved)
@@ -861,9 +862,13 @@ public static ApiExpression withoutColumns(Object head, Object... tail) {
* jsonObject(JsonOnNull.ABSENT, "K1", nullOf(DataTypes.STRING())) // "{}"
*
* // {"K1":{"K2":"V"}}
* jsonObject(JsonOnNull.NULL, "K1", json('{"K2":"V"}'))
Contributor

Suggested change
* jsonObject(JsonOnNull.NULL, "K1", json('{"K2":"V"}'))
* jsonObject(JsonOnNull.NULL, "K1", json("{'K2':'V'}"))

Contributor

Or escaped? Not sure how strict the user-defined JSON needs to be. Maybe we should check and document this.

Contributor Author

The string doesn't have to be escaped. I've updated the Javadoc for the Java expression and added an example using a text block, which shows that there's no need to escape it:

     * // {"K":{"K2":{"K3":42}}}
     * jsonObject(
     *         JsonOnNull.NULL,
     *         "K",
     *         json("""
     *                {
     *                  "K2": {
     *                    "K3": 42
     *                  }
     *                }
     *              """))

We escape in the Java code because Java also uses " to delimit strings. With single quotes in Python or text blocks in Java, there's no need to escape. Escaped quotes are also processed properly and result in the same JSON object.

@gustavodemorais
Contributor Author

gustavodemorais commented Jan 22, 2025

Made the changes as we discussed, @twalthr. One additional change: we not only parse the JSON but also convert it back to a string before storing it. I think that makes sense, since it saves storage space by removing unnecessary whitespace, line breaks, and so on before returning the value.
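For readers who want to see what the whitespace removal amounts to, here is a minimal sketch in plain Java. It is not the PR's implementation, which parses the JSON and serializes it again. This hypothetical `minify` helper only drops whitespace outside string literals and does no validation, so it assumes the input is already valid JSON.

```java
// Illustrative sketch only: strips insignificant whitespace from valid JSON text.
class JsonMinifySketch {

    static String minify(String json) {
        StringBuilder out = new StringBuilder(json.length());
        boolean inString = false; // currently inside a "..." literal
        boolean escaped = false;  // previous character was a backslash inside a string
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                // Everything inside a string literal is kept verbatim.
                out.append(c);
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
                out.append(c);
            } else if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                // Outside strings, only non-whitespace characters are kept.
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String pretty = "{\n  \"K2\": {\n    \"K3\": 42\n  }\n}";
        // prints {"K2":{"K3":42}}
        System.out.println(minify(pretty));
    }
}
```

Whitespace inside string values, such as "b c", is preserved, so only formatting is lost and not data.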

@gustavodemorais
Contributor Author

@flinkbot run azure

4 participants