Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-50700][SQL] spark.sql.catalog.spark_catalog supports builtin magic value #49332

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

pan3793
Copy link
Member

@pan3793 pan3793 commented Dec 30, 2024

What changes were proposed in this pull request?

This PR adds a magic value builtin(and sets it to the default value) for spark.sql.catalog.spark_catalog.

Why are the changes needed?

Currently, spark.sql.catalog.spark_catalog is optional and has None as the default value. When spark.sql.catalog.spark_catalog=a.bad.catalog.impl is wrongly set in spark-defaults.conf, the user has no way to overwrite it in spark-submit.

Note that, explicitly setting it to o.a.s.sql.execution.datasources.v2.V2SessionCatalog does not work either, because V2SessionCatalog does not have a zero-args constructor.

spark-submit \
  --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.execution.datasources.v2.V2SessionCatalog \
  ...

To fix the above issue, similar to what we did for spark.sql.hive.metastore.jars, just use "builtin" to represent the built-in V2SessionCatalog.

Does this PR introduce any user-facing change?

No change for default behavior, and users are allowed to use spark.sql.catalog.spark_catalog=builtin to set spark_catalog as the built-in V2SessionCatalog.

How was this patch tested?

Code in UTs like

// unset this config to use the default v2 session catalog.
spark.conf.unset(V2_SESSION_CATALOG_IMPLEMENTATION.key)

are replaced with

spark.conf.set(V2_SESSION_CATALOG_IMPLEMENTATION, "builtin")

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Dec 30, 2024
@pan3793
Copy link
Member Author

pan3793 commented Dec 30, 2024

@cloud-fan @yaooqinn @huaxingao please let me know if this approach makes sense, or do you have any other suggestions to allow users to set spark_catalog to the built-in V2SessionCatalog explicitly? thanks in advance.

Copy link
Member

@yaooqinn yaooqinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense to me.

.version("3.0.0")
.stringConf
.createOptional
.createWithDefault("builtin")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we normalize it to lower case? e.g. .transform(_.toLowerCase(Locale.ROOT))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for suggestion, updated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I'm wrong... It's class name so we can't lower case it.

Maybe the string comparisons should be case insensitive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, right, let me change

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan how about now?

Copy link
Contributor

@cloud-fan cloud-fan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense to me

@pan3793
Copy link
Member Author

pan3793 commented Jan 3, 2025

kindly ping @cloud-fan, can we get this in?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants