[SPARK-50126][PYTHON][CONNECT][3.5] PySpark expr() (expression) SQL Function returns None in Spark Connect #49755

Open
wants to merge 1 commit into
base: branch-3.5

the-sakthi
Member

Cherry-pick #46583 to branch-3.5
Original JIRA: SPARK-48276

What changes were proposed in this pull request?

Original PR proposed these changes:

  • Add the missing __repr__ method for SQLExpression
  • Also adjust the output of lit(None): None -> NULL, to be more consistent with Spark Classic

I had to make one very minor modification to resolve an unclean cherry-pick:

  • The unit test (UT) added in the original PR needed an import.

Why are the changes needed?

  • In Spark 3.5, when PySpark is launched with a remote Spark Connect configuration, calls to pyspark.sql.functions.expr incorrectly return Column<None> instead of the expected expression. This change addresses the issue to ensure proper expression resolution in Spark Connect mode.
  • As per original PR: [Bug fix] All expressions should implement the __repr__ method.
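
To illustrate how a missing __repr__ can surface as Column<None>, here is a minimal sketch. It is a toy model, not the PySpark source: the class names `Expression`, `SQLExpressionBefore`, `SQLExpressionAfter`, and `Column` are hypothetical stand-ins, and the assumption is that the column's repr delegates to the expression's `__repr__` while the base class only provides a stub.

```python
class Expression:
    """Toy base class (hypothetical). Its __repr__ is a stub that returns None."""

    def __repr__(self):
        # Stub body: subclasses are expected to override this.
        ...


class SQLExpressionBefore(Expression):
    """Toy expression that forgets to override __repr__ (the buggy case)."""

    def __init__(self, expr: str) -> None:
        self._expr = expr


class SQLExpressionAfter(Expression):
    """Toy expression with __repr__ implemented (the fixed case)."""

    def __init__(self, expr: str) -> None:
        self._expr = expr

    def __repr__(self) -> str:
        return self._expr


class Column:
    """Toy column whose repr delegates to the wrapped expression."""

    def __init__(self, expr: Expression) -> None:
        self._expr = expr

    def __repr__(self) -> str:
        # Calling __repr__() directly lets a None result slip through
        # and be formatted as the string "None".
        return "Column<%s>" % self._expr.__repr__()


before = repr(Column(SQLExpressionBefore("a + 1")))
after = repr(Column(SQLExpressionAfter("a + 1")))
print(before)
print(after)
```

In this sketch the stub's None result is formatted into the column string, which is why every concrete expression class needs its own `__repr__`.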

Does this PR introduce any user-facing change?

Yes. In Spark Connect mode, pyspark.sql.functions.expr previously returned Column<None>; with this PR it returns a column that shows the expected expression.

How was this patch tested?

Manually tested. The original PR also added a unit test.

Was this patch authored or co-authored using generative AI tooling?

No.

…unction returns None in Spark Connect

Cherry-pick apache#46583 to branch-3.5
Original JIRA: SPARK-48276
@the-sakthi marked this pull request as ready for review February 1, 2025 00:01
@the-sakthi
Member Author

@zhengruifeng : Tagging you for review as the original author of this change.
@HyukjinKwon : Tagging you for review as the reviewer of the original PR. 🙂

@the-sakthi
Member Author

the-sakthi commented Feb 1, 2025

The GitHub workflow builds above are failing because v3 of actions/upload-artifact is deprecated.
https://issues.apache.org/jira/browse/SPARK-46474 upgraded it to v4 on the master branch. Do we want to backport that to branch-3.5? I can take that task up if we go that route.
