

xinrong-meng

What changes were proposed in this pull request?

Applying |, & or ^ between a boolean Series and None should fail under ANSI mode, following native pandas.

For example,

>>> pd.Series([True, False]) | None
Traceback (most recent call last):
...
TypeError: unsupported operand type(s) for |: 'bool' and 'NoneType'

but under ANSI mode, pandas API on Spark currently returns a result instead of failing:

>>> ps.Series([True, False]) | None
0    False
1    False
dtype: bool
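
The pandas message above has the same wording that plain Python produces for a bool combined with None. A minimal sketch, using only built-in bools (no pandas or Spark), that collects the error message for each of the three operators; the helper name `none_op_error` is hypothetical and only for illustration:

```python
import operator


def none_op_error(op, value=True):
    """Return the TypeError message raised by op(value, None), or None if it succeeds."""
    try:
        op(value, None)
    except TypeError as exc:
        return str(exc)
    return None


# The three bitwise operators discussed in this PR.
for name, op in [("|", operator.or_), ("&", operator.and_), ("^", operator.xor)]:
    print(name, "->", none_op_error(op))
```

Each call fails with "unsupported operand type(s) for <op>: 'bool' and 'NoneType'". This only illustrates the Python-level behavior; it makes no claim about how pandas dispatches the operation internally.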

Why are the changes needed?

Part of https://issues.apache.org/jira/browse/SPARK-53389

Does this PR introduce any user-facing change?

No, the feature hasn't been released yet.

With this change, applying |, & or ^ between a boolean Series and None fails under ANSI mode, e.g.

>>> ps.Series([True, False]) | None
Traceback (most recent call last):
...
TypeError: OR can not be applied to given types.

How was this patch tested?

Unit tests

Commands below passed:

SPARK_ANSI_SQL_MODE=true ./python/run-tests --python-executables=python3.11 --testnames "pyspark.pandas.tests.data_type_ops.test_boolean_ops BooleanOpsTests"
SPARK_ANSI_SQL_MODE=false ./python/run-tests --python-executables=python3.11 --testnames "pyspark.pandas.tests.data_type_ops.test_boolean_ops BooleanOpsTests"

Was this patch authored or co-authored using generative AI tooling?

No

@@ -237,6 +237,12 @@ def rmod(self, left: IndexOpsLike, right: Any) -> SeriesOrIndex:

    def __and__(self, left: IndexOpsLike, right: Any) -> SeriesOrIndex:
        _sanitize_list_like(right)
        if (
            is_ansi_mode_enabled(left._internal.spark_frame.sparkSession)
            and self.dtype == bool
Member Author


We need to check self.dtype == bool because native pandas does not raise when the Series also contains a missing value:

>>> pd.Series([True, False, np.nan]) | None
0    False
1    False
2    False

>>> ps.Series([True, False, np.nan]).dtype
dtype('O')
>>> ps.Series([True, False, np.nan]).spark.data_type
BooleanType()

Because the Spark data type is BooleanType(), such a Series is also routed through this code path. We want to keep supporting that case rather than raise TypeError, so we check the dtype explicitly.
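
To illustrate the distinction, here is a rough, self-contained sketch of the guard. It does not use pandas or Spark: `FakeSeries`, `must_reject` and `checked_or` are hypothetical names, and the `right is None` test is an assumption, since the diff excerpt is cut off before the rest of the condition.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeSeries:
    """Hypothetical stand-in for a pandas-on-Spark Series (not a real API)."""

    dtype: type      # pandas-facing dtype: bool, or object when NaN is present
    spark_type: str  # Spark-side type name; "BooleanType" in both cases


def must_reject(series: FakeSeries, right: Any, ansi_enabled: bool) -> bool:
    # Assumption: the real condition also checks that `right` is None.
    return ansi_enabled and series.dtype == bool and right is None


def checked_or(series: FakeSeries, right: Any, ansi_enabled: bool) -> str:
    if must_reject(series, right, ansi_enabled):
        # Message copied from the PR description.
        raise TypeError("OR can not be applied to given types.")
    return "falls through to the existing code path"


strict = FakeSeries(dtype=bool, spark_type="BooleanType")
with_nan = FakeSeries(dtype=object, spark_type="BooleanType")

print(must_reject(strict, None, ansi_enabled=True))    # True
print(must_reject(with_nan, None, ansi_enabled=True))  # False
print(must_reject(strict, None, ansi_enabled=False))   # False
```

Both stand-ins share the same Spark type, so dispatching on the Spark type alone could not tell them apart; only the dtype comparison separates the case that should raise from the one that should keep working.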


@xinrong-meng

@HyukjinKwon @ueshin @zhengruifeng may I get a review please?

@xinrong-meng xinrong-meng requested a review from ueshin September 4, 2025 18:33
@xinrong-meng

Merged to master, thank you!
