Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unclear behavior of UDF NULL returns with DEFAULT null_handling #4154

Open
2 tasks done
mustafahasankhan opened this issue Nov 26, 2024 · 0 comments
Open
2 tasks done
Assignees
Labels
python [component] Python API

Comments

@mustafahasankhan
Copy link

What happens?

According to duckdb.org/docs/api/python/function.html#null-handling, if you pass a NULL into a python UDF, DuckDB will short-circuit and immediately resolve to NULL with DEFAULT null_handling. However, it's not documented that a Python UDF cannot itself return None/NULL when using DEFAULT null_handling.

This is unexpected because:

  • None is Python's equivalent of SQL NULL
  • The docs focus on input NULL handling but don't mention output NULL restrictions
  • Functions naturally return None in Python for invalid/empty cases

The current workaround is to either:

  • Never return None from UDF
  • Use null_handling="SPECIAL"

To Reproduce

import duckdb
from duckdb.typing import VARCHAR

def returns_none(x: str) -> str:
    if not x:  # Empty string is not NULL
        return None
    return x

conn = duckdb.connect(':memory:')

# Create test data
conn.execute("""
    CREATE TABLE test AS SELECT * FROM (
        VALUES 
            ('hello'),    -- Valid string
            (''),         -- Empty string (not NULL!)
            (NULL),       -- NULL
            ('world')     -- Valid string
    ) t(data)
""")

# This fails even though empty string is not NULL
conn.create_function("test_none", returns_none, [VARCHAR], VARCHAR)
conn.execute("""
    SELECT data, test_none(data) as result
    FROM test 
    WHERE data IS NOT NULL
""").fetchall()

# Error:
# Invalid Input Error: 
# The returned result contained NULL values, but the 'null_handling' was set to DEFAULT.
# If you want more control over NULL values then 'null_handling' should be set to SPECIAL.

OS:

macOS 14.5 (23F79)

DuckDB Version:

1.1.3

DuckDB Client:

Python

Hardware:

No response

Full Name:

Mustafa Hasan Khan

Affiliation:

Atlan

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have not tested with any build

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
@szarnyasg szarnyasg added the python [component] Python API label Mar 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python [component] Python API
Projects
None yet
Development

No branches or pull requests

3 participants