Conversation

@isVoid isVoid commented Jan 13, 2026

Today, calling conventions are defined globally per compilation context. This makes it hard to switch flexibly between the Numba ABI and the C ABI when declaring external functions. It also explains the need for the kernel “fixup” logic: CUDA kernels are fundamentally C-ABI, but have historically been forced through the Numba ABI path.

This PR moves calling-convention selection to a more granular level, removing these limitations and eliminating the kernel fixup workaround. It also lays the groundwork for users to plug in additional calling-convention implementations in the future.
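The shape of the refactor can be illustrated with a small, self-contained Python sketch. The class names below mirror the PR's vocabulary, but they are hypothetical stand-ins, not the actual numba-cuda implementation:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for illustration; the real classes live in
# numba_cuda/numba/cuda/core/callconv.py and funcdesc.py.
class NumbaCallConv:
    name = "numba"

class CABICallConv:
    name = "c"

@dataclass
class FunctionDescriptor:
    qualname: str
    # After this PR, each function descriptor carries its own calling
    # convention instead of inheriting a single target-wide one.
    call_conv: object = None

class TargetContext:
    def __init__(self, fndesc):
        self.fndesc = fndesc

    @property
    def call_conv(self):
        # Previously a cached, context-wide property; now it simply
        # reads the per-function choice from the descriptor.
        return self.fndesc.call_conv

kernel = FunctionDescriptor("my_kernel", call_conv=CABICallConv())
device_fn = FunctionDescriptor("my_device_fn", call_conv=NumbaCallConv())

print(TargetContext(kernel).call_conv.name)     # c
print(TargetContext(device_fn).call_conv.name)  # numba
```

Two descriptors in the same compilation context can now resolve to different ABIs, which is exactly what the kernel "fixup" logic used to work around.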

copy-pr-bot bot commented Jan 13, 2026

Auto-sync is disabled for ready-for-review pull requests in this repository. Workflows must be run manually.


greptile-apps bot commented Jan 13, 2026

Greptile Summary

This PR successfully refactors the calling convention architecture by moving CallConv from CUDAContext to FunctionDescriptor, enabling per-function calling convention specification. This architectural improvement allows different functions to use different ABIs (Numba vs C) and eliminates the need for kernel "fixup" logic.

Key Changes:

  • Moved CUDACallConv and CUDACABICallConv classes from target.py to callconv.py
  • Added call_conv and abi_info fields to FunctionDescriptor and CUDAFlags for per-function calling convention storage
  • Changed CUDATargetContext.call_conv from cached property to property reading from fndesc.call_conv
  • Updated declare_device to accept abi parameter ("numba" or "c")
  • Removed cabi_wrap_function as C ABI functions now work directly without wrapping
  • Added comprehensive tests for C ABI device functions with 0, 1, 2 arguments and void return
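The new abi parameter on declare_device can be modeled with a toy dispatcher. The function name mirrors the API described above, but the body is illustrative only; the real declare_device builds an external-function stub rather than returning a tuple:

```python
def declare_device(name, sig, abi="numba"):
    # Toy model of the abi selection added in this PR: "numba" picks the
    # Numba calling convention, "c" picks the C-ABI one.
    call_convs = {"numba": "CUDACallConv", "c": "CUDACABICallConv"}
    if abi not in call_convs:
        raise ValueError(f"unsupported abi: {abi!r}")
    return (name, sig, call_convs[abi])

# A C-ABI external function and a Numba-ABI one, side by side.
print(declare_device("mul_f32", "float32(float32, float32)", abi="c"))
print(declare_device("add_f32", "float32(float32, float32)"))
```

Validating the abi string up front keeps the error at the declaration site rather than deep inside lowering.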

Critical Issue Found:

  • Bug in imputils.py line 199: The fix_returning_optional function accesses status.is_none without checking if status is None first. When using C ABI calling convention, CUDACABICallConv.call_function returns status = None (line 417 in callconv.py), which will cause an AttributeError when the code tries to access status.is_none.

Confidence Score: 2/5

  • This PR contains a critical bug that will cause runtime errors when using C ABI functions with optional return types
  • The architectural refactoring is well-designed and most changes are clean, but there is a critical bug in imputils.py line 199 that will cause AttributeError when calling C ABI device functions that have optional return types. The fix_returning_optional function assumes status is always an object with an is_none attribute, but for C ABI functions, status is None.
  • numba_cuda/numba/cuda/core/imputils.py requires immediate attention - the bug on line 199 must be fixed before merge
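The guard the reviewer is asking for follows a common pattern. A minimal stand-alone sketch in plain Python, with hypothetical objects standing in for the llvmlite builder and the Numba status struct:

```python
class Status:
    # Stand-in for the status struct returned by the Numba calling
    # convention; C-ABI calls return no status at all (i.e. None).
    def __init__(self, is_none):
        self.is_none = is_none

def fix_returning_optional(status):
    # Buggy version did `if not status.is_none: ...`, which raises
    # AttributeError when the C-ABI path returned status = None.
    # Fixed version: check for None before touching attributes.
    if status is not None and not status.is_none:
        return "unwrap-optional"
    return "leave-as-is"

print(fix_returning_optional(Status(is_none=False)))  # unwrap-optional
print(fix_returning_optional(None))                   # leave-as-is (C ABI path)
```

Short-circuit evaluation makes the `status is not None` check safe to combine with the attribute access in a single condition.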

Important Files Changed

  • numba_cuda/numba/cuda/core/imputils.py (2/5): Changed to use fndesc.call_conv instead of context.call_conv and added a null check for status. Contains the critical bug: fix_returning_optional accesses status.is_none without checking whether status is None (line 199).
  • numba_cuda/numba/cuda/core/callconv.py (5/5): Added the CUDACallConv and CUDACABICallConv classes moved from target.py. Clean implementation of the two calling conventions with proper separation of concerns.
  • numba_cuda/numba/cuda/core/funcdesc.py (5/5): Added call_conv and abi_info fields to FunctionDescriptor for per-function calling-convention storage; ABI name handling now goes through abi_info.
  • numba_cuda/numba/cuda/target.py (5/5): Removed the CUDACallConv and CUDACABICallConv classes (moved to callconv.py). Changed call_conv from a cached property to a property reading from fndesc.call_conv.
  • numba_cuda/numba/cuda/compiler.py (4/5): Removed cabi_wrap_function (no longer needed). Updated compile_cuda to accept an abi parameter and set the calling convention in flags; declare_device_function now instantiates the appropriate calling convention based on the ABI.
  • numba_cuda/numba/cuda/core/compiler.py (4/5): Added an import for CUDACallConv and updated error-model creation to pass a CUDACallConv instance instead of the context.

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 4 comments


@skip_under_cuda_memcheck("Hangs cuda-memcheck")
@skip_on_cudasim("disabled for cudasim")
@unittest.skip

Test skipped without explanation - check if the refactoring broke this test

@isVoid replied:

Some of the multithreaded / multi-GPU tests are suspected of not being exercised properly in the current CI system, since the runners are all single-GPU. They show up as failures in my local environment, but that is not related to this PR.

Comment on lines 1129 to 1133
)
resty = cres.signature.return_type

if resty and not device and resty != types.void:
raise TypeError("CUDA kernel must have void return type.")

Removed cabi_wrap_function call for device functions with C ABI. Verify this was intentional and that device functions with abi="c" still work correctly without the wrapper.

@isVoid replied:

This is indeed intentional: functions and kernels can now carry their own calling convention rather than using a target-wide one. cabi_wrap_function was meant to patch a function with the Numba ABI so it could be called by a caller expecting a C-ABI function. With this PR, the kernel simply carries a CUDACABICallConv directly, which should work as expected.

greptile-apps bot commented Jan 13, 2026

Additional Comments (1)

numba_cuda/numba/cuda/core/imputils.py
Bug: status can be None when using C ABI calling convention (see line 222 check and CUDACABICallConv.call_function which returns status = None), but this line tries to access status.is_none, which will raise AttributeError.

        if status is not None and builder.not_(status.is_none):


@isVoid isVoid changed the title Move CallConv from CUDAContext to FunctionDescriptor Move CallConv from CUDAContext to FunctionDescriptor Jan 13, 2026

@gmarkall commented:

/ok to test

@gmarkall gmarkall added the 3 - Ready for Review Ready for review by team label Jan 15, 2026