Skip to content

Conversation

@samithreddychinni
Copy link

Fixes #20526

Description

This PR addresses a hash collision issue in CPyCppyy::HashSignature that caused incorrect template instantiation caching in PyROOT.

Analysis

The HashSignature function previously calculated the signature hash for CPPOverload objects based solely on Py_TYPE and Py_REFCNT.

  • The Issue: Distinct global functions (e.g., gbl.plus_one and gbl.add) are often wrapped as CPPOverload objects with identical reference counts.
  • The Consequence: These distinct functions generated identical hash signatures. The TemplateProxy dispatch cache would then incorrectly return a cached instantiation for the wrong function (e.g:- executing plus_one logic when add was called), leading to runtime errors or incorrect results.

The Fix

I updated CPPOverload.h to incorporate the object identity (pointer address) into the hash calculation for CPPOverload and TemplateProxy types.

  • Used golden ratio bit mixing to ensure entropy when mixing the pointer address into the hash.
  • This ensures that distinct C++ function wrappers produce unique signatures, preventing cache collisions.

Verification

I reproduced the issue using the script provided in #20526.

  • Before: Both functions produced sighash=9088374328679265724.
  • After:
    • get_one: 469779200852894291
    • plus_one: 469780645212860978
  • Result: The reproduction script now runs successfully without segfaults or type mismatches.

Note on Overload Resolution

During debugging, I observed that the tpp_vectorcall mechanism involves speculative execution of candidates during the overload resolution cascade. While this PR fixes the correctness issue (the hash collision), the architectural behavior of multiple invocations during resolution remains unchanged and is outside the scope of this fix.

…ject#20526)

Distinct CPPOverload objects (e.g:- different global functions) previously
generated identical hash signatures because HashSignature relied solely on
Py_TYPE and Py_REFCNT. This caused the TemplateProxy to retrieve incorrect
cached instantiations.

This patch adds object identity (pointer address) to the hash calculation
for CPPOverload and TemplateProxy types using golden ratio bit mixing.

Fixes root-project#20526.
Copy link
Contributor

@Vipul-Cariappa Vipul-Cariappa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this will end up with 100% cache misses when overloads are involved, as overload instances may not be the same (i.e. different pointer addresses).

Example:

In [3]: cppyy.cppdef(r"""void f() {}""")
Out[3]: True

In [4]: gbl.f is gbl.f
Out[4]: False

A better approach would be to use CPPOverload::fMethodInfo::fName and hash that.

Could you also add a test case in test_templates.py similar to the reproducer from the issue you linked.

The actual issue of multiple invocations would be resolved with the migration. But this fix of hashing is useful. Thanks for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Python] Templated function invoked multiple times with incorrect template arguments

2 participants