[IR Container] Phase 2.4 Per-fusion statement tracking #5961

Open
mdavis36 wants to merge 2 commits into md/phase2-shared-ptr from md/phase2-per-fusion

@mdavis36 mdavis36 commented Feb 12, 2026

Summary

Add per-Fusion ownership tracking maps to IrContainer so each Fusion can efficiently query only its own Vals and Exprs within a shared container. Update all Fusion-level accessors to filter by ownership.

This is the highest-risk change in the Phase 2 chain. Every accessor call path is touched — vals(), deterministic_vals(), deterministic_exprs(), unordered_exprs(), and more now return ownership-filtered results rather than raw container contents. For single-Fusion containers the results are identical, but the implementation changes underneath every consumer.

Relationship to Phase 2

This is the critical invariant that makes shared containers safe:

Invariant: Fusion accessors filter by ownership

  Fusion A ─┐
             ├──→ shared_ptr<IrContainer> ──→ {val_0(A), val_1(A), val_0'(B), val_1'(B)}
  Fusion B ─┘

  A.vals()  →  {val_0, val_1}       // Only A's vals
  B.vals()  →  {val_0', val_1'}     // Only B's vals
  container->vals()  →  {val_0, val_1, val_0', val_1'}  // ALL vals (raw)

Without per-Fusion filtering, a shared container copy would break every consumer — Fusion::copy would iterate all vals (including other Fusions'), deterministic_vals() would return interleaved results, and StatementGuard rollback would destroy statements belonging to other Fusions.

This invariant is what allows Phase 2 to maintain independent IR graphs despite shared storage:

  • Invariant : fusion->vals() returns {v : v in container AND v->container() == fusion}
  • Invariant : Fusion::clear() only clears THIS Fusion's state (not ir_container()->clear())
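
The filtering invariant can be illustrated with a minimal sketch. The `Val`, `IrContainer`, and `Fusion` types below are simplified, hypothetical stand-ins rather than nvFuser's real classes; only the names `vals_up_`, `per_fusion_vals_`, `valsOwnedBy()`, and `deterministicValsOwnedBy()` come from this PR, and their bodies here are assumptions.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only.
struct Fusion;
struct Val {
  int id;
};

struct IrContainer {
  // Owns every Val of every Fusion, in creation order.
  std::deque<std::unique_ptr<Val>> vals_up_;
  // Ownership index: which Vals belong to which Fusion.
  std::unordered_map<const Fusion*, std::unordered_set<Val*>> per_fusion_vals_;

  Val* create(const Fusion* owner, int id) {
    vals_up_.push_back(std::make_unique<Val>(Val{id}));
    Val* v = vals_up_.back().get();
    per_fusion_vals_[owner].insert(v);
    return v;
  }

  // Unordered view of one Fusion's vals (empty if it owns nothing).
  const std::unordered_set<Val*>& valsOwnedBy(const Fusion* f) const {
    static const std::unordered_set<Val*> empty;
    auto it = per_fusion_vals_.find(f);
    return it == per_fusion_vals_.end() ? empty : it->second;
  }

  // Creation-ordered view: walk the global deque, keep only f's vals.
  std::vector<Val*> deterministicValsOwnedBy(const Fusion* f) const {
    const auto& owned = valsOwnedBy(f);
    std::vector<Val*> out;
    for (const auto& up : vals_up_) {
      if (owned.count(up.get()) > 0) {
        out.push_back(up.get());
      }
    }
    return out;
  }
};

struct Fusion {
  std::shared_ptr<IrContainer> ir_container_;

  const std::unordered_set<Val*>& vals() const {
    assert(ir_container_ != nullptr);
    return ir_container_->valsOwnedBy(this);
  }
  std::vector<Val*> deterministic_vals() const {
    assert(ir_container_ != nullptr);
    return ir_container_->deterministicValsOwnedBy(this);
  }
};
```

With two Fusions sharing one container and creations interleaved between them, each Fusion's `vals()` sees only its own entries, while `vals_up_` still holds all of them.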

CI Risk

Highest of all Phase 2 PRs. Every accessor-dependent code path is touched. For single-Fusion containers, filtered results are identical to unfiltered — but regressions would surface in accessor-heavy code paths (scheduling, lowering, codegen).

@mdavis36 mdavis36 changed the base branch from main to md/phase2-shared-ptr February 12, 2026 22:08

github-actions bot commented Feb 12, 2026

Review updated until commit bc595c5

Description

  • Add per-Fusion ownership tracking maps (per_fusion_vals_, per_fusion_exprs_) to IrContainer for efficient per-Fusion statement queries

  • Update all Fusion accessors (vals(), deterministic_vals(), unordered_exprs(), etc.) to return ownership-filtered results instead of raw container contents

  • Modify Fusion::clear() to use removeStatementsOwnedBy() instead of ir_container()->clear() for proper per-Fusion cleanup

  • Implement StatementGuard LIFO rollback using per-Fusion counts and add ownership verification assertions for shared container safety

Changes walkthrough

Relevant files

Enhancement

fusion.cpp: Core Fusion per-Fusion tracking integration
csrc/fusion.cpp (+25/-27)

  • Update Fusion::swap() to transfer statement ownership between containers after swapping contents
  • Modify Fusion::clear() to use ir_container_->removeStatementsOwnedBy(this) instead of clearing entire container
  • Update removeExpr() and removeVal() to maintain per_fusion tracking maps
  • Fix removeStatementsCreatedAfter() to use per-Fusion counts and add ownership verification for LIFO rollback
  • Update registerVal() and registerExpr() to populate per_fusion tracking maps

container.cpp: IrContainer per-Fusion ownership implementation
csrc/ir/container.cpp (+131/-0)

  • Add per_fusion_vals_ and per_fusion_exprs_ member variables for ownership tracking
  • Implement valsOwnedBy() and exprsOwnedBy() accessors for per-Fusion queries
  • Add transferStatementOwnership() for Fusion swap operations
  • Implement removeStatementsOwnedBy() with O(n) std::erase_if optimization
  • Add deterministicValsOwnedBy(), deterministicExprsOwnedBy() and corresponding map methods
  • Update swap() and clear() to handle per-Fusion tracking state

fusion.h: Fusion interface per-Fusion accessor updates
csrc/fusion.h (+26/-15)

  • Update deterministic_vals(), deterministic_exprs(), and unordered accessors to return per-Fusion filtered results
  • Convert numExprs() and numVals() to return per-Fusion counts instead of global container counts
  • Add numValsExcludingShortcuts() method for proper StatementGuard LIFO operations
  • Update all accessor method signatures to filter by Fusion ownership

container.h: IrContainer interface per-Fusion tracking declarations
csrc/ir/container.h (+18/-0)

  • Add per_fusion_vals_ and per_fusion_exprs_ member variables for ownership tracking
  • Declare valsOwnedBy(), exprsOwnedBy() and transferStatementOwnership() methods
  • Add removeStatementsOwnedBy() method for per-Fusion cleanup
  • Declare deterministicValsOwnedBy(), deterministicExprsOwnedBy() and corresponding map methods

Bug fix

statement_guard.cpp: StatementGuard per-Fusion count tracking
csrc/statement_guard.cpp (+1/-1)

  • Update StatementGuard constructor to use numValsExcludingShortcuts() for proper LIFO count tracking
  • Ensure StatementGuard rollback operates on per-Fusion statement counts in shared containers

    PR Reviewer Guide

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review
    Ownership Tracking Synchronization

    The per-Fusion tracking maps (per_fusion_vals_ and per_fusion_exprs_) must be kept in perfect sync with container operations. The registerVal/registerExpr functions properly update these maps, but reviewers should verify that all code paths that modify vals_up_/exprs_up_ also update the corresponding per-Fusion maps. The removeStatementsCreatedAfter function shows ownership validation, which is good defensive programming.

      c->vals_up_.emplace_back(val);
      c->vals_.insert(val);
      c->per_fusion_vals_[this].insert(val);
      val->setName(IrContainerPasskey(), c->getValName(val->vtype()));
    }
    
    void Fusion::registerExpr(Expr* expr) {
      if (inContainer(expr)) {
        return;
      }
    
      if (expr->fusion()) {
        NVF_CHECK(
            expr->fusion() == this, expr, " was not found in the active fusion.");
      }
    
      auto* c = ir_container();
      c->exprs_up_.emplace_back(expr);
      c->exprs_.insert(expr);
      c->per_fusion_exprs_[this].insert(expr);
      expr->setName(IrContainerPasskey(), c->getExprName());
    
    Fusion Clear Semantics Change

    Fusion::clear() now calls ir_container_->removeStatementsOwnedBy(this) instead of ir_container()->clear(). This is a significant semantic change: previously it cleared ALL statements in the container, whereas now it clears only this Fusion's statements. This aligns with the Phase 2 invariants, but reviewers should verify that it does not break any assumptions about container state after clear() calls.

    void Fusion::clear() noexcept {
      // Perf scope isn't safe here as this function could be called by
      // the Fusion destructor and the scope initializer could call the
      // constructor of Trace, which could throw an exception.
      // FUSER_PERF_SCOPE("Fusion clear");
    
      if (ir_container_) {
        ir_container_->removeStatementsOwnedBy(this);
      }
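
The per-Fusion cleanup that clear() now delegates to can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: `Container` and the integer `FusionId` key are hypothetical stand-ins (the PR keys its maps by Fusion pointer), and the single-pass removal is written with the erase-remove idiom, which is what `std::erase_if` on a deque is specified to do in C++20.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <memory>
#include <unordered_map>
#include <unordered_set>

struct Val {
  int id;
};
using FusionId = int;  // hypothetical owner key; the PR uses Fusion pointers

struct Container {
  std::deque<std::unique_ptr<Val>> vals_up_;  // owning storage, creation order
  std::unordered_set<Val*> vals_;             // global membership set
  std::unordered_map<FusionId, std::unordered_set<Val*>> per_fusion_vals_;

  Val* create(FusionId owner, int id) {
    vals_up_.push_back(std::make_unique<Val>(Val{id}));
    Val* v = vals_up_.back().get();
    vals_.insert(v);
    per_fusion_vals_[owner].insert(v);
    return v;
  }

  // Remove only the statements owned by `owner`; other owners are untouched.
  void removeStatementsOwnedBy(FusionId owner) {
    auto it = per_fusion_vals_.find(owner);
    if (it == per_fusion_vals_.end()) {
      return;
    }
    const std::unordered_set<Val*>& owned = it->second;
    for (Val* v : owned) {
      vals_.erase(v);
    }
    // One pass over the deque with O(1) average lookups per element,
    // instead of one linear search per owned statement.
    auto new_end = std::remove_if(
        vals_up_.begin(),
        vals_up_.end(),
        [&owned](const std::unique_ptr<Val>& up) {
          return owned.count(up.get()) > 0;
        });
    vals_up_.erase(new_end, vals_up_.end());
    per_fusion_vals_.erase(it);
    assert(vals_up_.size() == vals_.size());
  }
};
```

The ownership set is erased last because the predicate still reads it while the deque is being compacted.
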
    StatementGuard Shortcut Handling

    The numValsExcludingShortcuts() function was introduced to handle special values (zero_val_, one_val_, etc.) that should persist across StatementGuard scopes. The removeStatementsCreatedAfter function now uses this count instead of the total val count. Reviewers should verify that shortcut values are properly handled and that the LIFO removal logic correctly skips over these persistent values.

    while (numValsExcludingShortcuts() > num_vals_before) {
      Val* v = c->vals_up_.back().get();
      NVF_ERROR(
          c->per_fusion_vals_[this].count(v) > 0,
          "removeStatementsCreatedAfter: tail val belongs to another Fusion");
      // Null out shortcut caches if they point to vals about to be destroyed
      if (v == zero_val_) {
        zero_val_ = nullptr;
      } else if (v == one_val_) {
        one_val_ = nullptr;
      } else if (v == true_val_) {
        true_val_ = nullptr;
      } else if (v == false_val_) {
        false_val_ = nullptr;
      } else if (v == magic_zero_val_) {
        magic_zero_val_ = nullptr;
      }
      c->per_fusion_vals_[this].erase(v);
      c->vals_.erase(v);
      c->vals_up_.pop_back();
    }
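
The LIFO rollback contract in the excerpt above can be reduced to a small sketch. It assumes hypothetical `Container` and `FusionId` stand-ins, leaves out the shortcut-value handling entirely, and throws `std::logic_error` where the real code uses `NVF_ERROR`.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <stdexcept>
#include <unordered_map>
#include <unordered_set>

struct Val {
  int id;
};
using FusionId = int;  // hypothetical owner key; the PR uses Fusion pointers

struct Container {
  std::deque<std::unique_ptr<Val>> vals_up_;
  std::unordered_map<FusionId, std::unordered_set<Val*>> per_fusion_vals_;

  Val* create(FusionId owner, int id) {
    vals_up_.push_back(std::make_unique<Val>(Val{id}));
    Val* v = vals_up_.back().get();
    per_fusion_vals_[owner].insert(v);
    return v;
  }

  // Per-owner count, used for both the snapshot and the rollback loop.
  std::size_t numValsOwnedBy(FusionId owner) const {
    auto it = per_fusion_vals_.find(owner);
    return it == per_fusion_vals_.end() ? 0 : it->second.size();
  }

  // Pop from the tail until `owner` is back to `num_vals_before` vals.
  // Refuses to pop a tail element that belongs to a different owner.
  void removeValsCreatedAfter(FusionId owner, std::size_t num_vals_before) {
    while (numValsOwnedBy(owner) > num_vals_before) {
      assert(!vals_up_.empty());
      Val* tail = vals_up_.back().get();
      auto& owned = per_fusion_vals_[owner];
      if (owned.count(tail) == 0) {
        throw std::logic_error("tail val belongs to another Fusion");
      }
      owned.erase(tail);
      vals_up_.pop_back();
    }
  }
};
```

Snapshotting a per-owner count rather than the deque size means another owner's statements never inflate the number of elements to pop; the ownership check covers the remaining case where another owner has appended after the snapshot.
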

    @mdavis36 mdavis36 force-pushed the md/phase2-shared-ptr branch from 3a199c8 to 53e5045 on February 12, 2026 22:09
    @mdavis36

    !test

    @mdavis36 mdavis36 changed the title from "[IR Container] Phase 2 Per-fusion statement tracking" to "[IR Container] Phase 2.4 Per-fusion statement tracking" Feb 18, 2026
    @mdavis36 mdavis36 force-pushed the md/phase2-per-fusion branch from 33629cb to 8b162d9 on February 18, 2026 03:13
    @mdavis36 mdavis36 force-pushed the md/phase2-shared-ptr branch from 53e5045 to f8ff364 on February 18, 2026 03:13
    @mdavis36

    !test

    @mdavis36 mdavis36 marked this pull request as ready for review February 18, 2026 06:37
    @greptile-apps

    greptile-apps bot commented Feb 18, 2026

    Greptile Summary

    This PR implements per-Fusion ownership tracking by adding per_fusion_vals_ and per_fusion_exprs_ maps to IrContainer, allowing each Fusion to efficiently query only its own statements within a shared container. All Fusion accessor methods (vals(), deterministic_vals(), exprs(), etc.) now return ownership-filtered results instead of raw container contents.

    Key Changes:

    • Added ownership tracking infrastructure in IrContainer with methods like valsOwnedBy(), exprsOwnedBy(), and removeStatementsOwnedBy()
    • Updated all Fusion accessors to filter by ownership, maintaining the invariant that fusion->vals() returns only statements where statement->fusion() == fusion
    • Modified Fusion::clear() to call removeStatementsOwnedBy(this) instead of clearing the entire container
    • Updated Fusion::swap() to transfer ownership tracking after swapping container contents
    • Added numValsExcludingShortcuts() to handle singleton shortcut values that persist across StatementGuard scopes
    • Modified statement registration/removal methods to maintain per-Fusion tracking maps
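
The ownership transfer that swap relies on can be pictured as re-keying one owner's set under another owner. The sketch below is an assumption about the shape of that operation, not the PR's code: the free-function form, the `OwnerMap` alias, and the integer `FusionId` key are all hypothetical.

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>
#include <utility>

struct Val {};
using FusionId = int;  // hypothetical owner key; the PR uses Fusion pointers
using OwnerMap = std::unordered_map<FusionId, std::unordered_set<Val*>>;

// Move every statement tracked under `from` so it is tracked under `to`.
void transferStatementOwnership(OwnerMap& m, FusionId from, FusionId to) {
  if (from == to) {
    return;
  }
  auto it = m.find(from);
  if (it == m.end()) {
    return;
  }
  // Take the set out and erase the old key before touching m[to]:
  // inserting a new key may rehash and invalidate `it`.
  std::unordered_set<Val*> moved = std::move(it->second);
  m.erase(it);
  m[to].merge(moved);  // C++17: splices nodes without copying
  assert(m.count(from) == 0);
}
```
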

    Risk Assessment:
    This is the highest-risk PR in the Phase 2 chain as noted in the description. Every accessor-dependent code path is touched, though for single-Fusion containers (the current common case), the filtered results are identical to unfiltered results. The implementation changes underneath all consumers, introducing performance overhead from ownership filtering on every accessor call.

    Confidence Score: 3/5

    • High-risk refactor touching all accessor paths; functionally correct for single-Fusion containers but with known limitations
    • Score reflects the broad scope of changes (every accessor path modified), potential performance overhead from ownership filtering, and known edge case issues with Fusion::swap when containers are shared. The code is logically sound for the current single-Fusion case, but the architectural changes introduce risk across scheduling, lowering, and codegen paths. Testing coverage is critical here.
    • Pay close attention to csrc/fusion.cpp (swap and removeStatementsCreatedAfter methods) and csrc/ir/container.cpp (ownership filtering implementations)

    Important Files Changed

    Filename Overview
    csrc/ir/container.cpp Implements ownership filtering methods and statement removal logic with std::erase_if
    csrc/fusion.h Updates all accessor signatures to return ownership-filtered results, adds numValsExcludingShortcuts for StatementGuard
    csrc/fusion.cpp Updates swap, clear, register, remove, and rollback methods to maintain per-Fusion ownership invariants

    Flowchart

    %%{init: {'theme': 'neutral'}}%%
    flowchart TD
        A[Fusion A] -->|owns| IC[IrContainer shared_ptr]
        B[Fusion B] -->|owns| IC
        IC -->|contains| VD["vals_up_ deque<br/>global: v0, v1, v2, v3"]
        IC -->|contains| ED["exprs_up_ deque<br/>global: e0, e1, e2, e3"]
        IC -->|tracks| PFV["per_fusion_vals_<br/>A: v0, v1<br/>B: v2, v3"]
        IC -->|tracks| PFE["per_fusion_exprs_<br/>A: e0, e1<br/>B: e2, e3"]
        
        A -->|"vals() call"| FA[valsOwnedBy A]
        FA -->|filters by A| PFV
        FA -->|"returns"| RAV["only v0, v1"]
        
        B -->|"vals() call"| FB[valsOwnedBy B]
        FB -->|filters by B| PFV
        FB -->|"returns"| RBV["only v2, v3"]
        
        style IC fill:#e1f5ff
        style PFV fill:#fff4e1
        style PFE fill:#fff4e1
        style RAV fill:#d4edda
        style RBV fill:#d4edda
    

    Last reviewed commit: bc595c5

    @greptile-apps greptile-apps bot left a comment

    4 files reviewed, 5 comments

    @greptile-apps

    greptile-apps bot commented Feb 18, 2026

    Additional Comments (2)

    csrc/fusion.cpp
    Shared-container safety gap in rollback

    In removeStatementsCreatedAfter, statements are popped from the back of the container-global exprs_up_ deque (line 445), but only the current Fusion's per-fusion map is cleaned up (line 449: c->per_fusion_exprs_[this].erase(e)).

    In a shared container scenario where another Fusion has appended statements after the StatementGuard snapshot, the statement at the back of exprs_up_ could belong to the other Fusion. This code would:

    1. Destroy the unique_ptr (freeing the Expr memory)
    2. Remove it from exprs_ (global set)
    3. Only erase from per_fusion_exprs_[this] — leaving a dangling pointer in the other Fusion's per-fusion set

    The same issue applies to the vals_up_ loop below (line 467).

    This is safe for single-Fusion containers today, but since this PR's stated goal is enabling shared-container safety, the rollback path should also clean up the correct owner's per-fusion map — e.g., by looking up which Fusion actually owns each statement being removed.
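
One possible reading of the owner-lookup suggestion is sketched below, using hypothetical `Container` and `FusionId` stand-ins and a linear scan over owners as the lookup (a reverse map from statement to owner would be an alternative). It is not what the PR ended up doing.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <optional>
#include <unordered_map>
#include <unordered_set>

struct Val {
  int id;
};
using FusionId = int;  // hypothetical owner key; the PR uses Fusion pointers

struct Container {
  std::deque<std::unique_ptr<Val>> vals_up_;
  std::unordered_map<FusionId, std::unordered_set<Val*>> per_fusion_vals_;

  Val* create(FusionId owner, int id) {
    vals_up_.push_back(std::make_unique<Val>(Val{id}));
    Val* v = vals_up_.back().get();
    per_fusion_vals_[owner].insert(v);
    return v;
  }

  // Find which owner tracks `v`, if any (linear in the number of owners).
  std::optional<FusionId> ownerOf(Val* v) const {
    for (const auto& [owner, owned] : per_fusion_vals_) {
      if (owned.count(v) > 0) {
        return owner;
      }
    }
    return std::nullopt;
  }

  // Pop the tail and erase it from the set of whoever actually owns it,
  // so no per-owner set is left holding a dangling pointer.
  void popTail() {
    assert(!vals_up_.empty());
    Val* tail = vals_up_.back().get();
    if (auto owner = ownerOf(tail)) {
      per_fusion_vals_[*owner].erase(tail);
    }
    vals_up_.pop_back();
  }
};
```

This keeps the bookkeeping consistent but still destroys a statement that another Fusion may be using. The review fixes listed later on this page instead verify that tail elements belong to this Fusion before popping.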


    csrc/fusion.h
    numExprs/numVals still return container-global counts

    numExprs() and numVals() forward directly to the container's total counts, not per-Fusion counts. This is inconsistent with the fact that vals(), unordered_exprs(), and deterministic_* now return per-Fusion filtered results.

    More importantly, StatementGuard uses numExprs() and numVals() to snapshot statement counts and later passes them to removeStatementsCreatedAfter, which pops from the global deques. In a shared container where another Fusion adds statements concurrently, the snapshot will include the other Fusion's additions, and the rollback will remove them — destroying statements that don't belong to this Fusion.

    For single-Fusion containers this is fine, but these methods should be updated to return per-Fusion counts (or at least documented as container-global) before shared containers are actually used.

    Add per_fusion_vals_ / per_fusion_exprs_ maps to IrContainer so each
    Fusion can efficiently query only its own statements in a shared
    container. Fusion forwarding methods (vals(), unordered_exprs(),
    deterministic_vals(), etc.) now return per-Fusion filtered results.
    Fusion::clear() uses removeStatementsOwnedBy(this) instead of
    ir_container()->clear().
    @mdavis36 mdavis36 force-pushed the md/phase2-per-fusion branch from 8b162d9 to b8d202d on February 26, 2026 00:29
    @mdavis36

    !test

    ## Summary
    
    Review fixes for PR #5961 (Per-Fusion statement tracking):
    
    - **O(n²) → O(n)**: Optimize `removeStatementsOwnedBy` with `std::erase_if`
    - **Per-Fusion counts**: Convert `numExprs()`/`numVals()` to return per-Fusion counts instead of global
    - **StatementGuard fixes**: Snapshot and compare per-Fusion counts for correct LIFO rollback in shared containers
    - **LIFO assertions**: Verify tail elements belong to this Fusion before popping
    
    ## Tests
    
    All tests pass:
    - ✅ StatementGuardTest.ExecuteAfterGuard
    - ✅ StatementGuardTest.LazySpecialValsNotDangling
    - ✅ FusionCopy_CUDA
    - ✅ FusionMove_CUDA