Perf: accelerate feature_to_block with torch_scatter #302

AsymmetryChou · 2025-12-26T04:48:29Z

Introduce methods to pre-compute scatter indices for node and edge features, enabling efficient conversion to block matrices.
This enhancement improves performance by leveraging vectorized operations.

Summary by CodeRabbit

Refactor
- Optimized feature-to-block conversion logic using vectorized scatter operations for improved performance.
- Enhanced internal mapping utilities to streamline data processing workflows.


coderabbitai · 2025-12-26T04:56:14Z

Walkthrough

The changes refactor Hamiltonian feature-to-block conversion logic from explicit Python loops to vectorized scatter-based operations, with supporting precomputation methods added to OrbitalMapper for index mapping.
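
As a rough illustration of the loop-to-index refactoring described above, the sketch below assembles symmetric on-site blocks for a batch of atoms using precomputed index tensors. It uses plain PyTorch indexed assignment rather than torch_scatter, and the helper name `assemble_blocks`, the two-orbital layout, and the toy feature values are assumptions made for illustration; only the key names `src`, `dst`, and `dst_T` are taken from the snippets quoted in this review.

```python
import torch

def assemble_blocks(features, src, dst, dst_T, norb):
    """Place per-atom feature vectors into symmetric (norb, norb) blocks.

    features: (n_atoms, n_features) tensor
    src:      positions inside each feature vector
    dst:      flattened block positions for the upper triangle and diagonal
    dst_T:    flattened block positions for the mirrored (transposed) entries
    """
    n_atoms = features.shape[0]
    flat = features.new_zeros((n_atoms, norb * norb))
    flat[:, dst] = features[:, src]    # upper triangle and diagonal
    flat[:, dst_T] = features[:, src]  # mirror; diagonal gets the same value again
    return flat.reshape(n_atoms, norb, norb)

# Hypothetical layout: 2 orbitals, each feature vector holds [H00, H01, H11].
norb = 2
src = torch.tensor([0, 1, 2])
dst = torch.tensor([0 * norb + 0, 0 * norb + 1, 1 * norb + 1])
dst_T = torch.tensor([0 * norb + 0, 1 * norb + 0, 1 * norb + 1])

features = torch.tensor([[1.0, 2.0, 3.0],
                         [4.0, 5.0, 6.0]])
blocks = assemble_blocks(features, src, dst, dst_T, norb)
```

The index tensors depend only on the orbital layout, so under these assumptions they can be built once per species and reused for every atom of that species.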

Changes

Cohort / File(s)	Summary
Vectorized feature-to-block refactoring `dptb/data/interfaces/ham_to_feature.py`	Overhauled `feature_to_block` logic replacing per-atom and per-edge loops with scatter-based assembly. Introduced precomputed index mappings (node and edge). Removed `anglrMId` import. Added atom symbol precomputation and detailed docstrings describing vectorized workflow.
OrbitalMapper index mapping methods `dptb/data/transforms.py`	Added `get_node_feature_to_block_indices()` and `get_edge_feature_to_block_indices()` public methods to `OrbitalMapper` class. Each method precomputes and caches scatter-operation index mappings with src/dst tensors, diagonal indicators, and orbital counts per symbol or bond type.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: performance optimization of feature_to_block using torch_scatter-based vectorized operations.

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

dptb/data/interfaces/ham_to_feature.py (1)
420-437: Critical bug in self-loop block storage: wrong index logic in atom_i == atom_j case.

Lines 426-431 have inverted logic. When processing a self-loop edge (i, i, R_shift), the code checks if the reverse key (i, i, -R_shift) exists in blocks, then conditionally stores or accumulates—but always at the wrong index:

- If r_index doesn't exist: stores block at blocks[block_index] ✓
- If r_index exists: stores at blocks[r_index] instead of blocks[block_index] ✗

This means the block for (i, i, R_shift) is never properly stored; it either overwrites or gets incorrectly accumulated into the opposite R_shift entry. For Hermitian systems with periodic boundaries, self-loop blocks with different shifts corrupt each other.

Fix: Treat self-loops like other edges—always store/accumulate at the current edge's block_index:
elif atom_i == atom_j:
    if blocks.get(block_index, None) is None:
        blocks[block_index] = block
    else:
        blocks[block_index] += block

🧹 Nitpick comments (4)

dptb/data/transforms.py (2)
848-905: Consider specifying device for precomputed index tensors.

The precomputed index tensors are created on CPU by default. When used in feature_to_block, they're moved to the target device on every call (lines 373-377 in ham_to_feature.py). For better performance, consider either:

- Creating these tensors on self.device during precomputation, or
- Caching the device-specific versions after first use

Option 1: Create on self.device during precomputation
         self._node_feature_to_block_indices[symbol] = {
-            'src': torch.tensor(src_indices, dtype=torch.long),
-            'dst': torch.tensor(dst_indices, dtype=torch.long),
-            'dst_T': torch.tensor(dst_indices_T, dtype=torch.long),
-            'is_diag': torch.tensor(is_diag, dtype=torch.bool),
+            'src': torch.tensor(src_indices, dtype=torch.long, device=self.device),
+            'dst': torch.tensor(dst_indices, dtype=torch.long, device=self.device),
+            'dst_T': torch.tensor(dst_indices_T, dtype=torch.long, device=self.device),
+            'is_diag': torch.tensor(is_diag, dtype=torch.bool, device=self.device),
             'norb': norb
         }

907-969: Consider specifying device for precomputed index tensors.

Similar to the node indices, the edge index tensors are created on CPU by default and moved to device on every call. Consider creating them on self.device during precomputation for better performance.
Proposed fix
         self._edge_feature_to_block_indices[bond_type] = {
-            'src': torch.tensor(src_indices, dtype=torch.long),
-            'dst': torch.tensor(dst_indices, dtype=torch.long),
-            'scale': torch.tensor(scale_factors, dtype=torch.float32),
+            'src': torch.tensor(src_indices, dtype=torch.long, device=self.device),
+            'dst': torch.tensor(dst_indices, dtype=torch.long, device=self.device),
+            'scale': torch.tensor(scale_factors, dtype=torch.float32, device=self.device),
             'norb_i': norb_i,
             'norb_j': norb_j
         }

dptb/data/interfaces/ham_to_feature.py (2)
362-365: Optimize atom symbol lookup to avoid repeated single-element untransform calls.

The current implementation calls idp.untransform() for each atom individually within a list comprehension. Since untransform supports batch operations, you can compute all symbols in one call and then convert to chemical symbols.
Proposed optimization
     # Pre-compute atom symbols for all atoms (vectorized lookup)
     atom_types = data[_keys.ATOM_TYPE_KEY]
+    atomic_numbers = idp.untransform(atom_types)
     atom_symbols = [
-        ase.data.chemical_symbols[idp.untransform(atom_types[i].reshape(-1))]
+        ase.data.chemical_symbols[int(atomic_numbers[i])]
         for i in range(len(atom_types))
     ]

373-377: Index device movement happens on every call.

The precomputed indices are moved to the target device on every feature_to_block call for each symbol/bond type. While the comment at line 372 says "only on first use per symbol," the implementation doesn't cache device-specific versions. This could impact performance when processing multiple batches.

Consider implementing device-specific caching or creating indices on the target device during precomputation (as suggested in the transforms.py review).

Also applies to: 408-412
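
One way the device-specific caching mentioned in this comment could look is sketched below. The class `DeviceIndexCache` and the toy index dictionary are hypothetical and do not appear in the PR; the sketch only shows the general pattern of moving each index dictionary to a device once and reusing the moved copy afterwards.

```python
import torch

class DeviceIndexCache:
    """Keep one device-resident copy of each precomputed index dictionary."""

    def __init__(self, cpu_indices):
        self._cpu = cpu_indices  # {key: {name: tensor or plain value}}
        self._cache = {}         # {(key, device): moved dictionary}

    def get(self, key, device):
        device = torch.device(device)
        cache_key = (key, device)
        if cache_key not in self._cache:
            # Move tensors once; plain values such as orbital counts pass through.
            self._cache[cache_key] = {
                name: value.to(device) if torch.is_tensor(value) else value
                for name, value in self._cpu[key].items()
            }
        return self._cache[cache_key]

# Hypothetical indices for a single species.
cpu_indices = {
    "Si": {"src": torch.tensor([0, 1, 2]), "dst": torch.tensor([0, 1, 3]), "norb": 2}
}
cache = DeviceIndexCache(cpu_indices)
first = cache.get("Si", "cpu")
second = cache.get("Si", "cpu")  # served from the cache, no second transfer
```

Keying on the pair (species or bond type, device) keeps the lookup inside the loop cheap while still supporting runs that switch devices.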

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 63b9ccd and 904a27d.

📒 Files selected for processing (2)

dptb/data/interfaces/ham_to_feature.py
dptb/data/transforms.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build

🔇 Additional comments (5)

dptb/data/transforms.py (2)

845-846: LGTM!

The blank line improves readability by visually separating method definitions.

947-959: Clarify the scale factor logic and comment inconsistency.

The comment on line 929 states "0.5 for diagonal pairs," but line 947 checks is_same_basis (whether the basis pair is identical), not whether an element is on the matrix diagonal. These are not equivalent. Additionally, the edge version uses scale factors to handle symmetry, whereas the node version uses explicit transposed indices—the rationale for this design choice and the specific 0.5 factor for same basis pairs should be documented.

dptb/data/interfaces/ham_to_feature.py (3)

9-9: LGTM!

Correctly removed unused anglrMId import after refactoring to scatter-based operations. The angular momentum handling is now encapsulated in the precomputed index methods.

323-335: LGTM!

Clear and comprehensive docstring that explains the vectorized approach and performance benefits.

391-391: No issues found. The block_index format is consistent with the auto-detection logic in block_to_feature. Line 391 writes 0-indexed blocks ([atom, atom, 0, 0, 0]), and the start_id logic at lines 50–58 correctly detects this format and retrieves blocks with matching indices.

Copilot

Pull request overview

This PR optimizes the feature_to_block function by introducing pre-computed scatter indices for vectorized operations. The optimization replaces nested Python loops with efficient PyTorch scatter operations, improving performance when converting feature vectors to Hamiltonian/overlap block matrices.

Key changes:

- Added two new caching methods (get_node_feature_to_block_indices and get_edge_feature_to_block_indices) to pre-compute scatter index mappings
- Refactored feature_to_block to use vectorized scatter operations instead of nested loops
- Removed unused anglrMId import from ham_to_feature.py

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File	Description
dptb/data/transforms.py	Added `get_node_feature_to_block_indices` and `get_edge_feature_to_block_indices` methods to pre-compute and cache scatter indices for vectorized block matrix construction
dptb/data/interfaces/ham_to_feature.py	Refactored `feature_to_block` function to use pre-computed scatter indices with vectorized operations; removed unused import; added comprehensive docstring


Copilot · 2025-12-26T06:22:38Z

dptb/data/interfaces/ham_to_feature.py

+            # Move indices to correct device (only on first use per symbol)
+            src_idx = idx_info['src'].to(device)
+            dst_idx = idx_info['dst'].to(device)
+            dst_idx_T = idx_info['dst_T'].to(device)
+            is_diag = idx_info['is_diag'].to(device)
+            norb = idx_info['norb']


The indices are being transferred to the device on every iteration, even for atoms of the same symbol. This creates redundant device transfers. Consider caching the device-transferred indices per symbol to avoid repeated transfers for atoms of the same type. The same issue exists in the edge processing loop at lines 407-412.

Copilot · 2025-12-26T06:22:38Z

dptb/data/interfaces/ham_to_feature.py

+            # Move indices to correct device
+            src_idx = idx_info['src'].to(device)
+            dst_idx = idx_info['dst'].to(device)
+            scale = idx_info['scale'].to(device=device, dtype=dtype)
+            norb_i = idx_info['norb_i']
+            norb_j = idx_info['norb_j']


The indices are being transferred to the device on every iteration, even for edges with the same bond type. This creates redundant device transfers. Consider caching the device-transferred indices per bond type to avoid repeated transfers.

Copilot · 2025-12-26T06:22:38Z

dptb/data/interfaces/ham_to_feature.py

+            symbol_i = atom_symbols[atom_i]
+            symbol_j = atom_symbols[atom_j]
+            bond_type = f"{symbol_i}-{symbol_j}"
+


Missing error handling for KeyError when bond_type is not found in edge_indices. If a bond type exists in the data but wasn't pre-computed (e.g., due to dynamic data), this will raise an unhelpful KeyError. Consider adding a check with a descriptive error message, or ensure the pre-computation covers all possible bond types from the data.

Suggested change

+            if bond_type not in edge_indices:
+                available = ", ".join(sorted(map(str, edge_indices.keys())))
+                msg = (
+                    f"Missing precomputed edge indices for bond type '{bond_type}'. "
+                    f"Encountered edge between atoms {atom_i} ({symbol_i}) and "
+                    f"{atom_j} ({symbol_j}) with cell shift {list(map(int, R_shift))}. "
+                    f"Available bond types in edge_indices: {available if available else 'none'}."
+                )
+                log.error(msg)
+                raise KeyError(msg)

Copilot · 2025-12-26T06:22:39Z

dptb/data/interfaces/ham_to_feature.py

-                    block[slice_i, slice_j] = block_ij
-                    if slice_i != slice_j:
-                        block[slice_j, slice_i] = block_ij.T
+            # Move indices to correct device (only on first use per symbol)


The comment "only on first use per symbol" is misleading since the indices are actually transferred to the device on every iteration through the loop, not just the first time each symbol is encountered. Either the comment should be updated to reflect the actual behavior, or the code should be optimized to cache device-transferred indices per symbol as the comment suggests.

Suggested change

-            # Move indices to correct device (only on first use per symbol)
+            # Move indices to the correct device for this atom

Copilot · 2025-12-26T06:22:39Z

dptb/data/transforms.py

+            self._edge_feature_to_block_indices[bond_type] = {
+                'src': torch.tensor(src_indices, dtype=torch.long),
+                'dst': torch.tensor(dst_indices, dtype=torch.long),
+                'scale': torch.tensor(scale_factors, dtype=torch.float32),


The scale tensor is created with dtype=torch.float32 hardcoded, but it should match the dtype of the edge features to avoid potential type conversion issues or precision mismatches. Consider using the same dtype as the feature vectors, or at least the default dtype from the configuration.

Suggested change

-                'scale': torch.tensor(scale_factors, dtype=torch.float32),
+                'scale': torch.tensor(scale_factors, dtype=torch.get_default_dtype()),
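
The difference between the two spellings can be checked with a short sketch. The scale values here are illustrative, and switching the default dtype to float64 is an assumption about how a double-precision run might be configured, not something stated in the PR.

```python
import torch

scale_factors = [1.0, 0.5]  # illustrative values only

previous = torch.get_default_dtype()
torch.set_default_dtype(torch.float64)  # assumed double-precision configuration
try:
    # Hardcoded dtype ignores the configured default.
    hardcoded = torch.tensor(scale_factors, dtype=torch.float32)
    # Querying the default at construction time follows it.
    follows_default = torch.tensor(scale_factors, dtype=torch.get_default_dtype())
finally:
    torch.set_default_dtype(previous)  # restore the global setting
```

Note that the edge-processing snippet quoted earlier already casts `scale` with `.to(device=device, dtype=dtype)`, so in that code path the hardcoded float32 would mainly cost an extra cast rather than change the result dtype.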

Perf: accelerate feature_to_block with torch_scatter

904a27d

coderabbitai bot reviewed Dec 26, 2025

AsymmetryChou requested review from Copilot and floatingCatty December 26, 2025 04:58

Copilot started reviewing on behalf of AsymmetryChou December 26, 2025 06:19

Copilot AI reviewed Dec 26, 2025
