
Conversation

acgetchell
Owner

This pull request introduces major improvements to the CI benchmarking and profiling infrastructure, focusing on maintainability, modularity, and comprehensive performance analysis. The main changes are the refactoring of workflow scripts to use a centralized Python utility (benchmark-utils), the addition of a new profiling workflow for in-depth performance and memory analysis, and updates to coverage and benchmarking configuration for improved accuracy and reporting.

Workflow Refactoring and Benchmark Automation

  • Refactored .github/workflows/benchmarks.yml to replace inline Bash scripts with calls to the Python utility benchmark-utils, streamlining baseline preparation, status handling, commit extraction, benchmark skipping logic, regression testing, and summary generation. This makes the workflow easier to maintain and extend. [1] [2]
  • Updated .github/workflows/generate-baseline.yml to use benchmark-utils for tag determination, baseline generation, metadata creation, artifact name sanitization, and summary display, removing manual Bash commands and dependency installation. [1] [2] [3]

Profiling and Memory Analysis

  • Added a new workflow .github/workflows/profiling-benchmarks.yml for comprehensive profiling benchmarks and memory stress testing. This workflow supports manual, scheduled, and release-triggered runs, and provides detailed summaries and artifact uploads for profiling data.

Configuration and Coverage

  • Updated .codecov.yml to ignore benchmark files in coverage reports, ensuring more accurate code coverage metrics.
  • Modified Cargo.toml to register the new profiling_suite benchmark and configure tarpaulin to exclude benchmarks and examples from coverage analysis.

Documentation Cleanup

  • Removed outdated Python code quality improvement notes from WARP.md to keep documentation focused and relevant.

Streamlines the benchmark workflow by replacing complex Bash scripts with Python utility functions. This simplifies maintenance, improves code readability, and reduces the risk of errors.

The changes cover baseline generation, comparison, commit extraction, skip determination, and result display within the GitHub Actions workflows.
This commit introduces a comprehensive profiling suite for in-depth performance analysis and a separate memory stress test job.

The profiling suite includes:
- Large-scale triangulation performance analysis (10³-10⁶ points)
- Multiple point distributions (random, grid, Poisson disk)
- Memory allocation tracking (with `count-allocations` feature)
- Query latency analysis
- Multi-dimensional scaling (2D-5D)
- Algorithmic bottleneck identification

The suite is integrated into GitHub Actions with scheduled runs and manual triggering, and uploads profiling results and baselines as artifacts.

The memory stress test runs independently to exercise allocation APIs and memory scaling under load.
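
To make the structure of the suite concrete, here is a minimal Criterion sketch of one scaling group. It is an illustration only: `generate_points` and `build_triangulation` are hypothetical stand-ins for the crate's point generators and triangulation construction, and the sizes and timings are placeholders rather than the values used in `benches/profiling_suite.rs`.

```rust
use std::time::Duration;

use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};

// Hypothetical stand-ins for the crate's point generation and triangulation
// construction; the real suite uses the library's generators and Tds types.
fn generate_points(n: usize) -> Vec<[f64; 3]> {
    (0..n)
        .map(|i| [i as f64, ((i * 7) % 101) as f64, ((i * 13) % 97) as f64])
        .collect()
}

fn build_triangulation(points: &[[f64; 3]]) -> usize {
    // Placeholder "work" so the sketch compiles and runs on its own.
    points.len()
}

fn large_scale_3d_scaling(c: &mut Criterion) {
    let mut group = c.benchmark_group("profiling_3d_scaling");
    // Large inputs need fewer samples and a longer measurement window.
    group.sample_size(10);
    group.measurement_time(Duration::from_secs(30));

    for &count in &[1_000usize, 10_000, 100_000] {
        group.throughput(Throughput::Elements(count as u64));
        group.bench_with_input(BenchmarkId::from_parameter(count), &count, |b, &n| {
            let points = generate_points(n);
            b.iter(|| build_triangulation(&points));
        });
    }
    group.finish();
}

criterion_group!(profiling, large_scale_3d_scaling);
criterion_main!(profiling);
```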

It also adds "benches/**" to the Codecov ignore list and documents the profiling suite in README.md.
@Copilot Copilot AI review requested due to automatic review settings September 6, 2025 00:06

coderabbitai bot commented Sep 6, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Caution

Review failed

The pull request is closed.

Walkthrough

Refactors benchmarking workflows to a Python CLI (benchmark-utils), adds a comprehensive profiling GitHub Actions workflow and a Criterion profiling bench, introduces high-performance collection types and replacements across core triangulation code, adds grid/Poisson point generators, expands CLI tests, updates Cargo and coverage exclusions, and adjusts docs/READMEs.

Changes

Cohort / File(s) Summary of changes
Coverage config
.codecov.yml
Add benches/** to coverage.ignore.
Workflows — benchmarks / baseline
.github/workflows/benchmarks.yml, .github/workflows/generate-baseline.yml
Replace inline Bash with uv run benchmark-utils CLI steps for baseline preparation, tag determination, skip logic, extraction, regression runs, results display, and summary; remove jq and many manual shell checks.
New workflow — profiling
.github/workflows/profiling-benchmarks.yml
Add scheduled/manual profiling workflow with profiling and memory-stress jobs, artifact upload, tag baseline handling, and profiling-summary generation.
Benchmarking CLI
scripts/benchmark_utils.py, scripts/tests/test_benchmark_utils.py
Add WorkflowHelper and BenchmarkRegressionHelper classes, command dispatcher, GitHub Actions env/output integration; add comprehensive unit tests for new helpers and flows.
Profiling bench & docs
benches/profiling_suite.rs, benches/README.md
Add comprehensive Criterion profiling_suite, document profiling vs CI suites, triggers, artifacts, and usage.
Cargo / coverage tooling
Cargo.toml
Add profiling_suite bench, bump clap/uuid, add dependencies (fxhash, smallvec) and [tool.tarpaulin] excludes.
Core collections & refactors
src/core/collections.rs, src/core/triangulation_data_structure.rs, src/lib.rs
Add high-performance collection type aliases (FastHashMap/Set, SmallBuffer) and helpers; refactor triangulation internals to use them; change build_facet_to_cells_hashmap return type and expose core::collections module (a rough sketch of these aliases follows this table).
Geometry utilities
src/geometry/util.rs, src/geometry/predicates.rs
Add generate_grid_points and generate_poisson_points with tests; add #[inline] to simplex_orientation and insphere.
Convex hull adjustments
src/geometry/algorithms/convex_hull.rs
Switch internal facet-to-cells cache to FacetToCellsMap and replace some HashMap usages with FastHashMap.
Examples/tests cleanup
examples/*.rs
Remove example-local #[cfg(test)] test modules from the two example files.
Spellings
cspell.json
Add words ahash and seahash.
Docs
WARP.md, docs/code_organization.md
Add TDD guidelines, file organization guidance, expanded util.rs docs; remove a Python quality subsection.
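
As a rough illustration of the "Core collections & refactors" row above, aliases like these could be built directly on the new fxhash and smallvec dependencies. The definitions below are an assumption about what `src/core/collections.rs` might contain (only the alias names come from this walkthrough), not the actual implementation:

```rust
use fxhash::{FxHashMap, FxHashSet};
use smallvec::SmallVec;

/// Hash map using the fast, non-cryptographic FxHash hasher.
pub type FastHashMap<K, V> = FxHashMap<K, V>;
/// Companion set type.
pub type FastHashSet<T> = FxHashSet<T>;
/// Buffer that stays on the stack up to N elements, then spills to the heap.
pub type SmallBuffer<T, const N: usize> = SmallVec<[T; N]>;

fn main() {
    // Typical facet-to-cells usage: most facets touch only a few cells,
    // so a small inline buffer avoids a heap allocation per entry.
    let mut facet_to_cells: FastHashMap<u64, SmallBuffer<usize, 4>> = FastHashMap::default();
    facet_to_cells.entry(0xface).or_default().push(7);
    assert_eq!(facet_to_cells[&0xface].len(), 1);
}
```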

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor GH as GitHub Actions
  participant CLI as benchmark-utils (uv)
  participant Repo as Repository
  participant Art as Artifacts

  GH->>CLI: prepare-baseline
  CLI->>Repo: check baseline files
  CLI-->>GH: set BASELINE_* env

  GH->>CLI: extract-baseline-commit
  CLI->>Repo: read baseline_results / metadata
  CLI-->>GH: BASELINE_COMMIT

  GH->>CLI: determine-skip --baseline-commit --current-commit
  CLI->>Repo: check git for relevant changes
  alt skip
    CLI-->>GH: SKIP with reason
    GH->>CLI: display-skip-message
  else run
    GH->>CLI: run-regression-test --baseline path
    CLI->>Repo: run comparison, write compare_results.txt
    CLI-->>GH: status
    GH->>CLI: display-results
    GH->>Art: upload compare_results.txt
  end

  GH->>CLI: generate-summary
  CLI-->>GH: summary text
sequenceDiagram
  autonumber
  actor GH as GitHub Actions
  participant CLI as benchmark-utils (uv)
  participant Art as Artifacts

  GH->>CLI: determine-tag
  CLI-->>GH: TAG_NAME (GITHUB_OUTPUT)

  GH->>CLI: generate-baseline --tag TAG_NAME --output baseline-artifact/...
  CLI-->>GH: baseline file created

  GH->>CLI: create-metadata --tag TAG_NAME
  CLI-->>GH: metadata.json

  GH->>CLI: sanitize-artifact-name --tag TAG_NAME
  CLI-->>GH: artifact_name

  GH->>Art: upload baseline-artifact/baseline-TAG_NAME.txt
sequenceDiagram
  autonumber
  actor GH as GitHub Actions
  participant Cargo as Rust/cargo
  participant Crit as Criterion benches
  participant Art as Artifacts

  GH->>Cargo: cargo build --release --bench profiling_suite
  GH->>Cargo: cargo bench --bench profiling_suite [--features count-allocations] [--filter]
  Cargo->>Crit: run profiling groups (scaling, memory, queries, bottlenecks)
  Crit-->>Cargo: generate logs & criterion outputs
  Cargo-->>GH: profiling logs + artifact dirs
  opt tag push
    GH->>Art: upload profiling-baseline-<tag> (long retention)
  end
  GH->>Art: upload profiling-results (standard retention)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Possibly related PRs

Poem

I hop through code with tiny paws,
New benches hum, the profiler claws.
Grids and disks in ordered play,
A Python scribe to guide the day.
Baselines saved — I thump hooray! 🐇✨


📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8a5f176 and 547a45d.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • .github/workflows/profiling-benchmarks.yml (1 hunks)
  • Cargo.toml (3 hunks)
  • WARP.md (1 hunks)
  • benches/README.md (2 hunks)
  • cspell.json (2 hunks)
  • examples/convex_hull_3d_50_points.rs (0 hunks)
  • examples/triangulation_3d_50_points.rs (0 hunks)
  • src/core/collections.rs (1 hunks)
  • src/core/triangulation_data_structure.rs (15 hunks)
  • src/geometry/algorithms/convex_hull.rs (3 hunks)
  • src/geometry/predicates.rs (2 hunks)
  • src/lib.rs (1 hunks)


Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces major improvements to the CI benchmarking and profiling infrastructure, focusing on maintainability, modularity, and comprehensive performance analysis. The key changes consolidate inline Bash scripts into a centralized Python utility and add extensive profiling capabilities.

  • Refactored CI workflows to use the centralized benchmark-utils Python utility instead of inline Bash scripts
  • Added comprehensive profiling suite with large-scale benchmarking (10³-10⁶ points) and memory analysis capabilities
  • Enhanced point generation utilities with grid and Poisson disk sampling methods for more realistic benchmark scenarios
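
The call shapes below are taken from this PR's `benches/profiling_suite.rs`; the module paths and the exact meaning of the spacing, offset, and bounds arguments are assumptions, so treat this as a usage sketch rather than documented API:

```rust
use delaunay::geometry::point::Point;
use delaunay::geometry::util::{
    generate_grid_points, generate_poisson_points, generate_random_points_seeded,
};

fn main() {
    // Uniform random points in [-100, 100]^3 with a fixed seed (reproducible runs).
    let random: Vec<Point<f64, 3>> =
        generate_random_points_seeded(1_000, (-100.0, 100.0), 42).unwrap();

    // Regular grid: 10 points per axis with spacing 10.0, offset at the origin.
    let grid: Vec<Point<f64, 3>> = generate_grid_points(10, 10.0, [0.0; 3]).unwrap();

    // Poisson-disk sample: no two points closer than 8.0 inside the same bounds.
    let poisson: Vec<Point<f64, 3>> =
        generate_poisson_points(1_000, (-100.0, 100.0), 8.0, 42).unwrap();

    println!(
        "{} random, {} grid, {} poisson points",
        random.len(),
        grid.len(),
        poisson.len()
    );
}
```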

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/geometry/util.rs Added grid and Poisson disk point generation functions with comprehensive testing
scripts/benchmark_utils.py Added WorkflowHelper and BenchmarkRegressionHelper classes to centralize CI automation
scripts/tests/test_benchmark_utils.py Added extensive test coverage for new utility classes
benches/profiling_suite.rs New comprehensive profiling benchmark suite for optimization work
benches/README.md Updated documentation for new profiling capabilities and GitHub Actions integration
.github/workflows/profiling-benchmarks.yml New workflow for comprehensive profiling with manual and scheduled triggers
.github/workflows/benchmarks.yml Refactored to use Python utilities instead of inline Bash scripts
.github/workflows/generate-baseline.yml Simplified baseline generation using centralized utilities
Cargo.toml Registered new profiling suite benchmark and configured tarpaulin exclusions
.codecov.yml Excluded benchmark files from coverage analysis


attempts += 1;

// Generate candidate point
let coords = [T::zero(); D].map(|_| rng.random_range(bounds.0..bounds.1));

Copilot AI Sep 6, 2025


The array mapping operation creates and discards T::zero() values unnecessarily. Consider using std::iter::repeat_with() or a for loop to avoid creating unused zero values.
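
One way to realize that suggestion, assuming the surrounding function's `rng` and `(f64, f64)` bounds and the `random_range` API already used in the snippet above; `std::array::from_fn` fills each slot directly, so no placeholder zeros are created and discarded:

```rust
use rand::Rng;

/// Build candidate coordinates without initializing a throwaway zeroed array first.
fn candidate_coords<const D: usize>(rng: &mut impl Rng, bounds: (f64, f64)) -> [f64; D] {
    std::array::from_fn(|_| rng.random_range(bounds.0..bounds.1))
}
```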


return False, "invalid_baseline_sha"

commit_ref = f"{baseline_commit}^{{commit}}"
subprocess.run(["git", "cat-file", "-e", commit_ref], check=True, capture_output=True) # noqa: S603,S607

Copilot AI Sep 6, 2025


Direct subprocess calls with user-provided input pose security risks. The baseline_commit parameter should be validated more strictly beyond the regex check to prevent command injection, or use a Git library instead of subprocess calls.



# Check for relevant changes
diff_range = f"{baseline_commit}..HEAD"
result = subprocess.run(["git", "diff", "--name-only", diff_range], capture_output=True, text=True, check=True) # noqa: S603,S607

Copilot AI Sep 6, 2025


Direct subprocess calls with user-provided input pose security risks. The baseline_commit parameter should be validated more strictly beyond the regex check to prevent command injection, or use a Git library instead of subprocess calls.


clippy::cast_possible_truncation,
clippy::cast_sign_loss
)]
let points_per_dim = ((count as f64).powf(1.0 / D as f64).ceil() as usize).max(2);

Copilot AI Sep 6, 2025


Multiple type conversions and floating-point operations are inefficient for grid generation. Consider pre-computing common point counts or using integer-based calculations where possible.

Suggested change
let points_per_dim = ((count as f64).powf(1.0 / D as f64).ceil() as usize).max(2);
// Integer-based calculation: find smallest n such that n.pow(D) >= count
let mut points_per_dim = 2;
while points_per_dim.pow(D as u32) < count {
    points_per_dim += 1;
}


Returns:
Sanitized artifact name
"""
import re

Copilot AI Sep 6, 2025


The import re statement is inside the function, but re is already imported at the module level (line 19). Remove the redundant local import.

Suggested change
import re



codacy-production bot commented Sep 6, 2025

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation Diff coverage
+0.25% (target: -1.00%) 70.09%
Coverage variation details
Coverable lines Covered lines Coverage
Common ancestor commit (272d5b1) 2522 1754 69.55%
Head commit (547a45d) 2609 (+87) 1821 (+67) 69.80% (+0.25%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
Coverable lines Covered lines Diff coverage
Pull request (#67) 107 75 70.09%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%



codecov bot commented Sep 6, 2025

Codecov Report

❌ Patch coverage is 78.94737% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.44%. Comparing base (272d5b1) to head (547a45d).

Files with missing lines Patch % Lines
src/geometry/util.rs 76.92% 15 Missing ⚠️
src/core/collections.rs 60.00% 4 Missing ⚠️
src/core/triangulation_data_structure.rs 94.73% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #67      +/-   ##
==========================================
+ Coverage   69.60%   70.44%   +0.84%     
==========================================
  Files          18       17       -1     
  Lines        2520     2585      +65     
==========================================
+ Hits         1754     1821      +67     
+ Misses        766      764       -2     
Flag Coverage Δ
unittests 70.44% <78.94%> (+0.84%) ⬆️



coderabbitai bot added the documentation (Improvements or additions to documentation) and rust (Pull requests that update rust code) labels on Sep 6, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

🧹 Nitpick comments (26)
src/geometry/util.rs (2)

1391-1457: Validate min_distance before sampling

Add basic checks that min_distance is finite and > 0 to fail fast on invalid inputs.

Apply:

-    // Validate bounds
+    // Validate bounds
     if bounds.0 >= bounds.1 {
         return Err(RandomPointGenerationError::InvalidRange {
             min: format!("{:?}", bounds.0),
             max: format!("{:?}", bounds.1),
         });
     }
+
+    // Validate spacing
+    if !(min_distance > T::zero()) {
+        return Err(RandomPointGenerationError::RandomGenerationFailed {
+            min: format!("{:?}", bounds.0),
+            max: format!("{:?}", bounds.1),
+            details: "min_distance must be > 0".to_string(),
+        });
+    }

3616-3629: Use approx assertions for float equality in tests

Prefer assert_relative_eq! for readability and consistent failure messages.

Apply:

-        // Use approx for floating point comparison
-        for (actual, expected) in coords.iter().zip([0.0, 0.0, 0.0].iter()) {
-            assert!((actual - expected).abs() < 1e-15);
-        }
+        use approx::assert_relative_eq;
+        for (actual, expected) in coords.iter().zip([0.0, 0.0, 0.0].iter()) {
+            assert_relative_eq!(*actual, *expected, epsilon = 1e-15);
+        }
scripts/benchmark_utils.py (4)

702-715: Avoid re-importing re inside function

re is already imported at module scope; drop the inner import.

Apply:

-        import re
-
         # Replace any non-alphanumeric characters (except . _ -) with underscore
         clean_name = re.sub(r"[^a-zA-Z0-9._-]", "_", tag_name)

Rename loop variable for clarity

Use filename instead of file to avoid confusion and improve readability.

Apply:

-            changed_files = result.stdout.strip().split("\n") if result.stdout.strip() else []
-
-            has_relevant_changes = any(re.match(pattern, file) for file in changed_files for pattern in relevant_patterns)
+            changed_files = result.stdout.strip().split("\n") if result.stdout.strip() else []
+            has_relevant_changes = any(
+                re.match(pattern, filename) for filename in changed_files for pattern in relevant_patterns
+            )

882-888: Expand skip message coverage

Handle other reasons (e.g., invalid_baseline_sha, baseline_commit_not_found) for more informative logs.

Apply:

         messages = {
             "same_commit": f"🔍 Current commit matches baseline ({baseline_commit}); skipping benchmarks.",
             "no_relevant_changes": f"🔍 No relevant code changes since {baseline_commit}; skipping benchmarks.",
+            "invalid_baseline_sha": "⚠️ Baseline SHA format invalid; running benchmarks.",
+            "baseline_commit_not_found": "⚠️ Baseline commit not in history (force-push/shallow clone?); running benchmarks.",
+            "unknown_baseline": "⚠️ Unknown baseline; running benchmarks.",
         }

734-752: Set BASELINE_TAG env when available

prepare_baseline prints metadata but doesn’t export the tag, yet generate-summary expects BASELINE_TAG. Parse metadata.json and export it if present.

Apply:

         if baseline_file.exists():
             print("📦 Prepared baseline from artifact")
 
             # Set GitHub Actions environment variables
             github_env = os.getenv("GITHUB_ENV")
             if github_env:
                 with open(github_env, "a", encoding="utf-8") as f:
                     f.write("BASELINE_EXISTS=true\n")
                     f.write("BASELINE_SOURCE=artifact\n")
+                    # Try to load tag from metadata.json
+                    try:
+                        meta_path = baseline_dir / "metadata.json"
+                        if meta_path.exists():
+                            meta = json.loads(meta_path.read_text(encoding="utf-8"))
+                            tag = meta.get("tag", "")
+                            if tag:
+                                f.write(f"BASELINE_TAG={tag}\n")
+                                f.write("BASELINE_ORIGIN=metadata\n")
+                    except Exception:
+                        pass
scripts/tests/test_benchmark_utils.py (7)

747-840: Solid metadata coverage; consider asserting unknown fallbacks once.

These tests thoroughly cover happy-path and SAFE_ fallbacks. Optionally add one assertion that missing envs produce “unknown” (already done below) to avoid redundancy.


841-911: Prefer capsys over patching print for output assertions.

pytest’s capsys captures stdout/stderr without monkey-patching builtins, reducing flakiness and preserving print signatures.

Example:

def test_display_baseline_summary_success(capsys, tmp_path):
    # ... write baseline_file ...
    assert WorkflowHelper.display_baseline_summary(baseline_file)
    out = capsys.readouterr().out
    assert "📊 Baseline summary:" in out
    assert "Total benchmarks: 3" in out

912-952: Add empty/long tag edge cases for sanitizer.

Consider tests for empty tag ("") and very long tags to ensure stable, non-empty artifact names and no OS path issues.

Example:

def test_sanitize_artifact_name_empty():
    assert WorkflowHelper.sanitize_artifact_name("") == "performance-baseline-"

def test_sanitize_artifact_name_very_long():
    tag = "x"*256
    name = WorkflowHelper.sanitize_artifact_name(tag)
    assert name.startswith("performance-baseline-")
    assert len(name) == len("performance-baseline-") + 256

1138-1176: Cover additional skip branches.

Add tests for:

  • invalid baseline sha → ("invalid_baseline_sha")
  • generic exception during change check → ("error_checking_changes")

Example snippets:

def test_determine_benchmark_skip_invalid_sha():
    assert BenchmarkRegressionHelper.determine_benchmark_skip("zzz", "deadbeef") == (False, "invalid_baseline_sha")

@patch("subprocess.run", side_effect=Exception("boom"))
def test_determine_benchmark_skip_error(mock_run):
    assert BenchmarkRegressionHelper.determine_benchmark_skip("abc1234", "def5678") == (False, "error_checking_changes")

1194-1228: Also assert the “regression found” happy-path still returns success.

Add a variant where compare_with_baseline returns (True, True) to ensure run_regression_test still returns True.

mock_comparator.compare_with_baseline.return_value = (True, True)
assert BenchmarkRegressionHelper.run_regression_test(baseline_file)

1253-1312: Exercise “results file missing” branch when not skipped.

Add a test with BASELINE_EXISTS=true and SKIP_BENCHMARKS=false but no results file to hit “❓ … no results file found”.

with patch.dict(os.environ, {"BASELINE_EXISTS":"true","SKIP_BENCHMARKS":"false"}), patch("builtins.print") as p:
    BenchmarkRegressionHelper.generate_summary()
    assert any("no results file" in c.args[0] for c in p.call_args_list)

24-30: Use package-qualified imports for benchmark_utils in tests: In scripts/tests/test_benchmark_utils.py (lines 24–30), change

from benchmark_utils import

to

from scripts.benchmark_utils import

and update any @patch decorators (e.g. @patch("scripts.benchmark_utils.datetime")).

docs/code_organization.md (3)

62-70: Include new profiling bench in directory tree.

The benches list omits profiling_suite.rs added in this PR.

Proposed insertion:

 │   ├── microbenchmarks.rs                        # Fine-grained performance tests
 │   ├── triangulation_creation.rs                 # Triangulation creation benchmarks
+│   ├── profiling_suite.rs                        # Comprehensive profiling and memory stress suite
 │   └── triangulation_vs_hull_memory.rs           # Memory comparison benchmarks

103-111: List the new profiling workflow.

Add .github/workflows/profiling-benchmarks.yml to keep CI docs in sync.

 │   │   ├── generate-baseline.yml                 # Automated performance baseline generation on releases
+│   │   ├── profiling-benchmarks.yml              # Profiling and memory benchmarks
 │   │   └── rust-clippy.yml                       # Additional clippy analysis

574-581: Minor copy edits for clarity/consistency.

Hyphenate compounds and remove trailing double-spaces.

-#### `util.rs` (3,806 lines)
-
-- Function-focused (not struct-focused)  
-- Comprehensive test coverage with systematic multi-dimensional testing
-- Point generation utilities (random, grid, and Poisson disk sampling)
+- #### `util.rs` (3,806 lines)
+  - Function-focused (not struct-focused)
+  - Comprehensive test coverage with systematic multi-dimensional testing
+  - Point-generation utilities (random, grid, and Poisson‑disk sampling)
   - Function categorization by geometric operations (circumcenter, facet measure, surface measure, etc.)
-  - Multi-dimensional testing across 2D-5D with both f32 and f64 coordinate types
-  - Extensive edge case testing and error handling validation
+  - Multi‑dimensional testing across 2D–5D with both f32 and f64 coordinate types
+  - Extensive edge‑case testing and error‑handling validation
.github/workflows/generate-baseline.yml (1)

100-111: Use the sanitized artifact name in the summary for consistency.

You upload with the sanitized name; reflect the same in the “Display next steps” section.

-          echo "  Artifact: performance-baseline-$TAG_NAME"
+          echo "  Artifact: ${{ steps.safe_name.outputs.artifact_name }}"
benches/README.md (3)

47-57: Wording: “decades” → “orders of magnitude.”

Replace “across multiple decades” with “across multiple orders of magnitude” to avoid ambiguity.

-- **Large-scale triangulation performance** (10³ to 10⁶ points across multiple decades)
+- **Large-scale triangulation performance** (10³ to 10⁶ points across multiple orders of magnitude)

61-66: Clarify feature flag requirement.

Explicitly note that memory tracking requires building benches with the feature.

-# Run all benchmarks with memory tracking
-cargo bench --features count-allocations
+# Run all benchmarks with memory tracking (requires feature flag)
+cargo bench --features count-allocations

423-455: Minor copyedits for consistency and readability.

Hyphenate “run-time” consistently and remove double spaces; no functional impact.

Example edits:

  • “~5-10 minutes ” -> “~5–10 minutes”
  • “1-2 hours (full production mode)” -> “1–2 hours (full production mode)”
.github/workflows/profiling-benchmarks.yml (1)

115-123: Avoid brittle expression logic in heredoc.

Prefer a single shell-evaluated MODE variable to avoid expression edge cases.

-        **Mode**: ${{ env.PROFILING_DEV_MODE && 'Development' || 'Production' }}
+        **Mode**: ${MODE:-Production}
...
-        - **Profiling Mode**: ${{ env.PROFILING_DEV_MODE && 'Development (reduced scale for faster iteration)' || 'Production (full 10³-10⁶ point scale)' }}
+        - **Profiling Mode**: ${MODE_DESC:-Production (full 10³-10⁶ point scale)}

Prepend in an earlier run step:

if [[ -n "${PROFILING_DEV_MODE:-}" ]]; then
  echo "MODE=Development" >> "$GITHUB_ENV"
  echo "MODE_DESC=Development (reduced scale for faster iteration)" >> "$GITHUB_ENV"
else
  echo "MODE=Production" >> "$GITHUB_ENV"
  echo "MODE_DESC=Production (full 10³-10⁶ point scale)" >> "$GITHUB_ENV"
fi
benches/profiling_suite.rs (5)

6-12: Doc wording nit: “decades” → “orders of magnitude.”

Avoid ambiguity; you mean scale, not time.

-//! 1. **Large-scale triangulation performance** (10³ to 10⁶ points)
+//! 1. **Large-scale triangulation performance** (10³ to 10⁶ points, across multiple orders of magnitude)

129-137: Trim grid overshoot to requested count.

generate_grid_points(points_per_dim^D) can substantially exceed count; truncate to avoid unnecessary memory/time.

-            generate_grid_points(points_per_dim, 10.0, [0.0; D]).unwrap()
+            {
+                let mut pts = generate_grid_points(points_per_dim, 10.0, [0.0; D]).unwrap();
+                if pts.len() > count { pts.truncate(count); }
+                pts
+            }

155-194: Use per-group sample sizes to bound wall-clock time.

These long-running groups already extend measurement_time; also reduce sample_size locally to keep runs predictable.

Example:

group.sample_size(10); // or even 5 in production mode

Place after group creation for 2D/3D/4D groups.

Also applies to: 196-229, 237-265


333-371: Only print allocation summaries when meaningful.

Without count-allocations, summaries are zeros; gate on non-zero peak.

-                        print_alloc_summary(&avg_info, "2D Triangulation", count);
+                        if avg_info.bytes_max > 0 {
+                            print_alloc_summary(&avg_info, "2D Triangulation", count);
+                        }

Apply same check to the 3D block.

Also applies to: 405-443


489-527: Bound query count by cells to avoid quadratic blow-ups on dense meshes.

Cap iterated cells (e.g., take first N) in addition to the 1000-result cap to keep per-iteration work stable.

-                    for cell in tds.cells() {
+                    for cell in tds.cells().take(250) {
📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 272d5b1 and 8a5f176.

📒 Files selected for processing (12)
  • .codecov.yml (1 hunks)
  • .github/workflows/benchmarks.yml (2 hunks)
  • .github/workflows/generate-baseline.yml (2 hunks)
  • .github/workflows/profiling-benchmarks.yml (1 hunks)
  • Cargo.toml (1 hunks)
  • WARP.md (1 hunks)
  • benches/README.md (2 hunks)
  • benches/profiling_suite.rs (1 hunks)
  • docs/code_organization.md (1 hunks)
  • scripts/benchmark_utils.py (4 hunks)
  • scripts/tests/test_benchmark_utils.py (3 hunks)
  • src/geometry/util.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
scripts/*.py

⚙️ CodeRabbit configuration file

scripts/*.py: These are Python utility scripts for changelog and commit processing.
Focus on code quality, maintainability, and adherence to Python best practices.
The scripts use ruff for comprehensive linting and formatting (pylint has been retired).

Files:

  • scripts/benchmark_utils.py
🧠 Learnings (1)
📚 Learning: 2025-09-04T20:03:49.859Z
Learnt from: acgetchell
PR: acgetchell/delaunay#65
File: WARP.md:249-254
Timestamp: 2025-09-04T20:03:49.859Z
Learning: The delaunay repository contains all the specialized integration tests documented in WARP.md: circumsphere_debug_tools.rs, robust_predicates_comparison.rs, convex_hull_bowyer_watson_integration.rs, and allocation_api.rs, plus additional test files like robust_predicates_showcase.rs and coordinate_conversion_errors.rs.

Applied to files:

  • docs/code_organization.md
🧬 Code graph analysis (4)
src/geometry/util.rs (1)
src/geometry/point.rs (2)
  • try_from (313-336)
  • new (74-76)
scripts/tests/test_benchmark_utils.py (1)
scripts/benchmark_utils.py (18)
  • BenchmarkRegressionHelper (718-988)
  • CriterionParser (78-179)
  • PerformanceComparator (273-574)
  • WorkflowHelper (577-715)
  • determine_tag_name (581-606)
  • create_metadata (609-653)
  • display_baseline_summary (656-689)
  • sanitize_artifact_name (692-715)
  • prepare_baseline (722-761)
  • set_no_baseline_status (764-774)
  • extract_baseline_commit (777-827)
  • determine_benchmark_skip (830-871)
  • display_skip_message (874-887)
  • display_no_baseline_message (890-900)
  • compare_with_baseline (281-338)
  • run_regression_test (903-930)
  • display_results (933-945)
  • generate_summary (948-988)
benches/profiling_suite.rs (2)
src/geometry/util.rs (36)
  • generate_grid_points (1290-1352)
  • generate_grid_points (3547-3547)
  • generate_grid_points (3580-3580)
  • generate_grid_points (3596-3596)
  • generate_grid_points (3612-3612)
  • generate_grid_points (3621-3621)
  • generate_grid_points (3635-3635)
  • generate_grid_points (3648-3648)
  • generate_grid_points (3661-3661)
  • generate_poisson_points (1390-1457)
  • generate_poisson_points (3679-3679)
  • generate_poisson_points (3712-3712)
  • generate_poisson_points (3745-3745)
  • generate_poisson_points (3746-3746)
  • generate_poisson_points (3760-3760)
  • generate_poisson_points (3767-3767)
  • generate_poisson_points (3778-3778)
  • generate_poisson_points (3794-3794)
  • generate_poisson_points (3817-3817)
  • generate_poisson_points (3829-3829)
  • generate_random_points_seeded (1224-1251)
  • generate_random_points_seeded (3309-3309)
  • generate_random_points_seeded (3310-3310)
  • generate_random_points_seeded (3329-3329)
  • generate_random_points_seeded (3330-3330)
  • generate_random_points_seeded (3348-3348)
  • generate_random_points_seeded (3349-3349)
  • generate_random_points_seeded (3367-3367)
  • generate_random_points_seeded (3368-3368)
  • generate_random_points_seeded (3387-3387)
  • generate_random_points_seeded (3388-3388)
  • generate_random_points_seeded (3392-3392)
  • generate_random_points_seeded (3393-3393)
  • generate_random_points_seeded (3397-3397)
  • generate_random_points_seeded (3398-3398)
  • generate_random_points_seeded (3402-3402)
src/geometry/predicates.rs (1)
  • insphere_lifted (598-714)
scripts/benchmark_utils.py (2)
scripts/tests/test_benchmark_utils.py (3)
  • comparator (214-217)
  • comparator (504-507)
  • comparator (621-624)
scripts/hardware_utils.py (1)
  • main (578-625)
🪛 LanguageTool
docs/code_organization.md

[grammar] ~580-~580: There might be a mistake here.
Context: ...D with both f32 and f64 coordinate types - Extensive edge case testing and error ha...

(QB_NEW_EN)


[grammar] ~581-~581: There might be a mistake here.
Context: ...se testing and error handling validation - Generic type coverage ### Key Conventio...

(QB_NEW_EN)

benches/README.md

[grammar] ~49-~49: There might be a mistake here.
Context: ...³ to 10⁶ points across multiple decades) - Multiple point distributions (random, ...

(QB_NEW_EN)


[grammar] ~50-~50: There might be a mistake here.
Context: ...ributions** (random, grid, Poisson disk) - Memory allocation tracking (requires `...

(QB_NEW_EN)


[grammar] ~51-~51: There might be a mistake here.
Context: ...requires --features count-allocations) - Query latency analysis (circumsphere t...

(QB_NEW_EN)


[grammar] ~52-~52: There might be a mistake here.
Context: ...* (circumsphere tests, neighbor queries) - Multi-dimensional scaling (2D through ...

(QB_NEW_EN)


[grammar] ~53-~53: There might be a mistake here.
Context: ...ti-dimensional scaling** (2D through 5D) - Algorithmic bottleneck identification ...

(QB_NEW_EN)


[grammar] ~429-~429: There might be a mistake here.
Context: ...ance Suite** (ci_performance_suite.rs) - Purpose: Fast performance regression d...

(QB_NEW_EN)


[grammar] ~430-~430: There might be a mistake here.
Context: ...e regression detection for regular CI/CD - Runtime: ~5-10 minutes - Scale: S...

(QB_NEW_EN)


[grammar] ~431-~431: There might be a mistake here.
Context: ...gular CI/CD - Runtime: ~5-10 minutes - Scale: Small point counts (10, 25, 50 ...

(QB_NEW_EN)


[grammar] ~432-~432: There might be a mistake here.
Context: ...: Small point counts (10, 25, 50 points) - Frequency: Every PR and push to main -...

(QB_NEW_EN)


[grammar] ~433-~433: There might be a mistake here.
Context: ...Frequency: Every PR and push to main - Integration: `.github/workflows/benchm...

(QB_NEW_EN)


[grammar] ~436-~436: There might be a mistake here.
Context: ...Profiling Suite** (profiling_suite.rs) - Purpose: Comprehensive performance ana...

(QB_NEW_EN)


[grammar] ~437-~437: There might be a mistake here.
Context: ...rformance analysis for optimization work - Runtime: 1-2 hours (full production mo...

(QB_NEW_EN)


[grammar] ~438-~438: There might be a mistake here.
Context: ...time**: 1-2 hours (full production mode) - Scale: Large point counts (10³ to 10⁶ ...

(QB_NEW_EN)


[grammar] ~439-~439: There might be a mistake here.
Context: ...: Large point counts (10³ to 10⁶ points) - Frequency: Monthly scheduled + manual ...

(QB_NEW_EN)


[grammar] ~440-~440: There might be a mistake here.
Context: ...heduled + manual triggers + release tags - Integration: `.github/workflows/profil...

(QB_NEW_EN)


[grammar] ~459-~459: There might be a mistake here.
Context: ... as GitHub Actions artifacts for 30 days - Profiling Baselines: Release-tagged ba...

(QB_NEW_EN)


[grammar] ~460-~460: There might be a mistake here.
Context: ...elease-tagged baselines kept for 90 days - Memory Analysis: Detailed allocation t...

(QB_NEW_EN)


[grammar] ~461-~461: There might be a mistake here.
Context: ...racking with count-allocations feature - HTML Reports: Criterion-generated perf...

(QB_NEW_EN)

🪛 actionlint (1.7.7)
.github/workflows/profiling-benchmarks.yml

181-181: "needs" section should not be empty

(syntax-check)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Codacy Static Code Analysis
  • GitHub Check: performance-regression
  • GitHub Check: Codacy Security Scan
  • GitHub Check: build (ubuntu-latest)
  • GitHub Check: build (windows-latest)
  • GitHub Check: Code Coverage
🔇 Additional comments (15)
.codecov.yml (1)

15-15: LGTM: benches excluded from coverage

Ignoring benches/** is appropriate to keep coverage signal focused on production/test code.

WARP.md (1)

259-271: TDD Guidelines section is unique. Only one occurrence found at line 259 in WARP.md, so no duplication risk.

scripts/benchmark_utils.py (5)

1063-1078: Exit code mapping looks good

compare exits 1 on regression to fail CI appropriately. No changes needed.


991-1048: CLI surface is comprehensive and cohesive

Argument parsing for workflow and regression helpers aligns with the new workflows. Nice.


577-607: Solid tag derivation and GITHUB_OUTPUT usage

determine_tag_name correctly handles push tags vs manual runs and writes outputs safely.


139-157: Throughput and unit normalization logic is sensible

Rounding to two decimals for time and µs normalization in comparisons are reasonable choices.


228-239: No action required—bench target exists
The ci_performance_suite bench is defined in Cargo.toml and implemented in benches/ci_performance_suite.rs.

scripts/tests/test_benchmark_utils.py (5)

699-711: LGTM: tag-name from GITHUB_REF.

Good use of getenv patching and environment isolation.


711-728: LGTM: generated tag-name path.

Time mocking is correct (datetime.now() only).


729-746: LGTM: GITHUB_OUTPUT integration covered.

Verifies output file contract; nice.


954-997: LGTM: baseline prep sets env and logs first lines.

Nice validation of both env file writes and console output.


1102-1123: LGTM: unknown baseline commit path.

Correctly asserts env default to “unknown”.

Cargo.toml (2)

74-77: LGTM: new bench target registered.

profiling_suite is correctly added with harness = false.


1-5: Toolchain configuration is correct The rust-toolchain.toml pins channel = "1.89.0", which satisfies the Cargo.toml’s rust-version = "1.89.0" and supports edition = "2024".

.github/workflows/benchmarks.yml (1)

238-246: Great: centralizing prep via CLI.

ENV wiring for BASELINE_ORIGIN and BASELINE_TAG looks clean and predictable.

Comment on lines 278 to 284
echo " Baseline source: ${BASELINE_SOURCE:-unknown}"
echo " Baseline origin: ${BASELINE_ORIGIN:-unknown}"

# This will exit with code 1 if significant regressions are found
echo " Using CI performance suite against ${BASELINE_ORIGIN:-unknown} baseline"
BASELINE_PATH="baseline-artifact/baseline_results.txt"
if uv run benchmark-utils --help >/dev/null 2>&1; then
uv run benchmark-utils compare --baseline "$BASELINE_PATH"
elif uv run python -c "import importlib; importlib.import_module('scripts.benchmark_utils')" \
>/dev/null 2>&1; then
uv run python -m scripts.benchmark_utils compare --baseline "$BASELINE_PATH"
else
echo "❌ benchmark-utils entrypoint and module not found" >&2
exit 2
fi

# Run regression test using Python utility
uv run benchmark-utils run-regression-test \
--baseline "baseline-artifact/baseline_results.txt"


⚠️ Potential issue

Fix env var name when echoing baseline metadata.

You set BASELINE_ORIGIN earlier; here you echo BASELINE_SOURCE (unset).

-          echo "   Baseline source: ${BASELINE_SOURCE:-unknown}"
+          echo "   Baseline source: ${BASELINE_ORIGIN:-unknown}"
🤖 Prompt for AI Agents
In .github/workflows/benchmarks.yml around lines 278 to 284, the echo statement
prints BASELINE_SOURCE which is never set (the variable earlier is
BASELINE_ORIGIN); change the echo to reference BASELINE_ORIGIN (or set
BASELINE_SOURCE where intended) so the workflow emits the correct baseline
metadata value; update the line to echo "   Baseline origin:
${BASELINE_ORIGIN:-unknown}" (or ensure the variable name matches your earlier
definition).

fi
echo "tag_name=$TAG_NAME" >> "$GITHUB_OUTPUT"
echo "Final tag name: $TAG_NAME"
run: uv run benchmark-utils determine-tag

🛠️ Refactor suggestion

Fix step order: uv used before it’s installed.

Move “Install uv” (and “Verify uv”) above “Determine tag information” to avoid a hard failure.

-      - name: Determine tag information
-        id: tag_info
-        run: uv run benchmark-utils determine-tag
-
-      - name: Install uv (Python package manager)
+      - name: Install uv (Python package manager)
         uses: astral-sh/setup-uv@557e51de59eb14aaaba2ed9621916900a91d50c6  # v6.6.1
         with:
           version: "latest"
 
       - name: Verify uv installation
         run: uv --version
+
+      - name: Determine tag information
+        id: tag_info
+        run: uv run benchmark-utils determine-tag

Also applies to: 47-54

🤖 Prompt for AI Agents
.github/workflows/generate-baseline.yml around line 45 (also applies to lines
47-54): the workflow invokes "uv run benchmark-utils determine-tag" before the
"Install uv" and "Verify uv" steps, causing a hard failure; reorder the steps so
the "Install uv" and its "Verify uv" step appear before the "Determine tag
information" run step (and update any dependent step names/ids if necessary) so
uv is installed and verified prior to running the determine-tag command.

Comment on lines 59 to 66
echo "🚀 Generating performance baseline for tag $TAG_NAME"

# Run baseline generation using Python CLI tool with CI performance suite

# Generate baseline using Python CLI tool
uv run benchmark-utils generate-baseline \
|| uv run python -m scripts.benchmark_utils generate-baseline

# Verify baseline was created
if [ ! -f "baseline-artifact/baseline_results.txt" ]; then
echo "❌ Baseline generation failed - no baseline file created"
exit 1
fi

--output "baseline-artifact/baseline-$TAG_NAME.txt" \
--tag "$TAG_NAME"

echo "✅ Baseline generated successfully"

🛠️ Refactor suggestion

Ensure output directory exists before writing baseline.

Create baseline-artifact/ explicitly to avoid failures if the CLI doesn’t create parent dirs.

           echo "🚀 Generating performance baseline for tag $TAG_NAME"
-          
+          mkdir -p baseline-artifact
           # Generate baseline using Python CLI tool
           uv run benchmark-utils generate-baseline \
             --output "baseline-artifact/baseline-$TAG_NAME.txt" \
             --tag "$TAG_NAME"
-          
+
           echo "✅ Baseline generated successfully"
-          
+
           # Display baseline summary
           uv run benchmark-utils display-summary \
             --baseline "baseline-artifact/baseline-$TAG_NAME.txt"
🤖 Prompt for AI Agents
.github/workflows/generate-baseline.yml around lines 59 to 66: the workflow
writes to baseline-artifact/baseline-$TAG_NAME.txt but doesn't ensure the
baseline-artifact directory exists; before running the Python CLI add a step to
create the directory (e.g., run mkdir -p baseline-artifact) so the output path
is guaranteed to exist and the job won't fail if the CLI doesn't create parent
dirs.

Comment on lines +99 to +104
if: ${{ !contains(github.event.inputs.benchmark_filter, 'memory') }}
run: |
echo "Running memory-specific profiling with allocation tracking..."
cargo bench --bench profiling_suite --features count-allocations -- memory_profiling \
2>&1 | tee profiling-results/memory_profiling_detailed.log


🛠️ Refactor suggestion

Guard against missing inputs in non-manual triggers.

contains() on an empty input can be brittle. Make the condition explicit.

-    - name: Run profiling with memory allocation tracking (if not already done)
-      if: ${{ !contains(github.event.inputs.benchmark_filter, 'memory') }}
+    - name: Run profiling with memory allocation tracking (if not already done)
+      if: ${{ github.event_name != 'workflow_dispatch' || !contains(github.event.inputs.benchmark_filter, 'memory') }}
🤖 Prompt for AI Agents
.github/workflows/profiling-benchmarks.yml lines 99-104: the current if uses
contains(...) directly on github.event.inputs which can be undefined for
non-manual triggers; update the condition to first ensure github.event.inputs
(or the specific benchmark_filter input) exists before calling contains, i.e.,
add an explicit null/existence check for github.event.inputs (or
benchmark_filter) and only then negate contains(...) so the run only executes
when inputs are present and the filter does not include "memory".

Comment on lines +118 to +149
/// Generate points according to the specified distribution
fn generate_points_by_distribution<const D: usize>(
count: usize,
distribution: PointDistribution,
seed: u64,
) -> Vec<Point<f64, D>> {
match distribution {
PointDistribution::Random => {
generate_random_points_seeded(count, (-100.0, 100.0), seed).unwrap()
}
PointDistribution::Grid => {
// Calculate points per dimension to get approximately `count` points total
#[allow(
clippy::cast_precision_loss,
clippy::cast_possible_truncation,
clippy::cast_sign_loss
)]
let points_per_dim = ((count as f64).powf(1.0 / D as f64).ceil() as usize).max(2);
generate_grid_points(points_per_dim, 10.0, [0.0; D]).unwrap()
}
PointDistribution::PoissonDisk => {
let min_distance = match D {
2 => 5.0, // 2D: reasonable spacing for [-100, 100] bounds
3 => 8.0, // 3D: slightly larger spacing
4 => 12.0, // 4D: larger spacing for higher dimensions
5 => 15.0, // 5D: even larger spacing
_ => 20.0, // Higher dimensions: very large spacing
};
generate_poisson_points(count, (-100.0, 100.0), min_distance, seed).unwrap()
}
}
}

🛠️ Refactor suggestion

Avoid pathological Poisson-disk sampling at large counts.

Rejection sampling with max_attempts = n*30 will explode at high n and higher D. Fall back to Random (or reduce min distance) beyond safe thresholds.

         PointDistribution::PoissonDisk => {
-            let min_distance = match D {
+            // Clamp feasibility: large counts with Poisson become prohibitively slow.
+            let poisson_cap = match D {
+                2 => 50_000,
+                3 => 20_000,
+                4 => 10_000,
+                5 => 5_000,
+                _ => 1_000,
+            };
+            if count > poisson_cap {
+                return generate_random_points_seeded(count, (-100.0, 100.0), seed).unwrap();
+            }
+
+            let min_distance = match D {
                 2 => 5.0,  // 2D: reasonable spacing for [-100, 100] bounds
                 3 => 8.0,  // 3D: slightly larger spacing
                 4 => 12.0, // 4D: larger spacing for higher dimensions
                 5 => 15.0, // 5D: even larger spacing
                 _ => 20.0, // Higher dimensions: very large spacing
             };
             generate_poisson_points(count, (-100.0, 100.0), min_distance, seed).unwrap()
         }
🤖 Prompt for AI Agents
In benches/profiling_suite.rs around lines 118 to 149, the Poisson-disk branch
can trigger pathological rejection sampling for large counts or high dimensions;
add a guard that estimates whether Poisson sampling is feasible (e.g., compare
desired count and min_distance against the volume/area of the domain or use a
simple heuristic like count > threshold_per_dim[D] or min_distance^D * count >
domain_volume * safety_factor) and when infeasible either fall back to
generate_random_points_seeded(count, (-100.0, 100.0), seed) or reduce
min_distance (e.g., scale it down until feasible) before calling
generate_poisson_points; implement the heuristic thresholds per D, log or
comment the fallback choice, and ensure unwrap() still applies to the chosen
generator.

Comment on lines +83 to +89
[tool.tarpaulin]
# Exclude benchmarks and examples from coverage analysis
exclude = [
"benches/*",
"examples/*",
]


💡 Verification agent

❓ Verification inconclusive

Tarpaulin config here is likely ignored; move to tarpaulin.toml.

cargo-tarpaulin reads config from .tarpaulin.toml/tarpaulin.toml (or CLI/env), not from [tool.tarpaulin] in Cargo.toml. The exclude rules here probably have no effect. Recommend removing this section and adding a tarpaulin.toml.

Docs for config location: see cargo-tarpaulin README and docs.rs. (github.com, docs.rs)

Apply in Cargo.toml:

-[tool.tarpaulin]
-# Exclude benchmarks and examples from coverage analysis
-exclude = [
-    "benches/*",
-    "examples/*",
-]

Create tarpaulin.toml at repo root:

# Exclude benches and examples from coverage
[test_config]
exclude-files = ["benches/*", "examples/*"]

Optionally, keep Codecov’s benches ignore as defense-in-depth.


🤖 Prompt for AI Agents
In Cargo.toml around lines 83–89, the [tool.tarpaulin] configuration is
ineffective because cargo-tarpaulin only reads an external tarpaulin.toml;
remove the entire [tool.tarpaulin] block from Cargo.toml and add a new
tarpaulin.toml at the repository root containing a [test_config] section with
exclude-files = ["benches/*", "examples/*"]; commit both the removal from
Cargo.toml and the new tarpaulin.toml file.

Comment on lines +1188 to 1190
    project_root = find_project_root()
    execute_command(args, project_root)


🛠️ Refactor suggestion

Don’t require Cargo.toml for non-Rust workflow commands

determine-tag, create-metadata, etc., shouldn’t fail if run outside the repo root. Resolve project root only for baseline/compare commands.

Apply:

-    project_root = find_project_root()
-    execute_command(args, project_root)
+    # Only Rust baseline/compare need the project root
+    if args.command in ("generate-baseline", "compare"):
+        project_root = find_project_root()
+    else:
+        project_root = Path.cwd()
+    execute_command(args, project_root)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# --- original ---
    project_root = find_project_root()
    execute_command(args, project_root)
# --- suggested replacement ---
    # Only Rust baseline/compare need the project root
    if args.command in ("generate-baseline", "compare"):
        project_root = find_project_root()
    else:
        project_root = Path.cwd()
    execute_command(args, project_root)
🤖 Prompt for AI Agents
In scripts/benchmark_utils.py around lines 1188-1190, currently project_root is
always resolved before executing the command which causes non-Rust workflow
commands (determine-tag, create-metadata, etc.) to fail when run outside the
repo root; change the logic to only call find_project_root for commands that
need it (e.g., "baseline" and "compare") and leave project_root as None or the
current working directory for others, then pass that value into execute_command;
if execute_command requires a non-optional project_root, update its
signature/callers to accept an optional project_root and handle None
appropriately so non-Rust commands won’t require Cargo.toml or repo-root
resolution.

Comment on lines +1299 to +1352
    // Safely convert dimension to u32 for pow function
    let d_u32 =
        u32::try_from(D).map_err(|_| RandomPointGenerationError::RandomGenerationFailed {
            min: "0".to_string(),
            max: format!("{}", points_per_dim - 1),
            details: format!("Dimension {D} is too large to fit in u32"),
        })?;
    let total_points = points_per_dim.pow(d_u32);
    let mut points = Vec::with_capacity(total_points);

    // Generate all combinations of grid indices
    #[allow(clippy::items_after_statements)]
    fn generate_indices(
        current: &mut Vec<usize>,
        dim: usize,
        max_dims: usize,
        points_per_dim: usize,
        all_indices: &mut Vec<Vec<usize>>,
    ) {
        if dim == max_dims {
            all_indices.push(current.clone());
            return;
        }

        for i in 0..points_per_dim {
            current.push(i);
            generate_indices(current, dim + 1, max_dims, points_per_dim, all_indices);
            current.pop();
        }
    }

    let mut all_indices = Vec::new();
    let mut current = Vec::new();
    generate_indices(&mut current, 0, D, points_per_dim, &mut all_indices);

    // Convert indices to coordinates
    for indices in all_indices {
        let mut coords = [T::zero(); D];
        for (dim, &index) in indices.iter().enumerate() {
            // Convert usize index to coordinate scalar safely
            let index_as_scalar = safe_usize_to_scalar::<T>(index).map_err(|_| {
                RandomPointGenerationError::RandomGenerationFailed {
                    min: "0".to_string(),
                    max: format!("{}", points_per_dim - 1),
                    details: format!("Failed to convert grid index {index} to coordinate type"),
                }
            })?;
            coords[dim] = offset[dim] + index_as_scalar * spacing;
        }
        points.push(Point::new(coords));
    }

    Ok(points)
}

🛠️ Refactor suggestion

Avoid materializing all index combinations (memory blow-up)

Building all_indices allocates O(points_per_dim^D) vectors of indices before pushing any points. Stream indices and emit coordinates in a single pass to cap memory to O(1) extra.

Apply:

-    // Generate all combinations of grid indices
-    #[allow(clippy::items_after_statements)]
-    fn generate_indices(
-        current: &mut Vec<usize>,
-        dim: usize,
-        max_dims: usize,
-        points_per_dim: usize,
-        all_indices: &mut Vec<Vec<usize>>,
-    ) {
-        if dim == max_dims {
-            all_indices.push(current.clone());
-            return;
-        }
-
-        for i in 0..points_per_dim {
-            current.push(i);
-            generate_indices(current, dim + 1, max_dims, points_per_dim, all_indices);
-            current.pop();
-        }
-    }
-
-    let mut all_indices = Vec::new();
-    let mut current = Vec::new();
-    generate_indices(&mut current, 0, D, points_per_dim, &mut all_indices);
-
-    // Convert indices to coordinates
-    for indices in all_indices {
-        let mut coords = [T::zero(); D];
-        for (dim, &index) in indices.iter().enumerate() {
-            // Convert usize index to coordinate scalar safely
-            let index_as_scalar = safe_usize_to_scalar::<T>(index).map_err(|_| {
-                RandomPointGenerationError::RandomGenerationFailed {
-                    min: "0".to_string(),
-                    max: format!("{}", points_per_dim - 1),
-                    details: format!("Failed to convert grid index {index} to coordinate type"),
-                }
-            })?;
-            coords[dim] = offset[dim] + index_as_scalar * spacing;
-        }
-        points.push(Point::new(coords));
-    }
+    // Odometer-style index generation without storing combinations
+    let mut idxs = [0usize; D];
+    for _ in 0..total_points {
+        let mut coords = [T::zero(); D];
+        for dim in 0..D {
+            let i = idxs[dim];
+            let i_scalar = safe_usize_to_scalar::<T>(i).map_err(|_| {
+                RandomPointGenerationError::RandomGenerationFailed {
+                    min: "0".to_string(),
+                    max: format!("{}", points_per_dim.saturating_sub(1)),
+                    details: format!("Failed to convert grid index {i} to coordinate type"),
+                }
+            })?;
+            coords[dim] = offset[dim] + i_scalar * spacing;
+        }
+        points.push(Point::new(coords));
+        // increment odometer
+        for dim in (0..D).rev() {
+            idxs[dim] += 1;
+            if idxs[dim] < points_per_dim {
+                break;
+            }
+            idxs[dim] = 0;
+        }
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// --- original ---
    // Safely convert dimension to u32 for pow function
    let d_u32 =
        u32::try_from(D).map_err(|_| RandomPointGenerationError::RandomGenerationFailed {
            min: "0".to_string(),
            max: format!("{}", points_per_dim - 1),
            details: format!("Dimension {D} is too large to fit in u32"),
        })?;
    let total_points = points_per_dim.pow(d_u32);
    let mut points = Vec::with_capacity(total_points);
    // Generate all combinations of grid indices
    #[allow(clippy::items_after_statements)]
    fn generate_indices(
        current: &mut Vec<usize>,
        dim: usize,
        max_dims: usize,
        points_per_dim: usize,
        all_indices: &mut Vec<Vec<usize>>,
    ) {
        if dim == max_dims {
            all_indices.push(current.clone());
            return;
        }
        for i in 0..points_per_dim {
            current.push(i);
            generate_indices(current, dim + 1, max_dims, points_per_dim, all_indices);
            current.pop();
        }
    }
    let mut all_indices = Vec::new();
    let mut current = Vec::new();
    generate_indices(&mut current, 0, D, points_per_dim, &mut all_indices);
    // Convert indices to coordinates
    for indices in all_indices {
        let mut coords = [T::zero(); D];
        for (dim, &index) in indices.iter().enumerate() {
            // Convert usize index to coordinate scalar safely
            let index_as_scalar = safe_usize_to_scalar::<T>(index).map_err(|_| {
                RandomPointGenerationError::RandomGenerationFailed {
                    min: "0".to_string(),
                    max: format!("{}", points_per_dim - 1),
                    details: format!("Failed to convert grid index {index} to coordinate type"),
                }
            })?;
            coords[dim] = offset[dim] + index_as_scalar * spacing;
        }
        points.push(Point::new(coords));
    }
    Ok(points)
}
// --- suggested replacement ---
    // Safely convert dimension to u32 for pow function
    let d_u32 =
        u32::try_from(D).map_err(|_| RandomPointGenerationError::RandomGenerationFailed {
            min: "0".to_string(),
            max: format!("{}", points_per_dim - 1),
            details: format!("Dimension {D} is too large to fit in u32"),
        })?;
    let total_points = points_per_dim.pow(d_u32);
    let mut points = Vec::with_capacity(total_points);
    // Odometer-style index generation without storing combinations
    let mut idxs = [0usize; D];
    for _ in 0..total_points {
        let mut coords = [T::zero(); D];
        for dim in 0..D {
            let i = idxs[dim];
            let i_scalar = safe_usize_to_scalar::<T>(i).map_err(|_| {
                RandomPointGenerationError::RandomGenerationFailed {
                    min: "0".to_string(),
                    max: format!("{}", points_per_dim.saturating_sub(1)),
                    details: format!("Failed to convert grid index {i} to coordinate type"),
                }
            })?;
            coords[dim] = offset[dim] + i_scalar * spacing;
        }
        points.push(Point::new(coords));
        // increment odometer
        for dim in (0..D).rev() {
            idxs[dim] += 1;
            if idxs[dim] < points_per_dim {
                break;
            }
            idxs[dim] = 0;
        }
    }
    Ok(points)
}
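
As a quick sanity check on the pattern (not part of the PR), a self-contained example of the odometer increment enumerating a 3×3 grid in 2-D without recursion or stored index combinations:

    fn main() {
        const D: usize = 2;
        let points_per_dim: usize = 3;
        let total = points_per_dim.pow(D as u32);
        let mut idxs = [0usize; D];
        for _ in 0..total {
            println!("{idxs:?}"); // prints [0, 0], [0, 1], ..., [2, 2]
            // Increment the rightmost index; carry into higher dimensions on wrap.
            for dim in (0..D).rev() {
                idxs[dim] += 1;
                if idxs[dim] < points_per_dim {
                    break;
                }
                idxs[dim] = 0;
            }
        }
    }

The only extra state is the fixed-size idxs array, which is what keeps auxiliary memory bounded regardless of grid size.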

Comment on lines +1306 to +1307
    let total_points = points_per_dim.pow(d_u32);
    let mut points = Vec::with_capacity(total_points);

🛠️ Refactor suggestion

Guard capacity computation against overflow

points_per_dim.pow(D as u32) can overflow usize and mis-size the Vec. Use checked multiplication to derive capacity safely and error if the grid is too large.

Apply:

-    let total_points = points_per_dim.pow(d_u32);
+    // Compute total_points with overflow checking
+    let mut total_points: usize = 1;
+    for _ in 0..D {
+        total_points = total_points.checked_mul(points_per_dim).ok_or_else(|| {
+            RandomPointGenerationError::RandomGenerationFailed {
+                min: "0".to_string(),
+                max: format!("{}", points_per_dim.saturating_sub(1)),
+                details: format!("grid too large: {points_per_dim}^{D} exceeds addressable capacity"),
+            }
+        })?;
+    }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/geometry/util.rs around lines 1306-1307, computing total_points with
points_per_dim.pow(d_u32) can overflow and under-allocate the Vec; replace that
with a checked multiplication loop (start total = 1usize; for _ in 0..d_u32 {
total = total.checked_mul(points_per_dim).ok_or_else(|| /* return or propagate a
clear error like GridTooLarge */)? } ) and then call Vec::with_capacity(total);
ensure you propagate or return an error when checked_mul fails (or panic with a
clear message) so oversized grids are rejected instead of silently mis-sizing
the Vec.
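
For illustration only (not code from the PR), a tiny standalone program showing why the checked multiplication matters on a 64-bit target, where 100_000^4 already exceeds usize:

    fn main() {
        let points_per_dim: usize = 100_000;
        let mut total: usize = 1;
        for _ in 0..4 {
            match total.checked_mul(points_per_dim) {
                Some(t) => total = t,
                None => {
                    // checked_mul reports the overflow instead of silently wrapping,
                    // so the oversized grid can be rejected with a clear error.
                    eprintln!("grid too large: {points_per_dim}^4 exceeds usize");
                    return;
                }
            }
        }
        println!("total points: {total}");
    }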

…0.4.3

- Add allocation counter infrastructure with count-allocations feature flag
- Implement memory tracking for triangulation and convex hull operations
- Add profiling benchmarks in GitHub Actions workflow
- Optimize collections with FxHashMap/FxHashSet for better performance
- Add domain-specific collection types and small buffer optimizations
- Clean up examples by removing test functions (convert to pure demonstrations)
- Update WARP.md guidelines and project documentation
- Add comprehensive memory analysis examples across dimensions (2D-5D)
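
The FxHashMap/FxHashSet swap mentioned in that commit is a drop-in change; a minimal sketch (with placeholder key types, assuming the rustc-hash crate) looks like this:

    use rustc_hash::{FxHashMap, FxHashSet};

    fn main() {
        // Drop-in replacements for std HashMap/HashSet that use a faster,
        // non-cryptographic hasher (no DoS resistance, so best suited to
        // internal keys such as vertex or cell ids).
        let mut neighbors: FxHashMap<u64, FxHashSet<u64>> = FxHashMap::default();
        neighbors.entry(0).or_default().insert(1);
        neighbors.entry(0).or_default().insert(2);
        assert_eq!(neighbors[&0].len(), 2);
    }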
acgetchell closed this Sep 8, 2025
acgetchell deleted the refactor/profiling branch September 8, 2025 01:08
Labels: documentation (Improvements or additions to documentation), rust (Pull requests that update rust code)