
disable fused lm head chunking by default #1904

Merged
samsja merged 4 commits into main from sami/disable-chunking-by-default
Feb 27, 2026

Conversation

@samsja (Member) commented Feb 26, 2026

Summary

  • Change fused_lm_head_chunk_size default from "auto" to "disabled"

🤖 Generated with Claude Code


Note

Medium Risk
Changes a model-level default (fused_lm_head_chunk_size) that affects training execution and performance/memory characteristics across runs. CI/integration tests now pin an explicit chunk size, but other consumers may see behavior changes if relying on the previous default.

Overview
Disables fused LM head chunking by default by changing ModelConfig.fused_lm_head_chunk_size from "auto" to "disabled".

Bench/CI flows are updated to explicitly opt in when needed: run_single_benchmark.py accepts/forwards --model.fused-lm-head-chunk-size, the RL multi-run integration trainer.toml pins fused_lm_head_chunk_size = 8192, and benchmark regression tests pass --fused-lm-head-chunk-size 8192 to keep results stable.
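For consumers who relied on the old "auto" default, the opt-in mirrors what the RL integration config now pins. A sketch of that config, with the `[model]` table name inferred from the `--model.fused-lm-head-chunk-size` flag prefix (check it against the actual trainer.toml schema):

```toml
# trainer.toml: explicitly re-enable fused LM head chunking.
# Table name inferred from the --model.fused-lm-head-chunk-size flag;
# 8192 is the chunk size the CI/bench flows pin in this PR.
[model]
fused_lm_head_chunk_size = 8192
```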

Written by Cursor Bugbot for commit fc7621d. This will update automatically on new commits.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In src/prime_rl/configs/trainer.py:
     ),
 ),
-] = "auto"
+] = "disabled"
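The diff above only flips the config default; conceptually, fused LM head chunking computes the logits and loss one sequence chunk at a time instead of materializing the full `[seq_len, vocab]` logits matrix. A minimal NumPy sketch of the idea (illustrative only, not the repo's implementation; all names here are hypothetical):

```python
import numpy as np

def lm_head_cross_entropy(hidden, weight, targets, chunk_size=None):
    # Mean token cross-entropy of an LM head over a sequence.
    # With chunk_size set, logits are computed one chunk at a time,
    # so the full [seq_len, vocab] matrix never exists in memory;
    # chunk_size=None reproduces the unchunked ("disabled") path.
    seq_len = hidden.shape[0]
    step = chunk_size or seq_len
    total = 0.0
    for start in range(0, seq_len, step):
        h = hidden[start:start + step]             # [chunk, d_model]
        logits = h @ weight.T                      # [chunk, vocab]
        logits = logits - logits.max(axis=1, keepdims=True)  # stabilize
        logsumexp = np.log(np.exp(logits).sum(axis=1))
        t = targets[start:start + step]
        total += (logsumexp - logits[np.arange(len(t)), t]).sum()
    return total / seq_len
```

Both paths produce the same loss; only peak memory differs, which is why the default can be flipped without changing training semantics.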


Missing changelog for config default change

Low Severity

The default for model.fused_lm_head_chunk_size changed in src/prime_rl/configs/trainer.py, but there is no corresponding CHANGELOG.md entry for this config behavior change. This makes the default change harder to discover for users relying on documented config migrations.


65536 OOMs on A6000 without fused lm head chunking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
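A back-of-envelope estimate of why seq_len = 65536 OOMs without chunking: the unchunked LM head materializes the full `[seq_len, vocab]` logits tensor at once. The vocabulary size below is an assumption (the PR does not name the model); substitute the real tokenizer size to adapt:

```python
# Rough peak-memory estimate for the logits tensor alone.
# vocab_size is an ASSUMPTION (Qwen-style tokenizer); the PR
# does not state which model the A6000 run used.
seq_len = 65536
chunk_size = 8192             # the value the CI configs pin
vocab_size = 151_936          # assumed
bytes_per_elem = 4            # fp32 logits

full_logits_gib = seq_len * vocab_size * bytes_per_elem / 1024**3       # ~37 GiB
chunked_logits_gib = chunk_size * vocab_size * bytes_per_elem / 1024**3  # ~4.6 GiB
```

Under these assumptions the unchunked logits alone approach an A6000's 48 GiB, before weights, activations, and optimizer state, while chunking at 8192 keeps only one eighth of that live at a time.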

@cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

samsja and others added 2 commits February 26, 2026 23:51
The benchmark baselines were generated with chunking enabled, so
the regression test must explicitly enable it regardless of the
default.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This test uses seq_len=65536 which needs chunking to avoid OOM on
A6000 GPUs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@samsja samsja merged commit 753a728 into main Feb 27, 2026
9 checks passed
