@Koratahiu (Owner)

This is a major update to the library, introducing:

Compiled Optimizers

Enables torch.compile for the optimizer step (via the compiled_optimizer option) across all advanced optimizers; see the usage sketch after the list below.

When to use:

  • Features with noticeable overhead: With torch.compile, the cost of complex features becomes negligible:
    • OrthoGrad: Mitigates the ~33% overhead seen with small batch sizes.
    • 1-bit Factored mode: Reduces calculation overhead.
    • 3-state optimizers (e.g., AdEMAMix): Efficiently handles the additional states.
  • Full Finetuning: Mitigates optimizer-side bottlenecks in larger models.
  • Orthogonal Optimizers: Reduces the cost of orthogonalization ops in Muon and AdaMuon.
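A minimal usage sketch, assuming a PyTorch training loop. The compiled_optimizer flag comes from the notes above, but the optimizer class name (AdamW_adv) and its exact constructor signature are illustrative assumptions, not the library's documented API:

```python
import torch
import torch.nn as nn

# Hypothetical import: the actual optimizer class names in adv_optm may differ.
from adv_optm import AdamW_adv

model = nn.Linear(512, 512)
optimizer = AdamW_adv(
    model.parameters(),
    lr=1e-3,
    compiled_optimizer=True,  # wrap the optimizer step in torch.compile
)

x = torch.randn(8, 512)
loss = model(x).sum()
loss.backward()
optimizer.step()       # first step triggers compilation; later steps reuse the compiled graph
optimizer.zero_grad()
```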

1-bit Factored Changes (nnmf_factor)

  • Code Centralization: Greatly reduced code duplication; all logic now resides in adv_optm/util/factorization_util.py.
  • Reduced Temporary Tensors: Temporary tensors are significantly reduced, lowering VRAM spikes in factored mode.
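To make the factored-mode bookkeeping concrete, here is a minimal sketch of an Adafactor-style rank-1 nonnegative factorization paired with a 1-bit sign tensor. The function names and the exact scheme are assumptions for illustration only and are not taken from factorization_util.py, which additionally works in-place to cut temporary allocations:

```python
import torch

def nnmf_rank1_factor(mat: torch.Tensor):
    """Rank-1 nonnegative approximation of a nonnegative matrix:
    mat[i, j] ~= row[i] * col[j] / mean(mat)."""
    row = mat.mean(dim=1)  # shape (m,)
    col = mat.mean(dim=0)  # shape (n,)
    return row, col

def nnmf_rank1_reconstruct(row: torch.Tensor, col: torch.Tensor) -> torch.Tensor:
    # row.mean() equals the global mean of the original matrix,
    # which gives the correct scale for the rank-1 product.
    denom = row.mean().clamp_min(1e-30)
    return torch.outer(row, col) / denom

# Usage: factor the magnitude of an optimizer state and keep the sign
# separately as a 1-bit (bool) tensor.
state = torch.randn(256, 512)
sign = state > 0                               # 1-bit sign information
row, col = nnmf_rank1_factor(state.abs())      # two small vectors instead of a full matrix
approx = nnmf_rank1_reconstruct(row, col) * torch.where(sign, 1.0, -1.0)
```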

Muon Variants

etc.

@Koratahiu Koratahiu merged commit 5a1a7fa into main Jan 5, 2026
@Koratahiu Koratahiu deleted the v2.0 branch January 5, 2026 18:12
