Conversation
💡 Codex Review
The new logic assigns `weight_decay: 0.0` to no-decay params earlier in this method, but `group_param_groups` still only splits when the LR changes. When two adjacent groups share the same LR but have different `weight_decay` values, they get merged here and the merged group keeps the first group's `weight_decay`. That means either no-decay params get decayed or decayable params lose decay, depending on ordering. This is a regression introduced by adding `weight_decay: 0.0` without updating the grouping condition; the other EoMT models in this commit updated the condition to check `weight_decay` too.
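To make the failure mode concrete, here is a minimal sketch of the corrected merge condition. The function shape (a list of param-group dicts merged in order) is assumed from the description above, not copied from the repo:

```python
# Illustrative sketch only: the real group_param_groups in this repo may
# have a different signature. Shown is the corrected merge condition.
from typing import Any


def group_param_groups(param_groups: list[dict[str, Any]]) -> list[dict[str, Any]]:
    grouped: list[dict[str, Any]] = []
    for group in param_groups:
        if (
            grouped
            and grouped[-1]["lr"] == group["lr"]
            # The fix: also compare weight_decay. Without this check, a
            # no-decay group merged into a decay group silently inherits
            # the first group's weight_decay (or vice versa).
            and grouped[-1]["weight_decay"] == group["weight_decay"]
        ):
            grouped[-1]["params"].extend(group["params"])
        else:
            grouped.append({**group, "params": list(group["params"])})
    return grouped
```

With only the `lr` comparison, the merged group keeps whichever `weight_decay` came first, which is exactly the ordering-dependent regression described above.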
Pull request overview
This PR implements selective weight decay exclusion for specific parameter types (biases, normalization layers, tokens, etc.) and reduces logging overhead by logging learning rate and weight decay values only for selected parameter groups.
Changes:
- Added a new `get_weight_decay_parameters` function to categorize parameters based on whether they should have weight decay applied (see the sketch after this list)
- Updated optimizer creation in multiple task models to use separate parameter groups with different weight decay values
- Added conditional logging for optimizer parameter groups to reduce logging overhead
- Updated the warmup_steps calculation in picodet to prevent it from exceeding total steps
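For context, here is a hedged sketch of what a `get_weight_decay_parameters`-style helper typically looks like; the actual signature and exclusion heuristics in `optimizer_helpers.py` may differ:

```python
from torch import nn
from torch.optim import AdamW

# Layers whose weights are conventionally excluded from weight decay.
NORM_TYPES = (nn.LayerNorm, nn.BatchNorm1d, nn.BatchNorm2d, nn.GroupNorm)


def get_weight_decay_parameters(
    modules: list[nn.Module],
) -> tuple[list[nn.Parameter], list[nn.Parameter]]:
    """Split trainable parameters into (decay, no_decay) lists."""
    decay: list[nn.Parameter] = []
    no_decay: list[nn.Parameter] = []
    for module in modules:
        for submodule in module.modules():
            for name, param in submodule.named_parameters(recurse=False):
                if not param.requires_grad:
                    continue
                # Biases, norm weights, and 1-D params (e.g. learned
                # tokens) go into the no-decay bucket.
                if (
                    isinstance(submodule, NORM_TYPES)
                    or name.endswith("bias")
                    or param.ndim <= 1
                ):
                    no_decay.append(param)
                else:
                    decay.append(param)
    return decay, no_decay


# Usage: one optimizer with two parameter groups, decay values assumed.
model = nn.Sequential(nn.Linear(8, 8), nn.LayerNorm(8))
decay, no_decay = get_weight_decay_parameters([model])
optimizer = AdamW(
    [
        {"params": decay, "weight_decay": 0.05},
        {"params": no_decay, "weight_decay": 0.0},
    ],
    lr=1e-4,
)
```

The picodet warmup fix mentioned above amounts to a clamp along the lines of `warmup_steps = min(warmup_steps, total_steps)` (variable names assumed).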
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/lightly_train/_optim/optimizer_helpers.py | Added new function to categorize parameters for weight decay and updated imports |
| tests/_optim/test_optimizer_helpers.py | Added comprehensive test for the new weight decay parameter categorization function |
| src/lightly_train/_task_models/picodet_object_detection/train_model.py | Updated optimizer to use separate parameter groups and fixed warmup_steps calculation |
| src/lightly_train/_task_models/dinov3_eomt_semantic_segmentation/train_model.py | Updated optimizer to apply selective weight decay and added conditional logging |
| src/lightly_train/_task_models/dinov3_eomt_panoptic_segmentation/train_model.py | Updated optimizer and grouping logic to handle different weight decay values correctly |
| src/lightly_train/_task_models/dinov3_eomt_instance_segmentation/train_model.py | Updated optimizer and grouping logic to handle different weight decay values correctly |
| src/lightly_train/_task_models/dinov2_linear_semantic_segmentation/train_model.py | Replaced lightly library's function with internal implementation |
| src/lightly_train/_task_models/dinov2_eomt_semantic_segmentation/train_model.py | Updated optimizer and grouping logic to handle different weight decay values correctly |
| src/lightly_train/_commands/train_task.py | Added conditional logging to respect the "log" flag in parameter groups (see the logging sketch after this table) |
| CHANGELOG.md | Documented the new feature |
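A minimal sketch of the conditional logging described for train_task.py; the `"log"` key and its default value are assumptions based on the description above, not the repo's actual code:

```python
import logging

from torch.optim import Optimizer

logger = logging.getLogger(__name__)


def log_param_groups(optimizer: Optimizer) -> None:
    # Assumed convention: groups that set {"log": False} are skipped,
    # cutting log noise when there are many per-layer groups.
    for group in optimizer.param_groups:
        if group.get("log", True):
            logger.info(
                "lr=%s weight_decay=%s", group["lr"], group["weight_decay"]
            )
```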
A weight decay / no weight decay helper function already exists in LightlySSL, although without all of this functionality. We could also move this one there.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c62cad255a
What has changed and why?
This PR updates all models except LTDETR, for which I'll create a follow-up PR.
How has it been tested?
Did you update CHANGELOG.md?
Did you update the documentation?