
Conversation

@shaltielshmid
Contributor

What does this PR do?

Fixes inconsistencies in MCoreBertModelWrapper that cause a crash when using specific BertConfigs.

Collection: NLP

Changelog

  • Disabled post_process when calling the MCoreBert initializer from the wrapper
  • Added the missing config.add_lm_head check (see the sketch after this list)
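
Taken together, the two changes look roughly like the following. This is a minimal sketch under assumed names (MCoreBert, _build_lm_head, and the constructor signature are illustrative, not the actual NeMo source):

# Illustrative sketch of the fix described above; names are assumptions.
class MCoreBert:
    """Stand-in for the Megatron-Core BERT base class."""
    def __init__(self, config, post_process=True):
        self.config = config
        self.post_process = post_process


class MCoreBertModelWrapper(MCoreBert):
    def __init__(self, config, **kwargs):
        # Change 1: disable post_process so the parent initializer does not
        # try to build an LM head on its own.
        super().__init__(config, post_process=False, **kwargs)

        # Change 2: only build the LM head when the config asks for one.
        self.lm_head = self._build_lm_head(config) if getattr(config, 'add_lm_head', True) else None

    def _build_lm_head(self, config):
        """Placeholder for the real LM-head construction."""
        return object()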

Pre checks:

  • [v] Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • [v] Bugfix

@suiyoubi
Collaborator

LGTM, just curious: under what scenario (which specific BERT config) does the original post_process crash in your case?

@shaltielshmid
Contributor Author

Sure! Here is the config:

# Imports added to make the snippet self-contained; note that the import path
# for MegatronBertBaseConfig may vary across NeMo versions.
from dataclasses import dataclass
from typing import Callable

import torch.nn.functional as F

from nemo.collections.llm.bert.model.bert import MegatronBertBaseConfig


@dataclass
class NeoBertMediumConfig(MegatronBertBaseConfig):
    """
    NeMo's BERT model variant adjusted to match the NeoBERT implementation.
    """
    num_layers: int = 28
    normalization: str = 'RMSNorm'
    layernorm_epsilon: float = 1e-6
    position_embedding_type: str = 'rope'
    seq_length: int = 4096
    gated_linear_unit: bool = True
    activation_func: Callable = F.silu
    add_lm_head: bool = False
    share_embeddings_and_output_weights: bool = False
    add_bias_linear: bool = False
    add_qkv_bias: bool = False

The issue arises because the LM head doesn't support RMSNorm normalization: when post_process isn't disabled, the super() class tries to create an LM head and throws a not-supported exception.
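
A hypothetical sketch of that failure path (class and attribute names are illustrative placeholders, not the actual Megatron-Core source):

# Illustrative sketch of the failure path; not the actual Megatron-Core code.
class BertLMHead:
    """Placeholder standing in for the real LM head."""
    def __init__(self, config):
        self.config = config


class MCoreBert:
    def __init__(self, config, post_process=True):
        self.post_process = post_process
        if self.post_process:
            # The LM head only supports LayerNorm, so a config that sets
            # normalization='RMSNorm' hits this check and raises before the
            # model is built.
            if config.normalization != 'LayerNorm':
                raise ValueError(f"LM head does not support {config.normalization} normalization")
            self.lm_head = BertLMHead(config)

With post_process=False passed from the wrapper (and add_lm_head=False in the config above), this branch is never reached, which is what this PR's fix does.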

@ericharper requested a review from suiyoubi on August 28, 2025 at 20:06
@shaltielshmid
Contributor Author

Hey - following up, is there anything waiting on me here?
