[Parent] Docketbert pretraining #74

@nadahlberg

Description

Pretrain a domain-adapted ModernBERT for docket entry description text.

Run initial experiments on a small set of ~6M entries to test different strategies, including:

  • Vanilla masked language modeling with ModernBERT models
  • Training small models from scratch
  • Finetuning models with a distillation loss + MLM
  • Finetuning "sliced" variants of ModernBERT models

Then scale up to a larger training run with ~50M entries using the best strategies.
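For the distillation + MLM strategy, one common formulation combines the standard masked-token cross-entropy with a temperature-scaled KL divergence against a teacher's logits. A minimal PyTorch sketch of that combined loss is below; the `temperature` and `alpha` weighting (and the function name itself) are illustrative assumptions, not values specified in this issue.

```python
import torch
import torch.nn.functional as F

def distill_mlm_loss(student_logits, teacher_logits, labels,
                     temperature=2.0, alpha=0.5):
    """Combined MLM + distillation loss (sketch; hyperparameters are illustrative).

    student_logits, teacher_logits: (batch, seq_len, vocab)
    labels: (batch, seq_len), with -100 at unmasked positions
            (the standard HuggingFace MLM convention).
    """
    vocab = student_logits.size(-1)

    # Hard-label MLM cross-entropy, computed only on masked positions.
    mlm = F.cross_entropy(student_logits.view(-1, vocab),
                          labels.view(-1), ignore_index=-100)

    # Soft-target KL between temperature-softened teacher and student
    # distributions; the t^2 factor keeps gradient magnitudes comparable.
    t = temperature
    kd = F.kl_div(
        F.log_softmax(student_logits.view(-1, vocab) / t, dim=-1),
        F.softmax(teacher_logits.view(-1, vocab) / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

    return alpha * mlm + (1 - alpha) * kd

# Tiny random example to show the expected shapes.
torch.manual_seed(0)
student = torch.randn(2, 8, 100)
teacher = torch.randn(2, 8, 100)
labels = torch.full((2, 8), -100)
labels[:, 3] = 7  # pretend position 3 was masked in each sequence
loss = distill_mlm_loss(student, teacher, labels)
```

In a real run the teacher logits would come from a frozen full-size ModernBERT and the student would be the small or sliced variant; here both are random tensors just to exercise the shapes.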
