huggingface / lighteval Public

Issues: huggingface/lighteval

55 Open 85 Closed

Issues list

[BUG] ImportError for custom tasks bug

Something isn't working

#370 opened Oct 18, 2024 by Bachstelze

[FT] Using lighteval to evaluate a model on a single sample, how? feature request

New feature/request

#365 opened Oct 17, 2024 by dxlong2000

[FT] Pipeline does not fully handle trust_remote_code to load dataset feature request

New feature/request

#362 opened Oct 15, 2024 by Sanahm

[FT] More general approach than output_regex to model answer extraction feature request

New feature/request

#360 opened Oct 14, 2024 by sadra-barikbin

[FT] Single token completion loglikelihood auto-detection feature request

New feature/request

low prio

#355 opened Oct 10, 2024 by hynky1999

[BUG] batch_size = auto:1 issue bug

Something isn't working

#353 opened Oct 9, 2024 by alozowski

[BUG] assertion error assert text[: len(left)] == left on MATH wen Qwen-Math-2.5 bug

Something isn't working

#345 opened Oct 7, 2024 by d1shs0ap

[EVAL] Add ArenaHardAuto new task prio

#325 opened Sep 23, 2024 by lewtun

[EVAL] Add RewardBench new task

#324 opened Sep 23, 2024 by lewtun

[FT] Enable batched dataset_filter feature request

New feature/request

#322 opened Sep 21, 2024 by chuandudx

[BUG] AttributeError: 'str' object has no attribute 'category' bug

Something isn't working

#320 opened Sep 18, 2024 by Vanessa-Taing

[FT] LLM-as-judge example that doesn't require OPENAI_KEY or pro subscription of HF feature request

New feature/request

#318 opened Sep 18, 2024 by chuandudx

[FT] pass trust_remote_code as flag for loading datasets with custom code feature request

New feature/request

#314 opened Sep 16, 2024 by chuandudx

[FT] Provide an interface for easier edit of parametrizable metrics feature request

New feature/request

#312 opened Sep 16, 2024 by clefourrier

[BUG] Errors when using BERTScore for evaluation bug

Something isn't working

#310 opened Sep 16, 2024 by chuandudx

[FT] Remove obsolete config properties (frozen, output_regex) feature request

New feature/request

#305 opened Sep 13, 2024 by hynky1999

[FT] Task groupings as separate tasks feature request

New feature/request

#294 opened Sep 6, 2024 by hynky1999

[BUG] Question on batch preparation in MMLU evaluation bug

Something isn't working

#288 opened Sep 4, 2024 by JefferyChen453

[BUG] Nanotron batch detection doesn't work bug

Something isn't working

#286 opened Sep 3, 2024 by hynky1999

[BUG] Can not load deutsche-telekom/Ger-RAG-eval dataset. bug

Something isn't working

#278 opened Aug 23, 2024 by PhilipMay

[BUG] Zero accuracy in Hellaswag for Llama-2-7b (using 8bit quantization) bug

Something isn't working

#275 opened Aug 21, 2024 by rankofootball

[FT] Open ai endpoint new task

#273 opened Aug 21, 2024 by Pommel4711

[FT] IFEval and extended tasks are not in the test suite feature request

New feature/request

#261 opened Aug 14, 2024 by clefourrier

[FT] Detect max length from perplexity feature request

New feature/request

low prio

#257 opened Aug 13, 2024 by clefourrier

[FT] Add tool usage benchmarks feature request

New feature/request

#256 opened Aug 13, 2024 by NathanHB
