Issues: huggingface/lighteval
Issues list
[BUG] ImportError for custom tasks
bug
Something isn't working
#370
opened Oct 18, 2024 by
Bachstelze
[FT] Using lighteval to evaluate a model on a single sample, how?
feature request
New feature/request
#365
opened Oct 17, 2024 by
dxlong2000
[FT] Pipeline does not fully handle trust_remote_code to load dataset
feature request
New feature/request
#362
opened Oct 15, 2024 by
Sanahm
[FT] More general approach than output_regex to model answer extraction
feature request
New feature/request
#360
opened Oct 14, 2024 by
sadra-barikbin
[FT] Single token completion loglikelihood auto-detection
feature request
New feature/request
low prio
#355
opened Oct 10, 2024 by
hynky1999
[BUG] assertion error assert text[: len(left)] == left on MATH with Qwen-Math-2.5
bug
Something isn't working
#345
opened Oct 7, 2024 by
d1shs0ap
[FT] Enable batched dataset_filter
feature request
New feature/request
#322
opened Sep 21, 2024 by
chuandudx
[BUG] AttributeError: 'str' object has no attribute 'category'
bug
Something isn't working
#320
opened Sep 18, 2024 by
Vanessa-Taing
[FT] LLM-as-judge example that doesn't require OPENAI_KEY or pro subscription of HF
feature request
New feature/request
#318
opened Sep 18, 2024 by
chuandudx
[FT] pass trust_remote_code as flag for loading datasets with custom code
feature request
New feature/request
#314
opened Sep 16, 2024 by
chuandudx
[FT] Provide an interface for easier edit of parametrizable metrics
feature request
New feature/request
#312
opened Sep 16, 2024 by
clefourrier
[BUG] Errors when using BERTScore for evaluation
bug
Something isn't working
#310
opened Sep 16, 2024 by
chuandudx
[FT] Remove obsolete config properties (frozen, output_regex)
feature request
New feature/request
#305
opened Sep 13, 2024 by
hynky1999
[FT] Task groupings as separate tasks
feature request
New feature/request
#294
opened Sep 6, 2024 by
hynky1999
[BUG] Question on batch preparation in MMLU evaluation
bug
Something isn't working
#288
opened Sep 4, 2024 by
JefferyChen453
[BUG] Nanotron batch detection doesn't work
bug
Something isn't working
#286
opened Sep 3, 2024 by
hynky1999
[BUG] Can not load deutsche-telekom/Ger-RAG-eval dataset.
bug
Something isn't working
#278
opened Aug 23, 2024 by
PhilipMay
[BUG] Zero accuracy in Hellaswag for Llama-2-7b (using 8bit quantization)
bug
Something isn't working
#275
opened Aug 21, 2024 by
rankofootball
[FT] IFEval and extended tasks are not in the test suite
feature request
New feature/request
#261
opened Aug 14, 2024 by
clefourrier
[FT] Detect max length from perplexity
feature request
New feature/request
low prio
#257
opened Aug 13, 2024 by
clefourrier
[FT] Add tool usage benchmarks
feature request
New feature/request
#256
opened Aug 13, 2024 by
NathanHB