Dataset loading issue for german_rag_evals on Windows #211

Pommel4711 · 2024-07-04T12:47:45Z

Hello, I don't know what I'm doing wrong. I received the following error as indicated in the title.

My input was as shown on this website:
Hugging Face - Ger-RAG-eval.

python run_evals_accelerate.py ^
  --model_args "pretrained=DiscoResearch/DiscoLM_German_7b_v1" ^
  --tasks "./examples/tasks/all_german_rag_evals.txt" ^
  --override_batch_size 1 ^
  --use_chat_template ^
  --custom_tasks "community_tasks/german_rag_evals.py" ^
  --output_dir "./evals/"

The output was as follows:

INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
WARNING:bitsandbytes.cextension:The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
WARNING:lighteval.logging.hierarchical_logger:main: (0, Namespace(model_config_path=None, model_args='pretrained=DiscoResearch/DiscoLM_German_7b_v1', max_samples=None, override_batch_size=1, job_id='', output_dir='./evals/', push_results_to_hub=False, save_details=False, push_details_to_hub=False, public_run=False, cache_dir=None, results_org=None, use_chat_template=True, system_prompt=None, dataset_loading_processes=1, custom_tasks='community_tasks/german_rag_evals.py', tasks='./examples/tasks/all_german_rag_evals.txt', num_fewshot_seeds=1)),  {
WARNING:lighteval.logging.hierarchical_logger:  Test all gather {
WARNING:lighteval.logging.hierarchical_logger:    Test gather tensor
WARNING:lighteval.logging.hierarchical_logger:    gathered_tensor tensor([0]), should be [0]
WARNING:lighteval.logging.hierarchical_logger:  } [0:00:00.010932]
WARNING:lighteval.logging.hierarchical_logger:  Creating model configuration {
WARNING:lighteval.logging.hierarchical_logger:  } [0:00:00]
WARNING:lighteval.logging.hierarchical_logger:  Model loading {
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
WARNING:lighteval.logging.hierarchical_logger:    Tokenizer truncation and padding size set to the left side.
WARNING:lighteval.logging.hierarchical_logger:    We are not in a distributed setting. Setting model_parallel to False.
WARNING:lighteval.logging.hierarchical_logger:    Model parallel was set to False, max memory set to None and device map to None
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:31<00:00, 10.60s/it]
WARNING:lighteval.logging.hierarchical_logger:    Using Data Parallelism, putting model on device cpu
WARNING:lighteval.logging.hierarchical_logger:    Model info: ModelInfo(model_name='DiscoResearch/DiscoLM_German_7b_v1', model_sha='560f972f9f735fc9289584b3aa8d75d0e539c44e', model_dtype='torch.bfloat16', model_size='13.49 GB')
WARNING:lighteval.logging.hierarchical_logger:  } [0:00:33.371562]
WARNING:lighteval.logging.hierarchical_logger:  Tasks loading {
WARNING:lighteval.logging.hierarchical_logger:  } [0:00:01.405496]
WARNING:lighteval.logging.hierarchical_logger:} [0:00:34.806011]
Traceback (most recent call last):
File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 117, in resolve_trust_remote_code
    signal.signal(signal.SIGALRM, _raise_timeout_error)
AttributeError: module 'signal' has no attribute 'SIGALRM'. Did you mean: 'SIGABRT'?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\Arbeit\AIUI\RAG Telecom Dataset\lighteval\run_evals_accelerate.py", line 82, in <module>
    main(args)
File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\logging\hierarchical_logger.py", line 166, in wrapper
    return fn(*args, **kwargs)
File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\main_accelerate.py", line 83, in main
    task_dict = Registry(cache_dir=env_config.cache_dir).get_task_dict(
File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\tasks\registry.py", line 135, in get_task_dict
    custom_tasks_module.append(create_custom_tasks_module(custom_tasks=custom_tasks))
File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\tasks\registry.py", line 170, in create_custom_tasks_module
    dataset_module = dataset_module_factory(str(custom_tasks))
File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 1814, in dataset_module_factory
    ).get_module()
File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 962, in get_module
    trust_remote_code = resolve_trust_remote_code(self.trust_remote_code, self.name)
File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 133, in resolve_trust_remote_code
    raise ValueError(
ValueError: The repository for german_rag_evals contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/german_rag_evals.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.

I discovered that the argument trust_remote_code=True must be passed as part of the model_args parameter. To fix the issue, I tried the following code, but unfortunately, the error persisted.

python run_evals_accelerate.py ^
--model_args "pretrained=DiscoResearch/DiscoLM_German_7b_v1,trust_remote_code=True" ^
--tasks "./examples/tasks/all_german_rag_evals.txt" ^
--override_batch_size 1 ^
--use_chat_template ^
--custom_tasks "community_tasks/german_rag_evals.py" ^
--output_dir "./evals/"

Maybe this can help.

When I entered the command accelerate env, I received the following output:

Copy-and-paste the text below in your GitHub issue

Accelerate version: 0.31.0
Platform: Windows-10-10.0.19045-SP0
accelerate bash location: C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\Scripts\accelerate.exe
Python version: 3.10.14
Numpy version: 1.26.4
PyTorch version (GPU?): 2.3.1+cpu (False)
PyTorch XPU available: False
PyTorch NPU available: False
PyTorch MLU available: False
System RAM: 15.90 GB
Accelerate default config:
- compute_environment: LOCAL_MACHINE
- distributed_type: NO
- mixed_precision: no
- use_cpu: False
- debug: False
- num_processes: 1
- machine_rank: 0
- num_machines: 1
- gpu_ids: 0
- rdzv_backend: static
- same_network: True
- main_training_function: main
- enable_cpu_affinity: False
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []


clefourrier · 2024-07-04T13:36:16Z

Hi!
The trust_remote_code=True message that you get is for the dataset loading, not the model.
@PhilipMay, iirc you were the one who added this dataset, can you change it so it does not require trust_remote_code=True ?

PhilipMay · 2024-07-04T14:19:47Z

Yes I can do that @clefourrier .
The problem is that I see no reason why the code thinks it needs to execute custom code to load the dataset.
Everything is "just parquet"...

@Pommel4711 here is the command I use to run the evaluation:
https://huggingface.co/datasets/deutsche-telekom/Ger-RAG-eval#usage

It works (worked) without trust_remote_code for me.

PhilipMay · 2024-07-04T14:22:40Z

Here is a Colab with code that shows that the dataset can be loaded without setting trust_remote_code:
https://colab.research.google.com/drive/1BUORL2_VxORGdIko6SMPqJqZIMUmtR-3?usp=sharing

clefourrier · 2024-07-04T14:40:48Z

Interesting, thanks a lot!

PhilipMay · 2024-07-04T15:14:49Z

@clefourrier and @Pommel4711
I think the root issue is this and not the dataset itself:

File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 117, in resolve_trust_remote_code
    signal.signal(signal.SIGALRM, _raise_timeout_error)
AttributeError: module 'signal' has no attribute 'SIGALRM'. Did you mean: 'SIGABRT'?

During handling of the above exception, another exception occurred:

Can you please check that?

Pommel4711 · 2024-07-04T16:43:01Z

@PhilipMay
Hey, I'm running this on Windows. Do you use Linux, or do you know how I can fix this problem? I came across this Stack Overflow post that might be related: Python Standard Lib Signal AttributeError: module 'signal' has no attribute 'SIGALRM'.

For reference, I'm running on this commit: a98210fd3a2d1e8bface1c32b.

Thanks for your help!

clefourrier · 2024-07-05T06:17:18Z

Hm, I'm going to ping @lhoestq on this then because it seems like a datasets issue.
Good job seeing this @PhilipMay !

PhilipMay · 2024-07-05T07:50:52Z

> Hm, I'm going to ping @lhoestq on this then because it seems like a datasets issue.

Good idea. Thanks.

lhoestq · 2024-07-08T10:17:41Z

OSes that don't support SIGALRM are supported thanks to a try/except - not sure how you managed to get the error related to SIGALRM ? (see https://github.com/huggingface/datasets/blob/689447f8c86f777829a4db9ccc5d8133c12ec84c/src/datasets/load.py#L113-L134)

Anyway feel free to update datasets and try again just in case
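For readers unfamiliar with the pattern mentioned above, here is a minimal sketch of guarding a SIGALRM-based timeout with a try/except so that platforms without SIGALRM (such as Windows) do not crash. This is a generic illustration, not the code in datasets; the names `run_with_optional_alarm` and `AlarmTimeout` are hypothetical, and the fallback chosen here (running without a timeout) is an assumption.

```python
import signal


class AlarmTimeout(Exception):
    """Raised by the alarm handler when the time limit is reached."""


def _raise_timeout(signum, frame):
    raise AlarmTimeout()


def run_with_optional_alarm(fn, seconds):
    """Run fn() with a SIGALRM timeout where the platform supports it.

    Returns (result, alarm_used). Hypothetical helper for illustration only.
    """
    try:
        previous = signal.signal(signal.SIGALRM, _raise_timeout)
    except (AttributeError, ValueError):
        # AttributeError: the platform has no SIGALRM (for example Windows).
        # ValueError: signal handlers can only be set from the main thread.
        return fn(), False
    signal.alarm(seconds)
    try:
        return fn(), True
    finally:
        signal.alarm(0)  # cancel any pending alarm
        # Restore the earlier handler (None means it was not set from Python).
        signal.signal(signal.SIGALRM, previous if previous is not None else signal.SIG_DFL)


result, alarm_used = run_with_optional_alarm(lambda: "loaded", 5)
print(result, alarm_used)
```

On Windows the second value would be False, because the attribute lookup on `signal.SIGALRM` fails and the except branch is taken.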

clefourrier · 2024-07-08T11:12:28Z

No problem for the transfer if needed

Pommel4711 · 2024-07-11T08:08:03Z

I copied the dataset code from this URL and now I get this error:

(lighteval) D:\Arbeit\AIUI\RAG Telecom Dataset\lighteval>python run_evals_accelerate.py ^
  --model_args "pretrained=DiscoResearch/DiscoLM_German_7b_v1" ^
  --tasks "./examples/tasks/all_german_rag_evals.txt" ^
  --override_batch_size 1 ^
  --use_chat_template ^
  --custom_tasks "community_tasks/german_rag_evals.py" ^
  --output_dir "./evals/"
Traceback (most recent call last):
  File "D:\Arbeit\AIUI\RAG Telecom Dataset\lighteval\run_evals_accelerate.py", line 30, in <module>
    from lighteval.main_accelerate import CACHE_DIR, main
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\main_accelerate.py", line 31, in <module>
    from lighteval.evaluator import evaluate, make_results_table
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\evaluator.py", line 32, in <module>
    from lighteval.logging.evaluation_tracker import EvaluationTracker
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\logging\evaluation_tracker.py", line 32, in <module>
    from datasets import Dataset, load_dataset
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\__init__.py", line 26, in <module>
    from .inspect import (
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\inspect.py", line 32, in <module>
    from .load import (
ImportError: cannot import name 'metric_module_factory' from 'datasets.load' (C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py)

clefourrier · 2024-07-11T08:18:00Z

Hi @Pommel4711 ,
Did you try to update datasets first as @lhoestq suggested?

Pommel4711 · 2024-07-11T08:21:57Z

> Hi @Pommel4711 , Did you try to update datasets first as @lhoestq suggested?

Yes, I did update datasets as @lhoestq suggested.

> OSes that don't support SIGALRM are supported thanks to a try/except - not sure how you managed to get the error related to SIGALRM ? (see https://github.com/huggingface/datasets/blob/689447f8c86f777829a4db9ccc5d8133c12ec84c/src/datasets/load.py#L113-L134)
>
> Anyway feel free to update datasets and try again just in case

Despite updating datasets, I get a new error. Any further suggestions would be greatly appreciated.

Thank you!

clefourrier · 2024-07-11T08:24:23Z

Just to be sure, how did you update the package, and what is the current version you are running?
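One way to answer the version question from inside the same Conda environment is to read the installed distribution metadata. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical, and the package names are the ones discussed in this thread.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Print the versions that matter for this issue.
for name in ("datasets", "lighteval", "accelerate"):
    print(name, installed_version(name) or "not installed")
```

Running this with the environment's own python executable avoids reporting the version from a different interpreter.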

Pommel4711 · 2024-07-11T08:48:24Z

Issue with `lighteval` Evaluation Script

Description

I completely removed the Conda environment lighteval and updated the repository using the following commands:

git pull
git checkout main

Checked out the main branch (commit ID = 4651531).

Then, I reinstalled the environment as follows:

conda create -n lighteval python=3.10 && conda activate lighteval
pip install .
pip install '.[accelerate,quantization,adapters]'

After that, I ran the evaluation script:

python run_evals_accelerate.py ^
  --model_args "pretrained=DiscoResearch/DiscoLM_German_7b_v1" ^
  --tasks "./examples/tasks/all_german_rag_evals.txt" ^
  --override_batch_size 1 ^
  --use_chat_template ^
  --custom_tasks "community_tasks/german_rag_evals.py" ^
  --output_dir "./evals/"

I encountered the following error:

File "D:\Arbeit\AIUI\RAG Telecom Dataset\lighteval\run_evals_accelerate.py", line 30, in <module>
    from lighteval.main_accelerate import CACHE_DIR, main
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\main_accelerate.py", line 31, in <module>
    from lighteval.evaluator import evaluate, make_results_table
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\evaluator.py", line 32, in <module>
    from lighteval.logging.evaluation_tracker import EvaluationTracker
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\logging\evaluation_tracker.py", line 37, in <module>
    from lighteval.logging.info_loggers import (
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\logging\info_loggers.py", line 34, in <module>
    from lighteval.metrics import MetricCategory
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\metrics\__init__.py", line 25, in <module>
    from lighteval.metrics.metrics import MetricCategory, Metrics
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\metrics\metrics.py", line 75, in <module>
    class Metrics(Enum):
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\metrics\metrics.py", line 235, in Metrics
    sample_level_fn=JudgeLLM(
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\metrics\metrics_sample.py", line 634, in __init__
    self.judge = JudgeOpenAI(
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\metrics\llm_as_judge.py", line 80, in __init__
    with open(templates_path, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\apps\\entwicklungsumgebung\\anaconda3\\envs\\lighteval\\lib\\site-packages\\lighteval\\metrics\\judge_prompts.jsonl'

To resolve this, I downloaded judge_prompts.jsonl from this link and placed it in the directory where the error occurred.
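A quick way to confirm whether a data file shipped with an installed package, without importing the package (which would trigger the same error), is sketched below. The helper name `installed_file_exists` is hypothetical; the relative path is taken from the FileNotFoundError above.

```python
import importlib.util
from pathlib import Path


def installed_file_exists(package, relative_path):
    """Check for a file inside an installed top-level package without importing it."""
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return False
    if spec is None or not spec.submodule_search_locations:
        return False
    # A package may have several search locations; any match counts.
    return any(
        (Path(root) / relative_path).is_file()
        for root in spec.submodule_search_locations
    )


print(installed_file_exists("lighteval", "metrics/judge_prompts.jsonl"))
```

If this prints False right after a fresh `pip install .`, the file was most likely left out of the built package rather than deleted locally.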

I ran the script again, which resulted in the following output:

INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
INFO:absl:Using default tokenizer.
WARNING:bitsandbytes.cextension:The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
WARNING:lighteval.logging.hierarchical_logger:main: (0, Namespace(model_config_path=None, model_args='pretrained=DiscoResearch/DiscoLM_German_7b_v1', max_samples=None, override_batch_size=1, job_id='', output_dir='./evals/', push_results_to_hub=False, save_details=False, push_details_to_hub=False, push_results_to_tensorboard=False, public_run=False, cache_dir=None, results_org=None, use_chat_template=True, system_prompt=None, dataset_loading_processes=1, custom_tasks='community_tasks/german_rag_evals.py', tasks='./examples/tasks/all_german_rag_evals.txt', num_fewshot_seeds=1)),  {
WARNING:lighteval.logging.hierarchical_logger:  Test all gather {
WARNING:lighteval.logging.hierarchical_logger:    Test gather tensor
WARNING:lighteval.logging.hierarchical_logger:    gathered_tensor tensor([0]), should be [0]
WARNING:lighteval.logging.hierarchical_logger:  } [0:00:00.006101]
WARNING:lighteval.logging.hierarchical_logger:  Creating model configuration {
WARNING:lighteval.logging.hierarchical_logger:  } [0:00:00]
WARNING:lighteval.logging.hierarchical_logger:  Model loading {
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
WARNING:lighteval.logging.hierarchical_logger:    Tokenizer truncation and padding size set to the left side.
WARNING:lighteval.logging.hierarchical_logger:    We are not in a distributed setting. Setting model_parallel to False.
WARNING:lighteval.logging.hierarchical_logger:    Model parallel was set to False, max memory set to None and device map to None
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [01:01<00:00, 20.60s/it]
WARNING:lighteval.logging.hierarchical_logger:    Using Data Parallelism, putting model on device cpu
WARNING:lighteval.logging.hierarchical_logger:    Model info: ModelInfo(model_name='DiscoResearch/DiscoLM_German_7b_v1', model_sha='560f972f9f735fc9289584b3aa8d75d0e539c44e', model_dtype='torch.bfloat16', model_size='13.49 GB')
WARNING:lighteval.logging.hierarchical_logger:  } [0:01:04.212504]
WARNING:lighteval.logging.hierarchical_logger:  Tasks loading {
WARNING:lighteval.logging.hierarchical_logger:  } [0:00:00.061989]
WARNING:lighteval.logging.hierarchical_logger:} [0:01:04.289455]
Traceback (most recent call last):
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 117, in resolve_trust_remote_code
    signal.signal(signal.SIGALRM, _raise_timeout_error)
AttributeError: module 'signal' has no attribute 'SIGALRM'. Did you mean: 'SIGABRT'?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Arbeit\AIUI\RAG Telecom Dataset\lighteval\run_evals_accelerate.py", line 89, in <module>
    main(args)
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\logging\hierarchical_logger.py", line 166, in wrapper
    return fn(*args, **kwargs)
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\main_accelerate.py", line 91, in main
    task_dict = Registry(cache_dir=env_config.cache_dir).get_task_dict(
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\tasks\registry.py", line 133, in get_task_dict
    custom_tasks_module.append(create_custom_tasks_module(custom_tasks=custom_tasks))
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\tasks\registry.py", line 168, in create_custom_tasks_module
    dataset_module = dataset_module_factory(str(custom_tasks))
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 1814, in dataset_module_factory
    ).get_module()
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 962, in get_module
    trust_remote_code = resolve_trust_remote_code(self.trust_remote_code, self.name)
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 133, in resolve_trust_remote_code
    raise ValueError(
ValueError: The repository for german_rag_evals contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/german_rag_evals.
Please pass the argument trust_remote_code=True to allow custom code to be run.

I deleted the dataset and replaced it with this version.

Upon running the script again, I encountered this error:

File "D:\Arbeit\AIUI\RAG Telecom Dataset\lighteval\run_evals_accelerate.py", line 30, in <module>
    from lighteval.main_accelerate import CACHE_DIR, main
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\main_accelerate.py", line 31, in <module>
    from lighteval.evaluator import evaluate, make_results_table
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\evaluator.py", line 32, in <module>
    from lighteval.logging.evaluation_tracker import EvaluationTracker
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\logging\evaluation_tracker.py", line 32, in <module>
    from datasets import Dataset, load_dataset
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\__init__.py", line 26, in <module>
    from .inspect import (
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\inspect.py", line 32, in <module>
    from .load import (
ImportError: cannot import name 'metric_module_factory' from 'datasets.load' (C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py)

This outlines the steps I took, the errors I encountered, and the troubleshooting steps I followed.

clefourrier · 2024-07-11T08:50:38Z

Thanks a lot for the detailed steps!
I think you should just run pip install -U datasets to upgrade datasets instead of manually editing files.

Pommel4711 · 2024-07-11T09:52:05Z

I tried running pip install -U datasets to upgrade datasets as you suggested, instead of manually editing the files. Unfortunately, this error still persists.

ImportError: cannot import name 'metric_module_factory' from 'datasets.load' (C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py)

Do you have any other suggestions on how to resolve this issue?

Thank you!

clefourrier · 2024-07-11T10:19:20Z

cc @lhoestq this sounds like a datasets issue, you can transfer the issue to your lib if needed :)

NathanHB · 2024-07-11T10:29:04Z

I was unable to reproduce the issue even following the steps. I think it is indeed a datasets issue. I am however going to fix the missing file issue :)

Pommel4711 · 2024-07-11T11:22:24Z

Maybe I found the problem with the dataset.

I followed the steps mentioned in this comment to resolve the issue, this time without deleting the file C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py or replacing it with the version from this link.

Instead, I tried upgrading the datasets library using the following command:

pip install -U datasets

However, after the upgrade, I noticed that the load.py file remains unchanged and is not the same as the one from this link.
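A possible explanation for the unchanged file is that pip does not reinstall a package that is already at the requested version, so a manually edited file would stay in place. Under that assumption, the sketch below compares a file on disk with the hash recorded at install time; the helper name `file_matches_record` is hypothetical, and it returns None when the package or the recorded hash is not available.

```python
import base64
import hashlib
from importlib import metadata


def file_matches_record(dist_name, relative_path):
    """Return True/False if the file matches its install-time hash, None if unknown."""
    try:
        files = metadata.files(dist_name)
    except metadata.PackageNotFoundError:
        return None
    for entry in files or []:
        if entry.as_posix() != relative_path or entry.hash is None:
            continue
        try:
            data = entry.locate().read_bytes()
        except OSError:
            return None
        digest = hashlib.new(entry.hash.mode, data).digest()
        # Installed-file records store the digest as unpadded URL-safe base64.
        encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
        return encoded == entry.hash.value
    return None


print(file_matches_record("datasets", "datasets/load.py"))
```

A result of False would suggest the file was modified after installation; in that case uninstalling and reinstalling the package should restore it.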

But then I am still left with this error:

(lighteval) D:\Arbeit\AIUI\RAG Telecom Dataset\lighteval>python run_evals_accelerate.py ^
  --model_args "pretrained=DiscoResearch/DiscoLM_German_7b_v1" ^
  --tasks "./examples/tasks/all_german_rag_evals.txt" ^
  --override_batch_size 1 ^
  --use_chat_template ^
  --custom_tasks "community_tasks/german_rag_evals.py" ^
  --output_dir "./evals/"
Using either accelerate or text-generation to run this script is advised.
main: (0, Namespace(model_config_path=None, model_args='pretrained=DiscoResearch/DiscoLM_German_7b_v1', max_samples=None, override_batch_size=1, job_id='', output_dir='./evals/', push_results_to_hub=False, save_details=False, push_details_to_hub=False, push_results_to_tensorboard=False, public_run=False, cache_dir=None, results_org=None, use_chat_template=True, system_prompt=None, dataset_loading_processes=1, custom_tasks='community_tasks/german_rag_evals.py', tasks='./examples/tasks/all_german_rag_evals.txt', num_fewshot_seeds=1)),  {
  Test all gather {
    Not running in a parallel setup, nothing to test
  } [0:00:00.001000]
  Creating model configuration {
  } [0:00:00]
  Model loading {
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
    Tokenizer truncation and padding size set to the left side.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:21<00:00,  7.09s/it]
    Using Data Parallelism, putting model on device cpu
    Model info: ModelInfo(model_name='DiscoResearch/DiscoLM_German_7b_v1', model_sha='560f972f9f735fc9289584b3aa8d75d0e539c44e', model_dtype='torch.bfloat16', model_size=-1)
  } [0:00:23.565683]
  Tasks loading {
  } [0:00:00.061002]
} [0:00:23.641685]
Traceback (most recent call last):
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 117, in resolve_trust_remote_code
    signal.signal(signal.SIGALRM, _raise_timeout_error)
AttributeError: module 'signal' has no attribute 'SIGALRM'. Did you mean: 'SIGABRT'?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Arbeit\AIUI\RAG Telecom Dataset\lighteval\run_evals_accelerate.py", line 89, in <module>
    main(args)
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\logging\hierarchical_logger.py", line 166, in wrapper
    return fn(*args, **kwargs)
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\main_accelerate.py", line 91, in main
    task_dict = Registry(cache_dir=env_config.cache_dir).get_task_dict(
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\tasks\registry.py", line 133, in get_task_dict
    custom_tasks_module.append(create_custom_tasks_module(custom_tasks=custom_tasks))
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\lighteval\tasks\registry.py", line 168, in create_custom_tasks_module
    dataset_module = dataset_module_factory(str(custom_tasks))
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 1814, in dataset_module_factory
    ).get_module()
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 962, in get_module
    trust_remote_code = resolve_trust_remote_code(self.trust_remote_code, self.name)
  File "C:\apps\entwicklungsumgebung\anaconda3\envs\lighteval\lib\site-packages\datasets\load.py", line 133, in resolve_trust_remote_code
    raise ValueError(
ValueError: The repository for german_rag_evals contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/german_rag_evals.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.

nouf01 · 2024-08-19T10:12:33Z

Are you trying to run the evaluation in offline mode? I got the same error, but I am running offline and have replaced the HF links with local paths, and the same trust_remote_code error keeps arising.

Pommel4711 · 2024-08-21T11:32:00Z

> Are you trying to run evaluation in offline mode? I got the same error but I am trying offline and I have replace HF links with local location but same trust_remote_code error keeps arising.

I always run this with an internet connection, but I don't know what the problem is. I switched to Linux and it worked.

PhilipMay · 2024-08-23T10:36:10Z

@Pommel4711 now I also have the same issue, and I am on Linux. So Windows should not be the root cause of the problem.

PhilipMay · 2024-08-23T10:50:41Z

@Pommel4711 I found a solution that works for me. See here: #278

The fix is to add export HF_DATASETS_TRUST_REMOTE_CODE=TRUE

But this should not be required. IMO this should be considered as a bug in lighteval.
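Since export is a Unix shell command, Windows cmd users would use set HF_DATASETS_TRUST_REMOTE_CODE=TRUE instead. A platform-neutral alternative is to launch the script from Python with the variable added to the child environment, as in the sketch below. The helper names `build_env` and `run_eval` are hypothetical; the script name and flags are the ones used in this thread.

```python
import os
import subprocess
import sys


def build_env(base=None):
    """Copy an environment mapping and enable trust_remote_code for datasets."""
    env = dict(os.environ if base is None else base)
    env["HF_DATASETS_TRUST_REMOTE_CODE"] = "TRUE"
    return env


def run_eval(script_args):
    """Run a Python script in a child process with the variable set."""
    return subprocess.run([sys.executable, *script_args], env=build_env(), check=False)


# Example call (not executed here):
# run_eval(["run_evals_accelerate.py",
#           "--model_args", "pretrained=DiscoResearch/DiscoLM_German_7b_v1",
#           "--tasks", "./examples/tasks/all_german_rag_evals.txt",
#           "--custom_tasks", "community_tasks/german_rag_evals.py",
#           "--output_dir", "./evals/"])
```

Setting the variable in the child environment, rather than inside the already running process, avoids depending on when the library reads it.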

lhoestq · 2024-08-23T12:10:50Z

can you try uninstalling and reinstalling datasets?

PhilipMay · 2024-08-23T13:36:48Z

> can you try uninstalling and reinstalling datasets?

You mean a pip install -U datasets might not be enough?
@lhoestq

lhoestq · 2024-08-23T17:01:08Z

I double checked and actually the 'SIGALRM' error is not important (it just shows for Windows users in addition to the trust_remote_code error, which is the actual error).

Anyway, there seems to be a dataset called german_rag_evals that is based on a Python script that requires remote code to be executed. You must pass trust_remote_code=True (or set the environment variable) to access it.

I couldn't find this dataset on HF though, is it a local dataset of yours ?
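To illustrate why the SIGALRM traceback appears next to the ValueError: when an exception is raised inside an except block, Python chains it to the one being handled and prints both, separated by "During handling of the above exception, another exception occurred". The sketch below reproduces only that mechanism; `resolve_trust` is a hypothetical stand-in, not the datasets function.

```python
import traceback


def resolve_trust(name):
    """Hypothetical stand-in for a loader that fails on a platform without SIGALRM."""
    try:
        # Simulates the attribute lookup failing on Windows.
        raise AttributeError("module 'signal' has no attribute 'SIGALRM'")
    except Exception:
        # Raising here chains the new error to the AttributeError above.
        raise ValueError(f"{name}: please pass trust_remote_code=True")


def chained_report(name):
    """Return the final exception and its full formatted traceback."""
    try:
        resolve_trust(name)
    except ValueError as err:
        text = "".join(traceback.format_exception(type(err), err, err.__traceback__))
        return err, text


error, report = chained_report("german_rag_evals")
print(report)
```

The last line of the printed report is the ValueError, which is the error that actually stops the run; the AttributeError is kept only as context.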

lhoestq · 2024-08-23T17:02:44Z

Ah it's community_tasks/german_rag_evals.py apparently ? Well maybe you should point to a dataset on HF with data e.g. in parquet files instead. (and remove this script from lighteval ?)

PhilipMay · 2024-08-23T17:09:04Z

> Ah it's community_tasks/german_rag_evals.py apparently ? Well maybe you should point to a dataset on HF with data e.g. in parquet files instead. (and remove this script from lighteval ?)

I think this is not how lighteval is supposed to work.
What do you think @clefourrier ? What I did is written here: #278

lhoestq · 2024-08-23T17:12:24Z

german_rag_evals.py is not a dataset script actually, datasets can't read it.

So it looks like lighteval uses datasets' dataset_module_factory() function to open this file, maybe lighteval should have its own function to do that
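As a rough idea of what such a function could look like, the standard library can load a plain Python file as a module without going through datasets at all. This is a minimal sketch under that assumption, not lighteval's actual implementation; the name `load_module_from_path` is hypothetical.

```python
import importlib.util
from pathlib import Path


def load_module_from_path(path):
    """Import a standalone .py file as a module and return it."""
    path = Path(path)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load a module from {path}")
    module = importlib.util.module_from_spec(spec)
    # Executes the file's top-level code, exactly like a normal import would.
    spec.loader.exec_module(module)
    return module


# Example call (not executed here):
# tasks_module = load_module_from_path("community_tasks/german_rag_evals.py")
```

Loading the file this way runs local code the user already chose to pass on the command line, so no trust_remote_code prompt is involved.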

PhilipMay · 2024-08-23T17:17:28Z

> german_rag_evals.py is not a dataset script actually, datasets can't read it.
>
> So it looks like lighteval uses datasets' dataset_module_factory() function to open this file, maybe lighteval should have its own function to do that

This may be the case and may be the cause of this issue.
@clefourrier

PhilipMay · 2024-09-11T14:18:40Z

@NathanHB we have new insights into this issue - see comments from me above.
Can you please have a look?

clefourrier · 2024-09-14T09:38:40Z

> So it looks like lighteval uses datasets' dataset_module_factory() function to open this file, maybe lighteval should have its own function to do that

Interesting, I'll take a look this week

Pommel4711 closed this as completed Jul 11, 2024

Pommel4711 reopened this Jul 11, 2024

clefourrier assigned NathanHB Jul 17, 2024

PhilipMay mentioned this issue Aug 23, 2024

[BUG] Can not load deutsche-telekom/Ger-RAG-eval dataset. #278

Open
