TypeError when load_dataset on nvcr.io/nvidia/nemo:25.11 #15324

@pcompieta

Describe the bug

Inside the nvcr.io/nvidia/nemo:25.11 container image, load_dataset("rajpurkar/squad") fails with the following error:

TypeError: must be called with a dataclass type or instance

Steps/Code to reproduce bug

  1. Shell into the container: docker run --rm -it nvcr.io/nvidia/nemo:25.11
  2. Load the dataset: python -c 'from datasets import load_dataset; load_dataset("rajpurkar/squad")'
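
To make the report easier to triage, the same reproduction can be wrapped in a small script that also records which package versions the container ships. This is only a sketch: the helper names `collect_versions` and `try_load` are hypothetical, and the package list suggested below is a guess at what might be relevant, not a confirmed cause of the error.

```python
import traceback
from importlib import metadata


def collect_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def try_load(dataset_name):
    """Attempt the failing call; return the traceback text, or None on success."""
    try:
        # Imported lazily so collect_versions() still works if datasets is missing.
        from datasets import load_dataset

        load_dataset(dataset_name)
    except Exception:
        return traceback.format_exc()
    return None
```

Inside the container, calling `collect_versions(["datasets", "huggingface_hub", "fsspec", "pyarrow"])` followed by `try_load("rajpurkar/squad")` and printing both results would give the version listing and the full traceback in one place.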

Expected behavior

The dataset downloads without error into the expected folder.

Environment overview (please complete the following information)

  • Environment location: Docker
  • Method of install: apt-get
  • If method of install is [Docker], provide docker pull & docker run commands used: see above "repro steps"

Additional context

Docker version 29.1.4

NVIDIA H100
NVIDIA-SMI 590.48.01
Driver Version: 590.48.01
CUDA Version: 13.1

The complete error output is attached.

load_dataset_error.txt
