loading big models into memory #3153

werruww · 2024-10-10T02:31:11Z

System Info

colab
t4


https://huggingface.co/docs/accelerate/concept_guides/
https://huggingface.co/docs/accelerate/concept_guides/big_model_inference

If I have a single 16 GB GPU and a CPU, how do I run a model that is larger than the GPU's memory across both the GPU and the CPU, so that I still benefit from GPU acceleration? Is the code I ran correct, or can it be modified to achieve good results?

Information

The official example scripts
My own modified scripts

Tasks

One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
My own task or dataset (give details below)

Reproduction

!git clone https://github.com/karpathy/minGPT.git
!pip install minGPT/
!pip install huggingface_hub

!pip install accelerate --upgrade

from huggingface_hub import snapshot_download
checkpoint = "marcsun13/gpt2-xl-linear-sharded"
weights_location = snapshot_download(repo_id=checkpoint)

from accelerate import init_empty_weights
from mingpt.model import GPT

model_config = GPT.get_default_config()
model_config.model_type = 'gpt2-xl'
model_config.vocab_size = 50257
model_config.block_size = 1024

with init_empty_weights():
    model = GPT(model_config)

from accelerate import load_checkpoint_and_dispatch

model = load_checkpoint_and_dispatch(
    model, checkpoint=weights_location, device_map="auto", no_split_module_classes=['Block']
)

from mingpt.bpe import BPETokenizer
tokenizer = BPETokenizer()
inputs = tokenizer("who is python?").to(0)

# Change x1 to inputs
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)[0]
tokenizer.decode(outputs.cpu().squeeze())

Expected behavior

The code runs fine.
python is a popular open source Python library for data analysis. It is used by many Python developers to perform data analysis tasks.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language. It is used by many people to do many things.
Python is a very popular programming language

who is python?

I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure. I'm not sure.


werruww · 2024-10-10T02:51:20Z

from huggingface_hub import snapshot_download
import torch
from accelerate import infer_auto_device_map
from transformers import AutoModelForCausalLM, AutoConfig

checkpoint = "marcsun13/gpt2-xl-linear-sharded"
weights_location = snapshot_download(repo_id=checkpoint)

# Instead of loading directly from checkpoint, use 'gpt2-xl' as base
# and load the sharded weights into it.
config = AutoConfig.from_pretrained("gpt2-xl")  # Load config for gpt2-xl

# Now load the model using the gpt2-xl configuration and downloaded sharded weights
model = AutoModelForCausalLM.from_pretrained(
    weights_location, config=config, torch_dtype=torch.float16, ignore_mismatched_sizes=True
)

# Now use the model object in infer_auto_device_map
device_map = infer_auto_device_map(
    model, max_memory={0: "10GiB", "cpu": "10GiB"}
)

from accelerate import init_empty_weights
from mingpt.model import GPT

model_config = GPT.get_default_config()
model_config.model_type = 'gpt2-xl'
model_config.vocab_size = 50257
model_config.block_size = 1024

with init_empty_weights():
    model = GPT(model_config)

from accelerate import load_checkpoint_and_dispatch

model = load_checkpoint_and_dispatch(
    model, checkpoint=weights_location, device_map="auto", no_split_module_classes=['Block']
)


model.hf_device_map

from mingpt.bpe import BPETokenizer

tokenizer = BPETokenizer()
inputs = tokenizer("Who is Napoleon Bonaparte?").to(0)

# Use 'inputs' instead of 'x1' here
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)[0]
tokenizer.decode(outputs.cpu().squeeze())




Fetching 9 files: 100%
 9/9 [00:00<00:00, 370.91it/s]
Loading checkpoint shards: 100%
 7/7 [00:01<00:00,  4.30it/s]
Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at /root/.cache/huggingface/hub/models--marcsun13--gpt2-xl-linear-sharded/snapshots/aeb281f0cd2bfc947d4702b27aecd9194c322c7e and are newly initialized because the shapes did not match:
- transformer.h.0.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.0.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.0.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.1.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.1.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.1.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.2.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.2.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.2.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.3.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.3.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.3.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.4.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.4.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.4.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.5.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.10.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.10.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.10.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.11.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.11.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.11.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.12.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.12.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.12.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.5.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.5.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.6.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.6.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.6.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.7.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.7.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.7.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.8.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.8.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.8.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.9.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.9.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.9.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.13.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.13.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.13.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.14.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.14.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.14.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.15.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.15.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.15.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.16.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.16.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.16.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.17.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.17.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.17.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.18.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.18.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.18.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.19.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.19.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.19.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.20.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.20.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.20.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.21.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.21.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.21.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.22.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.22.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.22.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.23.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.23.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.23.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.24.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.24.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.24.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.25.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.25.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.25.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.26.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.26.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.26.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.27.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.27.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.27.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.28.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.28.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.28.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.29.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.29.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.29.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.30.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.30.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.30.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.31.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.31.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.31.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.32.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.32.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.32.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.33.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.33.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.33.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.34.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.34.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.34.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.35.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.35.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.35.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.36.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.36.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.36.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.37.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.37.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.37.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.38.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.38.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.38.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.39.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.39.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.39.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.40.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.40.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.40.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.41.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.41.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.41.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.42.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.42.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.42.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.43.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.43.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.43.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.44.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.44.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.44.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.45.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.45.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.45.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.46.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.46.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.46.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
- transformer.h.47.attn.c_attn.weight: found shape torch.Size([4800, 1600]) in the checkpoint and torch.Size([1600, 4800]) in the model instantiated
- transformer.h.47.mlp.c_fc.weight: found shape torch.Size([6400, 1600]) in the checkpoint and torch.Size([1600, 6400]) in the model instantiated
- transformer.h.47.mlp.c_proj.weight: found shape torch.Size([1600, 6400]) in the checkpoint and torch.Size([6400, 1600]) in the model instantiated
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
number of parameters: 1557.61M
Who is Napoleon Bonaparte?\n\nNapoleon Bonaparte was a French general who led the French army during the French Revolution. He was the first to use the term "Napoleon" to describe himself.\n\nWhat is the name of the French Revolution?\n\nThe French Revolution was a period of political and social upheaval in France that began in 1789. It was the first of the French revolutions, and was the first to be led by a man.\n\nWhat is the name of the French Revolution?\n\nThe French Revolution was a period of political and social upheaval in France that began in 1789. It was the first of the French revolutions, and was the first to be led by a man.\n\nWhat is the name of the French Revolution?\n\nThe French Revolution was a period of political and social upheaval in France that began in 1789. It was the first of the French revolutions, and was the first to be led by a man.\n\nWhat is the name of the French Revolution?\n\nThe French Revolution was a period of political and social upheaval in Fran
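
The mismatches in the warning above all follow one pattern: each checkpoint shape is the transpose of the shape the model expects (for example [4800, 1600] versus [1600, 4800]). That would be consistent with the checkpoint storing these weights in `nn.Linear` layout (out_features, in_features), as minGPT uses, while `GPT2LMHeadModel` keeps them in its `Conv1D` layout (in_features, out_features). With `ignore_mismatched_sizes=True`, the log says such weights are newly initialized rather than loaded. A minimal sketch, using plain tuples and a hypothetical helper name, of how one could confirm that every mismatch is a pure transpose:

```python
def is_transposed_pair(checkpoint_shape, model_shape):
    """Return True when two 2-D shapes differ only by a transpose."""
    checkpoint_shape = tuple(checkpoint_shape)
    model_shape = tuple(model_shape)
    return (
        len(checkpoint_shape) == 2
        and len(model_shape) == 2
        and checkpoint_shape != model_shape
        and checkpoint_shape == tuple(reversed(model_shape))
    )


# (checkpoint shape, model shape) pairs copied from the warning for transformer.h.0.
mismatches = {
    "transformer.h.0.attn.c_attn.weight": ((4800, 1600), (1600, 4800)),
    "transformer.h.0.mlp.c_fc.weight": ((6400, 1600), (1600, 6400)),
    "transformer.h.0.mlp.c_proj.weight": ((1600, 6400), (6400, 1600)),
}

all_transposed = all(
    is_transposed_pair(ckpt, model) for ckpt, model in mismatches.values()
)
print(all_transposed)
```

If every mismatch is a transpose, the checkpoint and the model class most likely disagree about weight layout rather than about model size.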

werruww · 2024-10-10T02:51:41Z

If I have a single 16 GB GPU and a CPU, how do I run a model that is larger than the GPU's memory across both the GPU and the CPU, so that I still benefit from GPU acceleration? Is the code I ran correct, or can it be modified to achieve good results?

werruww · 2024-10-10T02:53:58Z

What are the steps, from A to Z, to run a model larger than the memory of a 16 GB GPU on the GPU and the CPU together? Starting from downloading the model, then creating an empty model, then loading the weights into it, then running it on a prompt or a text completion.
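
Whether a model fits on a 16 GB card depends first on the size of its weights. As a rough sketch (weights only; activations, the generation cache and framework overhead are ignored, and the helper name is hypothetical), the parameter count reported above for gpt2-xl, 1557.61M, can be turned into a memory estimate like this:

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params, dtype="float16"):
    """Estimate the memory taken by the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


gpt2_xl_params = 1557.61e6  # "number of parameters: 1557.61M" from the log above

for dtype in ("float32", "float16"):
    print(f"{dtype}: {weight_memory_gib(gpt2_xl_params, dtype):.2f} GiB")
```

This comes to roughly 5.8 GiB in float32 and 2.9 GiB in float16, so this particular model fits on a 16 GB GPU either way; offloading only becomes necessary for larger checkpoints.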

muellerzr · 2024-10-10T17:58:10Z

@werruww please do not spam this with nearly the same result. It makes us think that this is an LLM instead of a real problem, and bloats our notifications as well

muellerzr · 2024-10-10T17:59:35Z

In general, pass device_map="auto": Accelerate will place as much of the model as fits on the GPU, offload the rest to the CPU and then the hard drive, and run the model from there.
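
As a simplified illustration of that behaviour (not Accelerate's actual algorithm, which also has to handle details such as tied weights and modules that must not be split), the idea is to walk through the model's blocks in order and fill each device up to its memory budget before moving on to the next one. All names and sizes below are hypothetical:

```python
def greedy_device_map(module_sizes, max_memory):
    """Assign modules, in order, to the first device that still has room.

    module_sizes: dict of module name -> size in bytes, in execution order
    max_memory:   dict of device -> budget in bytes, fastest device first
    Modules that fit on no listed device are sent to "disk".
    Once a device is full, earlier devices are never revisited, which keeps
    consecutive blocks on the same device.
    """
    remaining = dict(max_memory)
    devices = list(max_memory)
    current = 0
    device_map = {}
    for name, size in module_sizes.items():
        # Skip ahead until a device with enough free memory is found.
        while current < len(devices) and remaining[devices[current]] < size:
            current += 1
        if current == len(devices):
            device_map[name] = "disk"
        else:
            device = devices[current]
            remaining[device] -= size
            device_map[name] = device
    return device_map


GiB = 2**30
sizes = {f"block.{i}": 3 * GiB for i in range(8)}  # eight made-up 3 GiB blocks
placement = greedy_device_map(sizes, {0: 10 * GiB, "cpu": 10 * GiB})
print(placement)
```

With these made-up numbers, the first three blocks land on GPU 0, the next three on the CPU, and the last two on disk.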

werruww · 2024-10-10T19:14:31Z

from huggingface_hub import snapshot_download
checkpoint = "marcsun13/gpt2-xl-linear-sharded"
weights_location = snapshot_download(repo_id=checkpoint)

from accelerate import init_empty_weights
from mingpt.model import GPT

model_config = GPT.get_default_config()
model_config.model_type = 'gpt2-xl'
model_config.vocab_size = 50257
model_config.block_size = 1024

with init_empty_weights():
    model = GPT(model_config)

from accelerate import load_checkpoint_and_dispatch

model = load_checkpoint_and_dispatch(
    model, checkpoint=weights_location, device_map="auto", no_split_module_classes=['Block']
)

from mingpt.bpe import BPETokenizer
tokenizer = BPETokenizer()
inputs = tokenizer("who is python?").to(0)

# Change x1 to inputs

outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)[0]
tokenizer.decode(outputs.cpu().squeeze())

device_map="auto" instead of what?
model = load_checkpoint_and_dispatch(
    model, checkpoint=weights_location, device_map="auto", no_split_module_classes=['Block']
)

This is the code. What should I modify?

werruww · 2024-10-10T19:16:20Z

Remove
device_map = infer_auto_device_map(
    model, max_memory={0: "10GiB", "cpu": "10GiB"}
)

and put
device_map="auto"

werruww · 2024-10-10T19:18:38Z

Could you please write a complete example that I can trust?
I copied the code from the site piece by piece.
If possible, please write complete code that reassures me it will build an empty model, load the weights into it, and run it on the Vega and then the processor, no matter the size of the model.

werruww · 2024-10-10T19:32:29Z

If possible, please share a Colab notebook with a 24 GB TPU
and a model larger than 24 GB,
to make things clear.
Thank you.

werruww · 2024-10-10T22:15:58Z

ValueError Traceback (most recent call last)
in <cell line: 3>()
1 from accelerate import load_checkpoint_and_dispatch
2
----> 3 model = load_checkpoint_and_dispatch(
4 model, checkpoint=weights_location, device_map="auto", no_split_module_classes=['Block']
5 )

2 frames
/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py in set_module_tensor_to_device(module, tensor_name, device, value, dtype, fp16_statistics, tied_params_map)
371 # In other cases, we want to make sure we're not loading checkpoints that do not match the config.
372 if old_value.shape != value.shape and param_cls.name != "Params4bit":
--> 373 raise ValueError(
374 f'Trying to set a tensor of shape {value.shape} in "{tensor_name}" (which has shape {old_value.shape}), this looks incorrect.'
375 )

ValueError: Trying to set a tensor of shape torch.Size([32768, 4096]) in "weight" (which has shape torch.Size([32768, 768])), this looks incorrect.

code

from huggingface_hub import snapshot_download
checkpoint = "mistralai/Mistral-7B-Instruct-v0.3"
weights_location = snapshot_download(repo_id=checkpoint)

from accelerate import init_empty_weights
from mingpt.model import GPT

model_config = GPT.get_default_config()
model_config.model_type = 'gpt2'
model_config.vocab_size = 32768
model_config.block_size = 768
model_config.hidden_size = 768
with init_empty_weights():
    model = GPT(model_config)

from accelerate import init_empty_weights
from mingpt.model import GPT
model_config = GPT.get_default_config()
model_config.model_type = 'mistral'
model_config.vocab_size = 32000 # vocabulary size for Mistral
model_config.block_size = 4096 # maximum context length
model_config.n_layer = 32 # number of layers
model_config.n_head = 32 # number of attention heads
model_config.n_embd = 4096 # hidden embedding size

with init_empty_weights():
    model = GPT(model_config)

from accelerate import load_checkpoint_and_dispatch

model = load_checkpoint_and_dispatch(
    model, checkpoint=weights_location, device_map="auto", no_split_module_classes=['Block']
)

model.hf_device_map

from mingpt.bpe import BPETokenizer
tokenizer = BPETokenizer()
inputs = tokenizer("Who is Napoleon Bonaparte?").to(0)

# Use 'inputs' instead of 'x1' for model generation

outputs = model.generate(inputs, max_new_tokens=1024, do_sample=False)[0]
tokenizer.decode(outputs.cpu().squeeze())

werruww · 2024-10-10T22:20:03Z

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3/discussions/87

werruww · 2024-10-10T22:20:49Z

Extended vocabulary to 32768

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3

werruww · 2024-10-10T22:22:14Z

I ran the code on a Colab T4 with 12 GB of RAM.

werruww · 2024-10-10T22:25:17Z

ValueError Traceback (most recent call last)
in <cell line: 3>()
1 from accelerate import load_checkpoint_and_dispatch
2
----> 3 model = load_checkpoint_and_dispatch(
4 model, checkpoint=weights_location, device_map="auto", no_split_module_classes=['Block']
5 )

2 frames
/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py in set_module_tensor_to_device(module, tensor_name, device, value, dtype, fp16_statistics, tied_params_map)
371 # In other cases, we want to make sure we're not loading checkpoints that do not match the config.
372 if old_value.shape != value.shape and param_cls.name != "Params4bit":
--> 373 raise ValueError(
374 f'Trying to set a tensor of shape {value.shape} in "{tensor_name}" (which has shape {old_value.shape}), this looks incorrect.'
375 )

ValueError: Trying to set a tensor of shape torch.Size([32768, 4096]) in "weight" (which has shape torch.Size([32768, 768])), this looks incorrect.
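The ValueError above compares two shapes: the checkpoint tensor is [32768, 4096], while the empty model expects [32768, 768]. The first dimension (vocabulary) matches, but the hidden size of the instantiated model (768) differs from the checkpoint's (4096), so the empty model was not built with the checkpoint's dimensions. A small sketch of checking this before dispatching follows; the helper name and the parameter name are hypothetical, and only the two shapes come from the error message.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    model_shapes: Dict[str, Shape],
    checkpoint_shapes: Dict[str, Shape],
) -> List[str]:
    """Return one readable message per checkpoint tensor that does not fit."""
    problems = []
    for name, ckpt_shape in checkpoint_shapes.items():
        if name not in model_shapes:
            problems.append(f"{name}: present in checkpoint, missing in model")
        elif model_shapes[name] != ckpt_shape:
            problems.append(
                f"{name}: checkpoint {ckpt_shape} vs model {model_shapes[name]}"
            )
    return problems


# Shapes taken from the error message; the parameter name is illustrative.
model_shapes = {"embedding.weight": (32768, 768)}
checkpoint_shapes = {"embedding.weight": (32768, 4096)}

mismatches = find_shape_mismatches(model_shapes, checkpoint_shapes)
for line in mismatches:
    print(line)
```

With a real PyTorch model, the first dictionary could be built with `{n: tuple(p.shape) for n, p in model.named_parameters()}`. Matching shapes would still not be enough here: a minGPT `GPT` module and a Mistral checkpoint use different architectures and parameter names, so the checkpoint needs a model class written for it.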

werruww · 2024-10-11T00:12:54Z

{
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 32768,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.42.0.dev0",
  "use_cache": true,
  "vocab_size": 32768
}

This config has no block_size field.
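The config above is enough to estimate how much memory the weights need, which is what decides whether anything must be offloaded. The sketch below is a rough estimate under stated assumptions, not an official formula: it assumes a Llama-style layer layout (bias-free q/k/v/o projections with grouped key/value heads, three MLP projections, two norm vectors per layer, plus one final norm), and the function name is hypothetical. The closest analogue of minGPT's block_size in this config appears to be max_position_embeddings, which does not affect the parameter count in this layout.

```python
import json

# Only the fields needed for the estimate, copied from the config above.
CONFIG_JSON = """
{
  "hidden_size": 4096,
  "intermediate_size": 14336,
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "vocab_size": 32768
}
"""

BYTES_PER_DTYPE = {"float32": 4, "float16": 2, "bfloat16": 2}


def count_parameters(cfg: dict) -> int:
    """Estimate the parameter count, assuming a Llama-style decoder layout."""
    hidden = cfg["hidden_size"]
    head_dim = hidden // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]
    attention = 2 * hidden * hidden + 2 * hidden * kv_dim  # q, o and k, v
    mlp = 3 * hidden * cfg["intermediate_size"]  # gate, up, down
    norms = 2 * hidden  # two norm weight vectors per layer
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * hidden
    lm_head = 0 if cfg["tie_word_embeddings"] else cfg["vocab_size"] * hidden
    final_norm = hidden
    return cfg["num_hidden_layers"] * per_layer + embeddings + lm_head + final_norm


cfg = json.loads(CONFIG_JSON)
total_params = count_parameters(cfg)
weight_gib = total_params * BYTES_PER_DTYPE[cfg["torch_dtype"]] / 1024 ** 3
print(f"{total_params:,} parameters, about {weight_gib:.1f} GiB of weights")
```

This counts weights only; activations and the generation cache need additional memory, so a model whose weights barely fit on the GPU will likely still need some layers offloaded.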

werruww · 2024-10-11T00:13:49Z

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3/discussions/88

werruww · 2024-10-11T01:25:34Z

from huggingface_hub import snapshot_download
checkpoint = "openai-community/gpt2"
weights_location = snapshot_download(repo_id=checkpoint)

import torch.nn as nn # import the torch.nn module and alias it as nn
from accelerate import init_empty_weights

with init_empty_weights():
    model = nn.Sequential(*[nn.Linear(10000, 10000) for _ in range(1000)])

import torch
import torch.nn as nn
from huggingface_hub import snapshot_download
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from transformers import GPT2LMHeadModel

# Download checkpoint weights
checkpoint = "openai-community/gpt2"
weights_location = snapshot_download(repo_id=checkpoint)

# Initialize an empty model of the correct type, but load the weights immediately
# instead of using init_empty_weights
# with init_empty_weights(): # Remove this line
#     model = GPT2LMHeadModel.from_pretrained(checkpoint, torch_dtype=torch.float16)

# Initialize the model and load weights directly
model = GPT2LMHeadModel.from_pretrained(checkpoint, torch_dtype=torch.float16)

# Load the checkpoint weights into the model, dispatching to appropriate devices
# Note: If you want to load specific weights from the checkpoint file,
# you'll need to modify this part to load the state_dict explicitly.
model = load_checkpoint_and_dispatch(
    model,
    checkpoint=weights_location,
    device_map="auto",
    offload_folder="offload_folder", # Use a folder name, not "True"
    no_split_module_classes=['Block']
)

import torch.nn as nn # import the torch.nn module and alias it as nn
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from transformers import AutoModelForCausalLM # Import AutoModelForCausalLM
from huggingface_hub import snapshot_download

# Download the checkpoint
checkpoint = "openai-community/gpt2"
weights_location = snapshot_download(repo_id=checkpoint)

# Option 1: Load the pre-trained GPT-2 model
# Instead of creating a sequential model, use AutoModelForCausalLM to load GPT-2 directly
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Option 2: Update the device_map to be compatible with the Sequential model.
# NOTE: This assumes the checkpoint is compatible with your sequential model.
# It is more likely that you will need to create a model compatible with your checkpoint.
# device_map = {}
# for i in range(1000):
#     device_map[f"{i}.weight"] = "cpu" # Map weights of each layer to CPU
#     device_map[f"{i}.bias"] = "cpu" # Map biases of each layer to CPU

# Load the checkpoint and dispatch
model = load_checkpoint_and_dispatch(
    model, checkpoint=weights_location, device_map="auto", offload_folder="offload_folder"
)

import torch
from tokenizers import ByteLevelBPETokenizer
from transformers import GPT2Tokenizer

# Instantiate the GPT-2 tokenizer instead of ByteLevelBPETokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Use the tokenizer

inputs = tokenizer("Hello, my name is", return_tensors="pt").input_ids.to("cpu")

outputs = model.generate(inputs, max_new_tokens=10, do_sample=False)[0]
decoded_output = tokenizer.decode(outputs.cpu().squeeze().tolist())
print(decoded_output)

/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: clean_up_tokenization_spaces was not set. It will be set to True by default. This behavior will be depracted in transformers v4.45, and will be then set to False by default. For more details check this issue: huggingface/transformers#31884
warnings.warn(
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Hello, my name is John. I'm a writer, and I'm

werruww · 2024-10-11T01:26:02Z

This ran on Colab with no T4 and no TPU.

werruww · 2024-10-11T01:29:25Z

How do I create a model that is not from the GPT family, and without minGPT,

such as Mistral, Phi-3.5, Llama 3.1, or Qwen?
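One possible route for such models, sketched here as an assumption rather than a maintainer-endorsed recipe, is to let transformers pick the architecture from the checkpoint and pass device_map="auto" to from_pretrained, so that no hand-written model class is needed. The helper names and the default memory limits below are hypothetical (the "10GiB" values mirror the ones quoted earlier in the thread), and the loading function is not executed here because it downloads weights.

```python
from typing import Any, Dict, Optional


def build_load_kwargs(
    gpu_limit: Optional[str] = "10GiB",
    cpu_limit: Optional[str] = "10GiB",
    offload_folder: str = "offload",
) -> Dict[str, Any]:
    """Collect the placement-related arguments for from_pretrained."""
    kwargs: Dict[str, Any] = {"device_map": "auto", "offload_folder": offload_folder}
    max_memory: Dict[Any, str] = {}
    if gpu_limit is not None:
        max_memory[0] = gpu_limit  # key 0 means the first GPU
    if cpu_limit is not None:
        max_memory["cpu"] = cpu_limit
    if max_memory:
        kwargs["max_memory"] = max_memory
    return kwargs


def generate_text(checkpoint: str, prompt: str, max_new_tokens: int = 50) -> str:
    """Load any causal LM checkpoint with automatic placement and complete a prompt."""
    # Imported lazily so build_load_kwargs can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    has_gpu = torch.cuda.is_available()
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(
        checkpoint,
        torch_dtype=torch.float16 if has_gpu else torch.float32,
        **build_load_kwargs(gpu_limit="10GiB" if has_gpu else None),
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


print(build_load_kwargs())
# Example call (downloads the checkpoint, so it is left commented out):
# print(generate_text("openai-community/gpt2", "Hello, my name is"))
```

Swapping the checkpoint string for another causal LM repository should be the only change needed, provided the installed transformers version supports that architecture and the account has access to the repository.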

werruww · 2024-10-14T21:22:50Z

https://github.com/werruww/run-prompt-with-accelerate

werruww changed the title ~~oading big models into memory~~ loading big models into memory Oct 10, 2024
