
Fix tuple object error #1354

Open
wants to merge 1 commit into main

Conversation

@SupreetSinghPalne

SupreetSinghPalne commented Sep 23, 2024

What does this PR do?

Summary (you can reproduce it with the steps below):

The issue is not seen on 1 HPU.
The issue is seen with Optimum Habana v1.13 on 8 HPUs.
The issue is not seen with Optimum Habana v1.12 on 8 HPUs.

Replicate:

Reserve Gaudi2 with driver 1.17-495
docker pull vault.habana.ai/gaudi-docker/1.17.0/ubuntu22.04/habanalabs/pytorch-installer-2.3.1:1.17.0-495
git clone https://github.com/huggingface/optimum-habana
git checkout v1.13.2

docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --rm --cap-add=sys_nice --net=host --ipc=host -v $PWD:/root -v $PWD/data:/data --workdir=/root/ vault.habana.ai/gaudi-docker/1.17.0/ubuntu22.04/habanalabs/pytorch-installer-2.3.1:1.17.0-495

export HTTPS_PROXY=http://proxy-dmz.intel.com:912/

cd optimum-habana
python -m pip install .

cd examples/text-generation/
python -m pip install -r ./requirements.txt
python -m pip install -r ./requirements_lm_eval.txt
python -m pip install git+https://github.com/HabanaAI/DeepSpeed.git@1.17.0
pip install datasets==2.19.2
huggingface-cli login --token $HF_TOKEN

QUANT_CONFIG=./quantization_config/maxabs_measure.json python ../gaudi_spawn.py \
  --use_deepspeed --world_size 8 run_generation.py \
  --model_name_or_path bigcode/starcoder2-15b \
  --attn_softmax_bf16 \
  --use_hpu_graphs \
  --trust_remote_code \
  --trim_logits \
  --use_kv_cache \
  --bucket_size 128 \
  --bucket_internal \
  --use_flash_attention \
  --flash_attention_recompute \
  --max_new_tokens 128 \
  --batch_size 1 \
  --bf16

Error: [rank0]: TypeError: 'tuple' object does not support item assignment
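The failure mode can be illustrated with a few lines of plain Python: Hugging Face's legacy KV caches are tuples, and tuples are immutable, so writing a bucketed slice back into the cache by index raises exactly this error. This is a simplified, hypothetical sketch (strings stand in for tensors), not the actual optimum-habana code:

```python
# Simplified stand-in for a legacy tuple-based KV cache; in the real code
# these entries would be torch tensors, not strings.
past_key_values = ("key_states", "value_states")

try:
    # Internal bucketing tries to write the trimmed cache back in place.
    past_key_values[0] = "trimmed_key_states"
except TypeError as e:
    print(e)  # 'tuple' object does not support item assignment

# A mutable container accepts the same write.
cache = list(past_key_values)
cache[0] = "trimmed_key_states"
print(cache)  # ['trimmed_key_states', 'value_states']
```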

@regisss
Collaborator

regisss commented Sep 24, 2024

@SupreetSinghPalne Please add a description of this issue and a code snippet to reproduce it if possible. I cannot access https://habana.atlassian.net/browse/HS-3349.

@pk1d3v
Contributor

pk1d3v commented Sep 24, 2024

@SupreetSinghPalne, this is common code that is shared across many models, so please make sure you are not breaking any other models that rely on it. I am also wondering why other models work fine without this change.

@SupreetSinghPalne
Author

SupreetSinghPalne commented Sep 24, 2024

@SupreetSinghPalne Please add a description of this issue and a code snippet to reproduce it if possible. I cannot access https://habana.atlassian.net/browse/HS-3349.

The summary and reproduction steps are the same as in the PR description above.

@mgonchar
Contributor

Hi @SupreetSinghPalne, I think this model doesn't support internal bucketing. Often the model has to be patched to support it; see for example PR #1137.

So I'd say you need similar changes here https://github.com/huggingface/optimum-habana/blob/main/optimum/habana/transformers/models/starcoder2/modeling_starcoder2.py#L286, but not in the common code.

Also, why do you use internal bucketing for this model? It supports another flag, reuse_cache, with a similar effect.
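A minimal sketch of the kind of model-side patch being suggested, assuming the fix is to make the per-layer cache mutable before the in-place write (the helper name and structure are illustrative only; the real change lives in modeling_starcoder2.py and operates on tensors):

```python
# Hypothetical helper: make the per-layer KV cache mutable before internal
# bucketing assigns trimmed key/value slices back by index.
def write_back_kv(past_key_value, new_key, new_value):
    # Legacy HF caches arrive as immutable tuples; convert to a list first.
    if isinstance(past_key_value, tuple):
        past_key_value = list(past_key_value)
    past_key_value[0] = new_key
    past_key_value[1] = new_value
    return past_key_value

layer_cache = ("old_k", "old_v")  # would be tensors in practice
layer_cache = write_back_kv(layer_cache, "new_k", "new_v")
print(layer_cache)  # ['new_k', 'new_v']
```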

@SupreetSinghPalne
Author

SupreetSinghPalne commented Sep 25, 2024

@SupreetSinghPalne, this is common code that is shared across many models, so please make sure you are not breaking any other models that rely on it. I am also wondering why other models work fine without this change.

I checked other models and they still work. I will not change the common code; instead I will change the modeling_starcoder2.py file, which fixes this problem. Thank you for your review.

@SupreetSinghPalne
Author

Hi @SupreetSinghPalne, I think this model doesn't support internal bucketing. Often the model has to be patched to support it; see for example PR #1137.

So I'd say you need similar changes here https://github.com/huggingface/optimum-habana/blob/main/optimum/habana/transformers/models/starcoder2/modeling_starcoder2.py#L286, but not in the common code.

Also, why do you use internal bucketing for this model? It supports another flag, reuse_cache, with a similar effect.

Right, the common code was not supposed to be changed, so I changed modeling_starcoder2.py accordingly, and that fixes the tuple error. I have made the changes. Thank you for the review.

@SupreetSinghPalne
Author

I have updated the description @regisss

@SupreetSinghPalne
Author

@regisss Can you take a look and help get this PR merged?
