Disable FP8 in Mcore integration test on older GPUs #1357

Merged 2 commits on Dec 6, 2024
2 changes: 2 additions & 0 deletions qa/L1_pytorch_mcore_integration/.gitignore
@@ -0,0 +1,2 @@
+Megatron-LM
+vocab.json
1 change: 1 addition & 0 deletions qa/L1_pytorch_mcore_integration/merges.txt
@@ -0,0 +1 @@
+#version: 0.2
22 changes: 18 additions & 4 deletions qa/L1_pytorch_mcore_integration/test.sh
@@ -8,13 +8,27 @@ set -e
: ${TE_PATH:=/opt/transformerengine}
: ${MCORE_PATH:=${TE_PATH}/qa/L1_pytorch_mcore_integration/Megatron-LM}

+# Check whether FP8 is supported
+DEVICE_ARCH=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1 | sed 's/[^0-9]//g')
+if [[ ${DEVICE_ARCH} -ge 89 ]]; then
+    WITH_FP8=1
+fi
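
For context, compute capability 8.9 corresponds to the Ada generation (e.g. L40, RTX 4090) and 9.0 to Hopper (e.g. H100), so the check above enables FP8 only on Ada and newer GPUs. A minimal sketch of what each stage of the pipeline produces, assuming an H100:

$ nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1
9.0
$ echo "9.0" | sed 's/[^0-9]//g'
90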

# Download Megatron-LM if needed
if [ ! -d "${MCORE_PATH}" ]; then
pushd $(dirname ${MCORE_PATH})
git clone -b core_r0.9.0 https://github.com/NVIDIA/Megatron-LM.git Megatron-LM
Collaborator:
I guess then we have to manually update this branch. Is this okay?

Collaborator Author:
I think there's an argument for just downloading the Megatron-LM main branch, but that is orthogonal to this PR.
popd
fi

+# Create mock vocab
+VOCAB_FILE=${TE_PATH}/qa/L1_pytorch_mcore_integration/vocab.json
+printf "" > ${VOCAB_FILE}
+printf "{" >> ${VOCAB_FILE}
+printf "\"<|endoftext|>\": 0" >> ${VOCAB_FILE}
+seq 1 4095 | awk '{ printf(", \"%d\": %d", $1, $1) }' >> ${VOCAB_FILE}
+printf "}" >> ${VOCAB_FILE}
Collaborator:
Wow, that's enough to generate a vocab file!

Collaborator Author (@timmoon10, Dec 6, 2024):
This is somewhat specific to the Mcore mock GPT dataset: https://github.com/NVIDIA/Megatron-LM/blob/bd677bfb13ac2f19deaa927adc6da6f9201d66aa/megatron/core/datasets/gpt_dataset.py#L693
The model figures out the number of embeddings based on the vocab size. However, the mock dataset just generates indices with arange (up to a max sequence length of 4096), so the vocab can be junk.
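
As a quick sanity check (illustrative commands, assuming the script has already run and is invoked from ${TE_PATH}), the generated file is a single line of valid JSON with 4096 entries:

$ head -c 59 qa/L1_pytorch_mcore_integration/vocab.json
{"<|endoftext|>": 0, "1": 1, "2": 2, "3": 3, "4": 4, "5": 5
$ python -c "import json; print(len(json.load(open('qa/L1_pytorch_mcore_integration/vocab.json'))))"
4096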


# Megatron-LM invocation
COMMAND="
NVTE_TORCH_COMPILE=0
@@ -40,17 +54,17 @@ ${MCORE_PATH}/pretrain_gpt.py
--hidden-size 128
--num-attention-heads 8
--seq-length 128
- --max-position-embeddings 2048
+ --max-position-embeddings 128
--micro-batch-size 1
--global-batch-size 8
--train-iters 10
--eval-iters 10
--lr 1e-4
--mock-data
- --vocab-file /data/gpt3/pile-cc1-cc2-shuf/bpe/gpt2-vocab.json
- --merge-file /data/gpt3/pile-cc1-cc2-shuf/bpe/gpt2-merges.txt
+ --vocab-file ${VOCAB_FILE}
+ --merge-file ${TE_PATH}/qa/L1_pytorch_mcore_integration/merges.txt
Collaborator:
Not sure what the purpose of this file is, but is it okay for it to only have version info?

Collaborator Author:
I'm treating merges.txt as a magic config for the tokenizer: huggingface/transformers#1083 (comment)
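
For reference, a full GPT-2 merges.txt begins with the same version header followed by one BPE merge rule per line (the rules below are an illustrative excerpt, not part of this PR):

#version: 0.2
Ġ t
Ġ a
h e
i n

With only the header, the tokenizer is left with zero merge rules, which appears to be acceptable here since the mock dataset never tokenizes real text.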

--transformer-impl transformer_engine
- --fp8-format hybrid
+ ${WITH_FP8:+--fp8-format hybrid}
"
COMMAND=$(echo "${COMMAND}" | tr '\n' ' ')
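
The ${WITH_FP8:+--fp8-format hybrid} expansion is standard bash: it substitutes the flag only when WITH_FP8 is set and non-empty, and expands to nothing otherwise. A minimal demonstration:

$ unset WITH_FP8; echo "args:${WITH_FP8:+ --fp8-format hybrid}"
args:
$ WITH_FP8=1; echo "args:${WITH_FP8:+ --fp8-format hybrid}"
args: --fp8-format hybrid

The tr pass above then flattens the multi-line COMMAND definition into a single-line command string.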
