[QEff.finetune] Integrated test for HF Trainer #800
Status: Open

tchawada wants to merge 38 commits into quic:ft_experimental from tchawada:ft_integrated.
+710 −65
Commits (38, all changes shown):
- f81ef6e General disagg fix for prefill-only model (#698) (ochougul)
- c57392d Adding Vae Decoder in Wan (#688) (mohiso22)
- 75367b1 Evaluating the values of CCL lists for different scenarios (#710) (vjanfaza)
- 1e63710 Updating 2-layer instruction for Wan (#715) (tv-karthikeya)
- 1ef9935 Updated finetune docs for MULTI NODE Training (#717) (quic-akuruvil)
- c76d5ea Adding support for multi-node DDP training (#708) (smedhe)
- 7a39933 Updating MDP partition config: prioritizing dump over load (#720) (asmigosw)
- 08bce2c Updated docs (#722) (quic-akuruvil)
- 8b00c1b HOTFIX: changes in alpaca and grammar dataset utils (#724) (smedhe)
- b074af0 Fixing the default value of CCL in infer.py (#725) (vjanfaza)
- 5fdde19 Adding support for multi-node PP+DDP (#726) (smedhe)
- 1f2ac51 Added default NPI file (#657) (quic-akuruvil)
- dcbb7be Release 1.21 docs (#718) (tv-karthikeya)
- 1ec3975 HOTFIX : Added support for repeat kv heads aligned Bias scaling for A… (quic-dhirajku)
- e61a1a3 Removed OpenGVLab/InternVL2_5-1B and OpenGVLab/InternVL3_5-1B (#736) (quic-rishinr)
- 47a0fec Qeff versioning (#741) (quic-rishinr)
- 3a8e5e9 Revert "Qeff versioning" (#746) (quic-rishinr)
- 0ffa4ea Fix for Qwen 2.5 VL with subfunction (#733) (abhishek-singh591)
- 32f30c0 Fixed torch patch for subfunction with VLMs (#750) (abhishek-singh591)
- eb74758 Added support of subfunction for VLMs (#699) (abhishek-singh591)
- 742b7bd Updated reduce sum calculation to use einsum for gpt_oss (#754) (asmigosw)
- 5a129c7 Updating pytest config for InternVL (#758) (tv-karthikeya)
- b777e8b Wan support to skip compilation (#734) (tv-karthikeya)
- 75bf976 Fixing SW issue in Gemma3 (#740) (qcdipankar)
- 3751f7e Fix documentation of Multinode FT (#764) (quic-akuruvil)
- 27ebe8e Adding support for gemma3 in continous batching script for CI (#763) (qcdipankar)
- 536e3fc Subfunction Fix (#766) (abhishek-singh591)
- f64f703 Mainline version update (#752) (quic-rishinr)
- 1a3e09c Updated compile from qaic-exec to qaic-compile (#703) (asmigosw)
- e8e5c43 Fix for Diffusers subfunction (#759) (tv-karthikeya)
- fc42332 Added One hot fix for MOE model with subfunction (#777) (abhishek-singh591)
- 544327a Adding support of QEFFAutoModelForSequenceClassification (#729) (quic-amitraj)
- facae5f CI test optimization (#751) (quic-rishinr)
- cd25784 Merge remote-tracking branch 'upstream/ft_experimental' into final_hf (tchawada)
- 3f6315c Adding qaic validation in config manager, default value to prompt_func (tchawada)
- 9015bf6 Adding qaic validation in config manager, default value to prompt_func (tchawada)
- fb28705 Adding a function to check whether NSP for given QAIC is free or not (tchawada)
- 8cbe49e Adding integrated test for HF_trainer stack (tchawada)
QEfficient/finetune/experimental/core/utils/constants.py (104 additions, 0 deletions)
```python
# -----------------------------------------------------------------------------
#
# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
# SPDX-License-Identifier: BSD-3-Clause
#
# -----------------------------------------------------------------------------

"""
Constants used across test files in the experimental finetuning pipeline.
"""

from enum import Enum

# ============================================================================
# Enums
# ============================================================================


class TaskType(str, Enum):
    """Task types for model training."""

    CAUSAL_LM = "CAUSAL_LM"
    SEQ_CLS = "SEQ_CLS"
    SEQ_2_SEQ_LM = "SEQ_2_SEQ_LM"


class DatasetType(str, Enum):
    """Dataset types for training."""

    SFT_DATASET = "sft_dataset"
    SEQ_COMPLETION = "seq_completion"
    SEQ_CLASSIFICATION = "seq_classification"


class AutoClassName(str, Enum):
    """Auto class names for model loading."""

    CAUSAL_LM = "AutoModelForCausalLM"
    SEQ_CLS = "AutoModelForSequenceClassification"
    SEQ_2_SEQ_LM = "AutoModelForSeq2SeqLM"


# ============================================================================
# Test Seeds and Ratios
# ============================================================================

TEST_SEED = 42
TEST_SPLIT_RATIO = 0.8

# ============================================================================
# PEFT/LoRA Configuration
# ============================================================================

TEST_LORA_R = 8
TEST_LORA_ALPHA = 16
TEST_LORA_DROPOUT = 0.1
TEST_LORA_TARGET_MODULES_LLAMA = ["q_proj", "v_proj"]
TEST_LORA_TARGET_MODULES_BERT = ["query", "value"]
TEST_LORA_BIAS = "none"

# ============================================================================
# Training Parameters
# ============================================================================

TEST_LEARNING_RATE = 5e-5
TEST_WEIGHT_DECAY = 0.01
TEST_WARMUP_STEPS = 5
TEST_NUM_TRAIN_EPOCHS = 1
TEST_MAX_STEPS = 5
TEST_LOGGING_STEPS = 1
TEST_PER_DEVICE_BATCH_SIZE = 1
TEST_MAX_SEQ_LENGTH_CAUSAL = 256
TEST_MAX_SEQ_LENGTH_SEQ_CLS = 128
TEST_MAX_LENGTH = 128
TEST_NUM_HIDDEN_LAYERS = 2

# ============================================================================
# Dataset Paths and Names
# ============================================================================

# HuggingFace Dataset Names
HF_DATASET_ALPACA = "tatsu-lab/alpaca"
HF_DATASET_GSM8K = "openai/gsm8k"
HF_DATASET_GSM8K_CONFIG = "main"
HF_DATASET_IMDB = "stanfordnlp/imdb"

# Dataset subset size for testing
TEST_DATASET_SUBSET_SIZE = 10

# ============================================================================
# Model Names
# ============================================================================

TEST_MODEL_LLAMA = "meta-llama/Llama-3.2-1B"
TEST_MODEL_SMOLLM = "HuggingFaceTB/SmolLM-135M"

# ============================================================================
# Optimizer Parameters
# ============================================================================

OPT_LEARNING_RATE = 1e-4
OPT_ADAM_BETAS = (0.9, 0.999)
OPT_ADAM_EPS = 1e-8
OPT_SGD_MOMENTUM = 0.9
```
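As an aside (not part of this PR's code), the TaskType and AutoClassName enums in constants.py share member names, which allows dynamic model-class selection by name. The helper below is a hypothetical sketch of how a test harness might use them:

```python
from enum import Enum


class TaskType(str, Enum):
    CAUSAL_LM = "CAUSAL_LM"
    SEQ_CLS = "SEQ_CLS"
    SEQ_2_SEQ_LM = "SEQ_2_SEQ_LM"


class AutoClassName(str, Enum):
    CAUSAL_LM = "AutoModelForCausalLM"
    SEQ_CLS = "AutoModelForSequenceClassification"
    SEQ_2_SEQ_LM = "AutoModelForSeq2SeqLM"


def auto_class_for_task(task: TaskType) -> str:
    # Member names match across the two enums, so a lookup by name
    # maps each task type to its transformers auto-class name.
    return AutoClassName[task.name].value


print(auto_class_for_task(TaskType.SEQ_CLS))  # AutoModelForSequenceClassification
```

The returned string could then be resolved via getattr on the transformers module; that step is omitted here to keep the sketch dependency-free.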
QEfficient/finetune/experimental/preprocessing/alpaca_func.py (24 additions, 0 deletions)
```python
# -----------------------------------------------------------------------------
#
# Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
# SPDX-License-Identifier: BSD-3-Clause
#
# -----------------------------------------------------------------------------
def prompt_no_input(row):
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:\n"
    ).format_map(row)


def prompt_input(row):
    return (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
    ).format_map(row)


def create_alpaca_prompt(row):
    return prompt_no_input(row) if row["input"] == "" else prompt_input(row)
```
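As a quick sanity check of the prompt builders in alpaca_func.py: a row with an empty "input" field selects the no-input template, any other row selects the input template. The functions are reproduced below so the snippet runs standalone (the sample rows are made up for illustration):

```python
def prompt_no_input(row):
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:\n"
    ).format_map(row)


def prompt_input(row):
    return (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
    ).format_map(row)


def create_alpaca_prompt(row):
    return prompt_no_input(row) if row["input"] == "" else prompt_input(row)


# Hypothetical sample rows in the Alpaca schema
with_input = {"instruction": "Translate to French.", "input": "Hello"}
no_input = {"instruction": "Name a prime number.", "input": ""}

print("### Input:" in create_alpaca_prompt(with_input))  # True
print("### Input:" in create_alpaca_prompt(no_input))    # False
```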
Review comment (on HF_DATASET_GSM8K_CONFIG = "main"): Why is it mentioned as main?

Reply (tchawada): The gsm8k dataset ships two subsets; to load the "main" subset, we need to pass this configuration name.
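To illustrate the reply: openai/gsm8k publishes more than one configuration ("main" and "socratic"), so the config name must be passed explicitly when loading. A minimal sketch follows; the helper name is hypothetical and the actual load_dataset call is commented out to avoid a network dependency:

```python
HF_DATASET_GSM8K = "openai/gsm8k"
HF_DATASET_GSM8K_CONFIG = "main"  # gsm8k also ships a "socratic" config


def gsm8k_load_args(config: str = HF_DATASET_GSM8K_CONFIG):
    # Positional arguments one would pass to datasets.load_dataset(...)
    return (HF_DATASET_GSM8K, config)


# from datasets import load_dataset
# ds = load_dataset(*gsm8k_load_args(), split="train")
print(gsm8k_load_args())  # ('openai/gsm8k', 'main')
```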