
Cannot Convert Checkpoint to Trainable Model #133

Open
believewhat opened this issue Nov 12, 2023 · 3 comments

Comments


believewhat commented Nov 12, 2023

Hi authors,

When I try to merge the LoRA checkpoint with the base model (merge_lora_weights_and_save_hf_model.py), I encounter this issue:

You are resizing the embedding layer without providing a pad_to_multiple_of parameter. This means that the new embedding dimension will be 32001. This might induce some performance reduction as Tensor Cores will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc

And I cannot get the final model.
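For reference, that message only means the resized vocabulary (32001 after adding a token) is no longer a multiple of 8, which Tensor Cores prefer; passing pad_to_multiple_of=8 to resize_token_embeddings rounds it up. A small sketch of the rounding (pad_to_multiple_of is the actual Transformers parameter name; the helper function here is just illustrative):

```python
import math

def padded_vocab_size(vocab_size: int, pad_to_multiple_of: int = 8) -> int:
    # Round up to the next multiple, as
    # model.resize_token_embeddings(n, pad_to_multiple_of=8) would,
    # keeping the embedding dimension Tensor Core friendly.
    return math.ceil(vocab_size / pad_to_multiple_of) * pad_to_multiple_of

print(padded_vocab_size(32001))  # → 32008
```

So the warning by itself should not prevent the merged model from being produced.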

@believewhat changed the title from the warning text quoted above to Cannot Convert Checkpoint to Trainable Model on Nov 13, 2023
@weicheng113
Contributor

The above seems to be just a warning; the code still runs with that output on my end.

@believewhat
Author

But I only got one checkpoint file (just pytorch_model.bin), not the 15 shards expected for llama70b (pytorch_model-00001-of-00015.bin, ...).

@believewhat
Author

Another problem is that the size of adapter_model.bin is only 500 B.
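A ~500 B adapter_model.bin usually means an essentially empty state dict was serialized, since even a single small LoRA matrix is orders of magnitude larger. A quick sanity-check sketch (assuming PyTorch; the tensor shape is just an example rank-8 LoRA factor for a 4096-dim layer):

```python
import io

import torch


def saved_size(state_dict) -> int:
    # Serialize a state dict in memory with torch.save and
    # return its size in bytes.
    buf = io.BytesIO()
    torch.save(state_dict, buf)
    return buf.tell()


# An empty state dict serializes to only a few hundred bytes,
# roughly matching the 500 B adapter file reported above.
print(saved_size({}))

# Even one tiny LoRA matrix (8 x 4096 float32 = 128 KiB of data)
# dwarfs that, so a healthy adapter file should be far larger.
print(saved_size({"lora_A.weight": torch.zeros(8, 4096)}))
```

If the empty-dict size is close to what you see on disk, the LoRA weights were likely never written, so it is worth checking that the adapter state dict is non-empty before saving.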
