
Training on Pretrained LISA Model #159

Open
Mactor2018 opened this issue Nov 12, 2024 · 1 comment

Comments

@Mactor2018
Hello, and thank you for your excellent work! I am currently working on training a downstream task using LISA, but I’m unsure about the correct approach for training on the pretrained LISA model.

I’ve attempted to use your pretrained LISA model in place of the Llama2 model, and I’ve also supplied the SAM ViT-H pretrained weights. Here’s the command I used:

deepspeed --master_port=24999 --include localhost:0,1 train_ds.py \
  --version="./model/LISA-13B-llama2-v1/" \
  --dataset_dir='./dataset/' \
  --vision_pretrained="./model/sam_vit_to_train/sam_vit_h.pth" \
  # other args...

However, I noticed that the model's performance isn’t very good after training for 5 epochs, which makes me wonder if I have correctly set up training with your pretrained model.

Could you please confirm if this setup is correct or suggest any modifications? Thank you!

@Mactor2018
Author

I checked and compared train_ds.py and chat.py, and found that the training code might need to be modified to load the llama and SAM weights together from a single source, as is done at inference time.
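To illustrate the idea above, here is a minimal sketch of splitting one merged checkpoint's state dict into a language-model part and a SAM part by key prefix, so each submodule can be loaded from the same source. This is a hypothetical illustration: the prefix `model.visual_model.` and the helper `split_checkpoint` are assumptions for demonstration, not the repo's actual key naming or code.

```python
def split_checkpoint(state_dict, vision_prefix="model.visual_model."):
    """Split a merged checkpoint into (language, vision) state dicts.

    Note: the vision_prefix used here is an assumed naming convention,
    not necessarily what the LISA checkpoint actually uses.
    """
    vision = {k: v for k, v in state_dict.items() if k.startswith(vision_prefix)}
    language = {k: v for k, v in state_dict.items() if k not in vision}
    return language, vision


# Toy example with placeholder keys standing in for real tensors:
merged = {
    "model.layers.0.self_attn.q_proj.weight": 1,
    "model.visual_model.image_encoder.patch_embed.weight": 2,
}
language_sd, vision_sd = split_checkpoint(merged)
```

Each resulting dict could then be passed to the corresponding submodule's `load_state_dict`, instead of loading SAM from a separate `--vision_pretrained` file.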
