
Training on Pretrained LISA Model #159

Open
Mactor2018 opened this issue Nov 12, 2024 · 1 comment

Comments

@Mactor2018
Hello, and thank you for your excellent work! I am currently working on training a downstream task using LISA, but I’m unsure about the correct approach for training on the pretrained LISA model.

I’ve attempted to use your pretrained LISA model in place of the Llama2 model, and I’ve also supplied the SAM ViT-H pretrained weights. Here’s the command I used:

deepspeed --master_port=24999 --include localhost:0,1 train_ds.py \
  --version="./model/LISA-13B-llama2-v1/" \
  --dataset_dir='./dataset/' \
  --vision_pretrained="./model/sam_vit_to_train/sam_vit_h.pth" \
  # other args...

However, I noticed that the model's performance isn’t very good after training for 5 epochs, which makes me wonder if I have correctly set up training with your pretrained model.

Could you please confirm if this setup is correct or suggest any modifications? Thank you!

@Mactor2018
Author

I checked and compared train_ds.py and chat.py, and found that the training code might need to be modified to load the llama and SAM weights together from a single source, as is done at inference time.
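To illustrate the idea above, here is a minimal sketch of splitting one merged checkpoint's state dict into a language-model part and a SAM part by key prefix, so each submodule can be loaded from the same source. This is a hypothetical illustration: the prefix `model.visual_model.` and the helper `split_checkpoint` are assumptions for demonstration, not the repo's actual key naming or code.

```python
def split_checkpoint(state_dict, vision_prefix="model.visual_model."):
    """Split a merged checkpoint into (language, vision) state dicts.

    Note: the vision_prefix used here is an assumed naming convention,
    not necessarily what the LISA checkpoint actually uses.
    """
    vision = {k: v for k, v in state_dict.items() if k.startswith(vision_prefix)}
    language = {k: v for k, v in state_dict.items() if k not in vision}
    return language, vision


# Toy example with placeholder keys standing in for real tensors:
merged = {
    "model.layers.0.self_attn.q_proj.weight": 1,
    "model.visual_model.image_encoder.patch_embed.weight": 2,
}
language_sd, vision_sd = split_checkpoint(merged)
```

Each resulting dict could then be passed to the corresponding submodule's `load_state_dict`, instead of loading SAM from a separate `--vision_pretrained` file.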
