

vllm mode shows much slower inference speed than normal (hf) mode. #2418

Open
95jinchul opened this issue Oct 21, 2024 · 0 comments


Below are the vllm settings for the Llama 3.2 evaluation:


lm_eval --model vllm \
    --model_args pretrained=/home/jovyan/data-vol-1/models/meta-llama__Llama3.2-1B-Instruct,dtype=auto,gpu_memory_utilization=0.8 \
    --tasks leaderboard \
    --batch_size auto \
    --output_path result/meta-llama__Llama3.2-1B-Instruct.json


Processed prompts:  14%|███▉                         | 20791/152671 [15:38<1:23:11, 26.42it/s, est. speed input: 24555.79 toks/s, output: 22.14 toks/s]

Below are the hf settings for the Llama 3.2 evaluation:

lm_eval --model hf \
    --model_args pretrained=/home/jovyan/data-vol-1/models/meta-llama__Llama3.2-1B-Instruct,dtype=auto \
    --tasks leaderboard \
    --batch_size auto \
    --output_path result/meta-llama__Llama3.2-1B-Instruct_hf.json

Passed argument batch_size = auto:1. Detecting largest batch size
Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
Determined largest batch size: 4
Running loglikelihood requests:   8%|█████▊                                                                    | 12011/152671 [01:23<11:20, 206.83it/s]

This slowdown shows up not only for loglikelihood requests but also for generation requests.
Is something wrong with my evaluation settings? I am using a single A100 80GB GPU.
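To make the gap concrete, here is a rough full-run ETA computed from the two progress-bar rates above (this assumes the reported it/s stays constant for the whole run, which is only an approximation):

```python
# Rough ETA estimate from the progress-bar figures above,
# assuming throughput stays constant for the full run.
TOTAL_REQUESTS = 152671

vllm_rate = 26.42   # it/s reported by the vllm run
hf_rate = 206.83    # it/s reported by the hf run

vllm_eta_min = TOTAL_REQUESTS / vllm_rate / 60
hf_eta_min = TOTAL_REQUESTS / hf_rate / 60

print(f"vllm: ~{vllm_eta_min:.0f} min, hf: ~{hf_eta_min:.0f} min")
# → vllm: ~96 min, hf: ~12 min
```

So at these rates the vllm backend would take roughly eight times longer than the hf backend on the same task set, which matches the remaining-time estimates in the progress bars.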

Thanks for the help.
