Pull requests: vllm-project/vllm
Label key (descriptions from the listing): documentation = "Improvements or additions to documentation"; ready = "ONLY add when PR is ready to merge/full CI is needed".

- #10184 [Model] Add support for Qwen2 for embeddings (opened Nov 9, 2024 by DarkLight1337; labels: documentation, ready)
- #10183 Add docs on serving with Llama Stack (opened Nov 9, 2024 by terrytangyuan; labels: documentation)
- #10180 [Bug]: When apply continue_final_message for OpenAI server, the "echo":false is ignored (opened Nov 9, 2024 by chaunceyjiang; labels: frontend; Draft)
- #10174 [Frontend] Add per-request number of cached token stats (opened Nov 9, 2024 by zifeitong; labels: frontend)
- #10165 [Docs] Misc updates to TPU installation instructions (opened Nov 8, 2024 by mikegre-google; labels: documentation)
- #10164 [Bugfix][Frontend] Update Llama 3.2 Chat Template to support Vision and Non-Tool use (opened Nov 8, 2024 by tjohnson31415)
- #10159 [Doc] Move PR template content to docs (opened Nov 8, 2024 by russellb; labels: ci/build, documentation)
- #10132 [Feature] [Spec decode]: Enable MLPSpeculator/Medusa and prompt_logprobs with ChunkedPrefill (opened Nov 7, 2024 by NickLucche; labels: needs-rebase; Draft; 1 task)
- #10127 [V1][Bugfix] Propagate V1 LLMEngine properly (opened Nov 7, 2024 by comaniac; labels: ready)
- #10125 [Core] Add padding-aware scheduling for 2D prefills (opened Nov 7, 2024 by kzawora-intel)
- #10113 [Hardware][CPU][torch.compile] integrate torch compile (opened Nov 7, 2024 by bigPYJ1151; labels: needs-rebase; Draft)
- #10107 [Hardware][XPU] AWQ/GPTQ support for xpu backend (opened Nov 7, 2024 by yma11; labels: documentation, needs-rebase)
- [CI/Build] Bump test transformers version (labels: ci/build, needs-rebase, ready; number, date, and author not present in the listing)
- #10103 [Core/Bugfix] Per FlashInfer API changing data_type to kv_data_type for kv_cache (opened Nov 7, 2024 by wenscarl)
- #10091 Splitting attention kernel file (opened Nov 6, 2024 by maleksan85; labels: ci/build, ready)
- #10069 [CI/Build] Split up models tests (opened Nov 6, 2024 by DarkLight1337; labels: ci/build, ready)