Pull requests: vllm-project/vllm
Label key (descriptions from the listing): documentation = "Improvements or additions to documentation"; ready = "ONLY add when PR is ready to merge/full CI is needed".

- #10184 [Model] Add support for Qwen2 for embeddings (opened Nov 9, 2024 by DarkLight1337; labels: documentation, ready)
- #10183 Add docs on serving with Llama Stack (opened Nov 9, 2024 by terrytangyuan; labels: documentation)
- #10180 [Bug]: When apply continue_final_message for OpenAI server, the "echo":false is ignored (opened Nov 9, 2024 by chaunceyjiang; labels: frontend; Draft)
- #10174 [Frontend] Add per-request number of cached token stats (opened Nov 9, 2024 by zifeitong; labels: frontend)
- #10165 [Docs] Misc updates to TPU installation instructions (opened Nov 8, 2024 by mikegre-google; labels: documentation)
- #10164 [Bugfix][Frontend] Update Llama 3.2 Chat Template to support Vision and Non-Tool use (opened Nov 8, 2024 by tjohnson31415)
- #10159 [Doc] Move PR template content to docs (opened Nov 8, 2024 by russellb; labels: ci/build, documentation)
- #10132 [Feature] [Spec decode]: Enable MLPSpeculator/Medusa and prompt_logprobs with ChunkedPrefill (opened Nov 7, 2024 by NickLucche; labels: needs-rebase; Draft; 1 task)
- #10127 [V1][Bugfix] Propagate V1 LLMEngine properly (opened Nov 7, 2024 by comaniac; labels: ready)
- #10125 [Core] Add padding-aware scheduling for 2D prefills (opened Nov 7, 2024 by kzawora-intel)
- #10113 [Hardware][CPU][torch.compile] integrate torch compile (opened Nov 7, 2024 by bigPYJ1151; labels: needs-rebase; Draft)
- #10107 [Hardware][XPU] AWQ/GPTQ support for xpu backend (opened Nov 7, 2024 by yma11; labels: documentation, needs-rebase)
- [CI/Build] Bump test transformers version (labels: ci/build, needs-rebase, ready; number, date, and author not present in the listing)
- #10103 [Core/Bugfix] Per FlashInfer API changing data_type to kv_data_type for kv_cache (opened Nov 7, 2024 by wenscarl)
- #10091 Splitting attention kernel file (opened Nov 6, 2024 by maleksan85; labels: ci/build, ready)
- #10069 [CI/Build] Split up models tests (opened Nov 6, 2024 by DarkLight1337; labels: ci/build, ready)