Pull requests: vllm-project/vllm
feat(serve): Add /readiness/stages endpoint for model-ready autoscaling
documentation
Improvements or additions to documentation
frontend
#45551
opened Jun 13, 2026 by
zhihuidu-amd
Fix LMCache MP LoRA cache salt keying
kv-connector
v1
#45549
opened Jun 13, 2026 by
johnnyychiu
[Chore] Consolidate reasoning/tool parser attributes into unified Parser in chat serving
frontend
#45548
opened Jun 13, 2026 by
sfeng33
Collaborator
Fix gpt-oss required tool choice constraints
frontend
gpt-oss
Related to GPT-OSS models
tool-calling
#45547
opened Jun 13, 2026 by
johnnyychiu
[Bug Fix] [MiniMax-M3] Implement EAGLE3 support on the AMD MiniMax M3
bug
Something isn't working
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
#45546
opened Jun 13, 2026 by
functionstackx
Add Kimi video chunk splitting
frontend
multi-modality
Related to multi-modality (#4194)
#45545
opened Jun 13, 2026 by
maxlang
[Bugfix] Default tie_weights to sharing the weight (fix tied quantized embeddings, e.g. ModelOpt Gemma)
bug
Something isn't working
#45544
opened Jun 13, 2026 by
mikekg
Contributor
3 of 4 tasks
[Bugfix] CuMemAllocator: zero discarded-tag pages on wake_up
bug
Something isn't working
#45542
opened Jun 13, 2026 by
terafin
1 of 2 tasks
[Bugfix] Fix SamplingParams repr, docstrings, top_k validation order, and convert StructuredOutputsParams to msgspec.Struct
bug
Something isn't working
#45541
opened Jun 13, 2026 by
shernshiou
1 of 4 tasks
[Model] Wire quant_config/prefix into input embeddings for GPTNeoX and Llama
llama
Related to Llama models
#45535
opened Jun 13, 2026 by
KKothuri
[BugFix] Fix clang spinloop mwaitx include
bug
Something isn't working
ready
ONLY add when PR is ready to merge/full CI is needed
verified
Run pre-commit for new contributors without triggering other tests
#45532
opened Jun 13, 2026 by
johnnyychiu
[Bugfix] Bounds-check moe_permute reverse-map write (#45492)
bug
Something isn't working
#45530
opened Jun 13, 2026 by
waynehacking8
Contributor
fix: resolve SyntaxWarning for invalid escape sequences in OpenAI protocol files
frontend
#45529
opened Jun 13, 2026 by
codewithyug06
[v1] Initialize InputBatch in initialize_kv_cache instead of __init__
v1
#45528
opened Jun 13, 2026 by
wenyili
Contributor
[Bugfix][Kernel] Fix int32/uint overflow in merge_attn_states and permute_cols
bug
Something isn't working
#45527
opened Jun 13, 2026 by
techmonk000
[Frontend] Add /health/decode endpoint for engine forward-progress liveness (re-opens #45097)
frontend
v1
#45526
opened Jun 13, 2026 by
terafin
[Bugfix] Fix HF-format Mistral3 crash with --tokenizer-mode mistral
bug
Something isn't working
mistral
Related to Mistral models
multi-modality
Related to multi-modality (#4194)
#45525
opened Jun 13, 2026 by
discobot
3 of 4 tasks
[Core] Expose engine pause/resume state as prometheus metrics
v1
#45524
opened Jun 13, 2026 by
dangoldbj
Contributor
4 tasks
[Docs] Document NIXL KV connector metrics aggregation semantics for TP > 1
documentation
Improvements or additions to documentation
kv-connector
#45523
opened Jun 13, 2026 by
Sugumaran-Balasubramaniyan
fix: cache bad_words tokenization to avoid 'Already borrowed' errors under concurrency
#45522
opened Jun 13, 2026 by
Oxygen56
Contributor
3 tasks done
Revert "[Render] Add /derender endpoints for disaggregated postprocessing" (#43606)
frontend
#45521
opened Jun 13, 2026 by
vllm-agent
Contributor
Draft