Pull requests: vllm-project/vllm

Pull requests list

feat(serve): Add /readiness/stages endpoint for model-ready autoscaling
Labels: documentation, frontend
#45551 opened Jun 13, 2026 by zhihuidu-amd
Fix fp8 cache gather block table bounds
#45550 opened Jun 13, 2026 by johnnyychiu
Fix gpt-oss required tool choice constraints
Labels: frontend, gpt-oss, tool-calling
#45547 opened Jun 13, 2026 by johnnyychiu
[Bug Fix] [MiniMax-M3] Implement EAGLE3 support on the AMD MiniMax M3
Labels: bug, ready, rocm
#45546 opened Jun 13, 2026 by functionstackx
Add Kimi video chunk splitting
Labels: frontend, multi-modality
#45545 opened Jun 13, 2026 by maxlang
[Bugfix] Default tie_weights to sharing the weight (fix tied quantized embeddings, e.g. ModelOpt Gemma)
Labels: bug
#45544 opened Jun 13, 2026 by mikekg (Contributor)
3 of 4 tasks
[Bugfix] CuMemAllocator: zero discarded-tag pages on wake_up
Labels: bug
#45542 opened Jun 13, 2026 by terafin
1 of 2 tasks
Fix moe_wna16 sorted token bounds
#45539 opened Jun 13, 2026 by johnnyychiu
Fix cp_gather_cache block table bounds
#45537 opened Jun 13, 2026 by johnnyychiu
[Model] Wire quant_config/prefix into input embeddings for GPTNeoX and Llama
Labels: llama
#45535 opened Jun 13, 2026 by KKothuri
[BugFix] Fix clang spinloop mwaitx include
Labels: bug, ready, verified
#45532 opened Jun 13, 2026 by johnnyychiu
[Bugfix] Bounds-check moe_permute reverse-map write (#45492)
Labels: bug
#45530 opened Jun 13, 2026 by waynehacking8 (Contributor)
[v1] Initialize InputBatch in initialize_kv_cache instead of __init__
Labels: v1
#45528 opened Jun 13, 2026 by wenyili (Contributor)
[Bugfix] Fix HF-format Mistral3 crash with --tokenizer-mode mistral
Labels: bug, mistral, multi-modality
#45525 opened Jun 13, 2026 by discobot
3 of 4 tasks
[Core] Expose engine pause/resume state as prometheus metrics
Labels: v1
#45524 opened Jun 13, 2026 by dangoldbj (Contributor)
4 tasks
fix: cache bad_words tokenization to avoid 'Already borrowed' errors under concurrency
#45522 opened Jun 13, 2026 by Oxygen56 (Contributor)
3 tasks done