Untriaged
vLLM: denial of service in vLLM JSON web API
A vulnerability was found in the ilab model serve component, where improper handling of the best_of parameter in the vLLM JSON web API can lead to a denial of service (DoS). The API used for LLM-based sentence or chat completion accepts a best_of parameter to generate several candidate completions and return the best one. When this parameter is set to a large value, the API does not handle timeouts or resource exhaustion properly, allowing an attacker to consume excessive system resources. The API becomes unresponsive, preventing legitimate users from accessing the service.
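A minimal mitigation sketch, not vLLM's actual patch: a server could validate the client-supplied best_of field and reject oversized values before any generation work is scheduled. The limit MAX_BEST_OF and the helper name below are hypothetical, chosen only to illustrate the idea of bounding the parameter.

```python
import json

# Hypothetical server-side cap; the real limit would depend on deployment.
MAX_BEST_OF = 8

def validate_best_of(raw_body: str, max_best_of: int = MAX_BEST_OF) -> int:
    """Parse a JSON completion request and return a bounded best_of.

    Raises ValueError for non-integer or out-of-range values so the
    request can be rejected (e.g. with HTTP 400) before it consumes
    GPU or memory resources.
    """
    body = json.loads(raw_body)
    best_of = body.get("best_of", 1)  # default to a single completion
    # bool is a subclass of int in Python, so exclude it explicitly
    if not isinstance(best_of, int) or isinstance(best_of, bool):
        raise ValueError("best_of must be an integer")
    if not 1 <= best_of <= max_best_of:
        raise ValueError(f"best_of must be between 1 and {max_best_of}")
    return best_of
```

Validating at the API boundary like this turns a resource-exhaustion request into a cheap early rejection, which is the general pattern for hardening against this class of DoS.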
Affected products
vllm
- <0.5.0.post1
rhelai1/bootc-nvidia-rhel9
rhelai1/instructlab-nvidia-rhel9
Matching in nixpkgs
pkgs.vllm
High-throughput and memory-efficient inference and serving engine for LLMs
- nixos-unstable: -
- nixpkgs-unstable: 0.10.1.1
pkgs.python312Packages.vllm
High-throughput and memory-efficient inference and serving engine for LLMs
- nixos-unstable: -
- nixpkgs-unstable: 0.10.1.1
Package maintainers
- @happysalada Raphael Megzari <raphael@megzari.com>
- @CertainLach Yaroslav Bolyukin <iam@lach.pw>