PatchSiren

CVE-2026-54235 vllm-project CVE debrief

CVE-2026-54235 is a vulnerability in vLLM, an inference and serving engine for large language models (LLMs). Improper validation of the temperature parameter lets NaN and positive Infinity values through, which can lead to undefined behavior or CUDA errors that crash the inference worker. The issue is fixed in version 0.23.1rc0. It has a CVSS score of 6.9 (MEDIUM severity). The CVE was published on June 22, 2026, and last modified on June 24, 2026.

Vendor: vllm-project
Product: vllm
CVSS: MEDIUM 6.9
CISA KEV: Not listed in stored evidence
Original CVE published: 2026-06-22
Original CVE updated: 2026-06-24
Advisory published: 2026-06-22
Advisory updated: 2026-06-24

Who should care

Organizations using vLLM for inference and serving large language models should prioritize patching this vulnerability to prevent potential crashes or undefined behavior. This vulnerability could be particularly concerning for environments relying heavily on LLMs for critical tasks. Given the MEDIUM severity and potential for service disruption, defenders should assess their exposure and apply the patch promptly.

Technical summary

The vulnerability arises because vLLM validated the temperature parameter with comparison operators (<, >) alone. Under the IEEE 754 semantics that Python floats follow, every ordered comparison involving NaN (Not a Number) evaluates to False, and positive Infinity is not caught by a lower-bound check such as temperature < 0. As a result, both NaN and positive Infinity pass the validation guards without raising an error and propagate to the GPU sampling kernels. There, they can produce undefined behavior or trigger CUDA errors, potentially crashing the inference worker. The issue is addressed in version 0.23.1rc0, where the validation logic was presumably corrected to handle these edge cases. More generally, range checks on floating-point input need an explicit test for special values such as NaN and Infinity.
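To make the comparison pitfall concrete, here is a minimal Python sketch. The names naive_check, strict_check, and accepts are hypothetical and are not taken from vLLM's source. The sketch assumes a simple lower-bound guard and shows one possible way to reject non-finite values with math.isfinite. It is not a reproduction of the actual fix.

```python
import math


def naive_check(temperature: float) -> None:
    # Hypothetical guard for illustration only; not vLLM's actual code.
    # NaN < 0.0 and inf < 0.0 are both False, so neither value is rejected.
    if temperature < 0.0:
        raise ValueError(f"temperature must be non-negative, got {temperature}")


def strict_check(temperature: float) -> None:
    # One possible hardening (an assumption, not the confirmed upstream fix):
    # reject NaN and +/-Infinity explicitly before the range comparison.
    if not math.isfinite(temperature) or temperature < 0.0:
        raise ValueError(
            f"temperature must be a finite, non-negative number, got {temperature}"
        )


def accepts(check, value: float) -> bool:
    # Helper: report whether a validator lets the value through.
    try:
        check(value)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for value in (0.7, -1.0, float("nan"), float("inf")):
        print(value, accepts(naive_check, value), accepts(strict_check, value))
```

With these definitions, accepts(naive_check, float("nan")) returns True while accepts(strict_check, float("nan")) returns False. The same holds for float("inf").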

Defensive priority

Defenders should prioritize patching this vulnerability, given its potential impact on service availability and the MEDIUM severity rating. Applying the fix from version 0.23.1rc0 should mitigate the risk of inference worker crashes due to this issue.

Recommended defensive actions

  • Upgrade to vLLM 0.23.1rc0 or later to pick up the temperature validation fix.
  • Identify every system that runs vLLM for inference or serving and confirm the version deployed on each.
  • Monitor inference workers for CUDA errors or unexpected restarts, both before and after patching.
  • As an additional precaution, consider rejecting NaN and Infinity temperature values before requests reach vLLM.
  • Update inventory records to reflect patched systems and verify configuration compliance.
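
The input validation precaution in the list above could be sketched as a pre-check placed in front of the inference server. The function name parse_sampling_request and the request shape (a JSON object with an optional "temperature" key) are assumptions made for illustration. Python's standard json module accepts the literals NaN and Infinity by default, so this sketch uses the parse_constant hook to reject them. It then applies math.isfinite, because an oversized literal such as 1e999 still parses to Infinity.

```python
import json
import math


def _reject_constant(name: str) -> float:
    # json calls this hook for the literals "NaN", "Infinity" and "-Infinity".
    raise ValueError(f"non-finite JSON constant not allowed: {name}")


def parse_sampling_request(body: str) -> dict:
    # Hypothetical gateway-side pre-check; the payload shape is an assumption.
    payload = json.loads(body, parse_constant=_reject_constant)
    temperature = payload.get("temperature")
    if temperature is None:
        return payload  # leave defaulting to the server

    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(temperature, bool) or not isinstance(temperature, (int, float)):
        raise ValueError("temperature must be a number")

    try:
        finite = math.isfinite(temperature)
    except OverflowError:
        # A huge integer literal cannot be converted to float at all.
        finite = False

    if not finite or temperature < 0.0:
        raise ValueError("temperature must be a finite, non-negative number")
    return payload
```

A request body such as {"temperature": NaN} or {"temperature": 1e999} raises ValueError here, while {"temperature": 0.7} is returned unchanged. A check like this complements the upgrade and does not replace it.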

Evidence notes

The CVE-2026-54235 vulnerability details were obtained from the NVD and CVE.org records. The issue is related to the vLLM project, which is an inference and serving engine for large language models. The vulnerability allows for undefined behavior or CUDA errors due to improper handling of NaN and positive Infinity in temperature validation. The fix is included in version 0.23.1rc0 of the vLLM project. The CVSS score for this vulnerability is 6.9, indicating a MEDIUM severity level.

This article is AI-assisted and based on the supplied source corpus.