OWASP LLM05: Improper Output Handling

v2.4.0

Insecure Output — Attack Vectors (XSS, SQL Injection, Command Injection via Output)

Model outputs that are passed to downstream systems without validation can trigger secondary vulnerabilities. If model outputs are rendered in web interfaces, they may contain cross-site scripting (XSS) payloads. If outputs are incorporated into database queries, they may contain SQL injection payloads. If outputs are passed to system shells, they may contain command injection payloads.

This threat is distinct from the model itself being compromised; it arises from the way downstream systems consume model outputs. A model that produces correct, well-intentioned outputs can still introduce vulnerabilities if those outputs happen to contain characters or patterns that downstream systems interpret as executable code. For generative models, this risk is particularly acute because the model’s output is free-form text that may contain any character sequence.
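As a minimal illustration of the point above, the sketch below shows a benign output breaking a query built by string concatenation. The table name, the sample output, and the helper `build_query_unsafely` are hypothetical and exist only for this example; the concatenation is shown as an anti-pattern, not a recommendation.

```python
import sqlite3


def build_query_unsafely(model_output: str) -> str:
    # Anti-pattern, shown for illustration only: the model output is spliced
    # directly into the SQL text, so its characters become part of the query.
    return f"INSERT INTO notes (body) VALUES ('{model_output}')"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")

# A correct, well-intentioned output that happens to contain an apostrophe.
benign_output = "The customer's order has shipped."

try:
    conn.execute(build_query_unsafely(benign_output))
    concatenation_failed = False
except sqlite3.OperationalError:
    # The apostrophe ends the SQL string literal early and the rest of the
    # sentence is parsed as SQL, which SQLite rejects as a syntax error.
    concatenation_failed = True
```

No attacker is involved here: an ordinary apostrophe is enough to change how the downstream system parses the text. A crafted output could use the same mechanism to inject working SQL rather than a syntax error.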

The threat assessment should identify every downstream system that consumes the model’s output and evaluate the injection risk for each consumption path. Web rendering, database queries, file system operations, email generation, and API calls to other services are all potential injection vectors. The assessment feeds into Module 9 and informs the output handling controls described in the Controls section.
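One way to record such an assessment is as a small structured inventory, sketched below. The class names, field names, risk scale, and every entry are placeholders invented for this example; they are not taken from any real system or from the AISDP modules.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class ConsumptionPath:
    downstream_system: str  # component that consumes the model output
    sink: str               # how the output is interpreted, e.g. "html", "sql"
    injection_class: str    # vulnerability the path could expose
    risk: Risk              # assessed injection risk for this path
    control: str            # planned output handling control


# Placeholder entries for illustration only.
INVENTORY = [
    ConsumptionPath("support portal", "html", "XSS", Risk.HIGH, "HTML encoding"),
    ConsumptionPath("case database", "sql", "SQL injection", Risk.HIGH, "parameterised queries"),
    ConsumptionPath("report archive", "file path", "path traversal", Risk.MEDIUM, "filename allow-list"),
    ConsumptionPath("audit log", "plain text", "log forging", Risk.LOW, "newline stripping"),
]


def paths_needing_review(inventory, threshold=Risk.HIGH):
    """Return the consumption paths whose risk is at or above the threshold."""
    return [p for p in inventory if p.risk.value >= threshold.value]
```

Keeping one record per consumption path, rather than one per downstream system, matches the assessment described above: a single system may consume the output through several sinks, each with a different risk.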

Key outputs

Downstream system inventory with injection risk per consumption path
XSS, SQL injection, and command injection vector assessment
Integration with the overall threat model
Module 9 and Module 3 AISDP documentation

Insecure Output — Controls (Untrusted Treatment, Encoding, Schema Validation, Sandboxing)

The foundational principle is that all model outputs are treated as untrusted input by downstream systems. This architectural decision removes the root cause of an entire class of vulnerabilities, because no downstream component is allowed to assume that model outputs are safe.

Output encoding ensures that special characters in model outputs are escaped before they reach consuming systems. For web rendering, HTML entity encoding prevents XSS when the output is placed in HTML element content; outputs placed in attributes, URLs, or script blocks need the encoding appropriate to that context. For database operations, parameterised queries prevent SQL injection. For system commands, shell escaping or, preferably, avoiding shell invocation entirely prevents command injection.

Schema validation, enforced by a dedicated validation middleware on the inference output path, verifies that every output conforms to the expected structure before it is passed downstream. Outputs that fail validation are replaced with safe default responses and the failure is logged.
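The three encoding controls above can be sketched with the Python standard library. The function names and the `outputs` table are hypothetical; `html.escape`, `sqlite3` placeholders, and `subprocess.run` with an argument list are standard library features.

```python
import html
import sqlite3
import subprocess
import sys


def render_html(model_output: str) -> str:
    # Escape <, >, & and quotes so the output is displayed as text
    # in HTML element content rather than parsed as markup.
    return f"<p>{html.escape(model_output, quote=True)}</p>"


def store_output(conn: sqlite3.Connection, model_output: str) -> None:
    # The ? placeholder sends the output as data; it is never parsed as SQL.
    conn.execute("INSERT INTO outputs (body) VALUES (?)", (model_output,))


def echo_via_subprocess(model_output: str) -> str:
    # An argument list with no shell: metacharacters such as ; or | are
    # delivered to the child process as literal text.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", model_output],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")
```

Each function applies the control at the point where the output crosses into the consuming system, so the same raw output can safely reach all three sinks without being modified upstream.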

Sandboxed execution environments provide a final layer of defence. If model outputs must be executed (for example, in code generation systems), the execution occurs in an isolated sandbox with no access to production systems, no network access, and enforced resource limits. The output validation layer is enforced at the infrastructure level, not within the model’s code, so that no application code path can bypass it. Module 3 should describe the output validation layer as a distinct architectural component.
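The sketch below illustrates only the resource-limit ingredient of such a sandbox, on POSIX systems, using a child process. It is not a complete sandbox: it provides no network isolation and no real filesystem isolation, which would have to come from container- or VM-level controls outside this code. The function names and the specific limits are assumptions chosen for the example.

```python
import resource  # POSIX only; not available on Windows
import subprocess
import sys
import tempfile


def _apply_limits() -> None:
    # Runs in the child just before exec: cap CPU time at 1 second and
    # the size of any file the child writes at 1 MiB.
    resource.setrlimit(resource.RLIMIT_CPU, (1, 1))
    resource.setrlimit(resource.RLIMIT_FSIZE, (1024 * 1024, 1024 * 1024))


def run_generated_code(code: str, timeout_s: float = 5.0):
    """Run model-generated Python in a limited child process.

    Returns the CompletedProcess, or None if the wall-clock timeout expired.
    """
    with tempfile.TemporaryDirectory() as workdir:
        try:
            return subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: isolated mode
                cwd=workdir,             # throwaway working directory
                env={},                  # do not pass parent environment variables
                capture_output=True,
                text=True,
                timeout=timeout_s,       # wall-clock limit
                preexec_fn=_apply_limits,
            )
        except subprocess.TimeoutExpired:
            return None
```

A caller would treat a `None` result or a non-zero `returncode` as a failed run and discard the output. The CPU limit and the wall-clock timeout are both set because a process that sleeps consumes no CPU time and would otherwise never hit the CPU limit.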

Key outputs

Untrusted-output architectural principle documented in Module 3
Output encoding per downstream consumption path
Schema validation middleware on the inference output path
Sandboxed execution for code-generating systems
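The schema validation middleware listed above might, in its simplest form, look like the following sketch. The expected fields, the allowed labels, the safe default, and the function name `validate_output` are all hypothetical; a production system would more likely use a dedicated schema library, and the standard library is used here only to keep the example self-contained.

```python
import json
import logging

logger = logging.getLogger("output_validation")

# Hypothetical output contract for this example.
EXPECTED_FIELDS = {"label": str, "confidence": float}
ALLOWED_LABELS = {"approve", "reject", "review"}
SAFE_DEFAULT = {"label": "review", "confidence": 0.0}


def validate_output(raw: str) -> dict:
    """Return the parsed output if it matches the contract, else a safe default."""
    try:
        data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
        if not isinstance(data, dict) or set(data) != set(EXPECTED_FIELDS):
            raise ValueError("unexpected top-level structure")
        for name, expected_type in EXPECTED_FIELDS.items():
            if not isinstance(data[name], expected_type):
                raise ValueError(f"field {name!r} has the wrong type")
        if data["label"] not in ALLOWED_LABELS:
            raise ValueError("label is not in the allowed set")
        if not 0.0 <= data["confidence"] <= 1.0:
            raise ValueError("confidence is out of range")
        return data
    except ValueError as exc:
        # Log the failure and substitute the safe default response.
        logger.warning("model output failed validation: %s", exc)
        return dict(SAFE_DEFAULT)
```

The check is deliberately strict: extra fields, wrong types, and free-text labels are all rejected, so only values from a known, closed set ever reach downstream systems.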