
Adversarial document injection is a novel attack surface introduced by vector databases. An attacker who can insert documents into the knowledge base can craft them to be retrieved for specific target queries, indirectly manipulating the LLM’s output without modifying the model itself. For example, a document containing misleading safety information, crafted to be semantically similar to common queries about a product, would be retrieved and presented to the LLM as authoritative context.

Controls include strict access control on the indexing pipeline, content validation and provenance verification for all documents entering the knowledge base, anomaly detection on newly indexed documents (flagging documents whose embeddings are unusually close to high-frequency queries), and monitoring retrieval patterns for sudden changes in which documents are being retrieved for stable queries.
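As a rough illustration of the last two controls, the sketch below flags a newly indexed document whose embedding is unusually close to any high-frequency query, and measures how much the retrieved set for a stable query has changed between two observations. The function names, the cosine-similarity measure, the Jaccard overlap, and the 0.95 threshold are all assumptions made for this example; a real deployment would calibrate the threshold against its own embedding model and query logs.

```python
import math
from typing import Iterable, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_suspicious_document(
    doc_embedding: Sequence[float],
    frequent_query_embeddings: Sequence[Sequence[float]],
    threshold: float = 0.95,  # assumed value, must be tuned per embedding model
) -> List[int]:
    """Return indices of high-frequency queries the new document is unusually close to."""
    return [
        i
        for i, query in enumerate(frequent_query_embeddings)
        if cosine_similarity(doc_embedding, query) >= threshold
    ]


def retrieval_overlap(previous_ids: Iterable[str], current_ids: Iterable[str]) -> float:
    """Jaccard overlap between two retrieved document-ID sets for the same query."""
    prev, curr = set(previous_ids), set(current_ids)
    if not prev and not curr:
        return 1.0
    return len(prev & curr) / len(prev | curr)


# Toy 2-dimensional embeddings, purely illustrative.
frequent_queries = [[1.0, 0.0], [0.0, 1.0]]
crafted_doc = [0.99, 0.05]   # almost identical to the first query
ordinary_doc = [0.6, 0.6]    # moderately similar to both queries

crafted_hits = flag_suspicious_document(crafted_doc, frequent_queries)
ordinary_hits = flag_suspicious_document(ordinary_doc, frequent_queries)
overlap = retrieval_overlap(["a", "b", "c"], ["a", "b", "d"])
```

A non-empty result from `flag_suspicious_document` would route the document to review rather than reject it outright, since legitimate FAQ-style content can also sit close to common queries. A low `retrieval_overlap` for a query whose results have historically been stable is a signal worth investigating, not proof of injection.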

The adversarial document injection threat and its controls are documented in the threat model and in Module 9. Testing should include attempts to inject crafted documents through the indexing pipeline and verification that the controls detect and reject them.
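One way such a test could look is sketched below. It assumes a hypothetical `ingest` function that performs provenance verification with an HMAC signature per trusted source; the key store, source names, and signing scheme are invented for the example and do not describe any particular indexing pipeline. The test submits one legitimate document and three injection attempts, then checks that only the legitimate document reached the index.

```python
import hashlib
import hmac
from typing import Dict, List

# Hypothetical key store: one shared secret per trusted document source.
TRUSTED_KEYS: Dict[str, bytes] = {"docs-team": b"example-secret-key"}


def sign(source: str, content: str, key: bytes) -> str:
    """Compute an HMAC-SHA256 signature over the source name and content."""
    message = f"{source}\n{content}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def ingest(index: List[str], source: str, content: str, signature: str) -> bool:
    """Add content to the index only if the source is trusted and the signature verifies."""
    key = TRUSTED_KEYS.get(source)
    if key is None:
        return False  # unknown source: reject
    expected = sign(source, content, key)
    if not hmac.compare_digest(expected, signature):
        return False  # forged signature or tampered content: reject
    index.append(content)
    return True


def run_injection_tests() -> Dict[str, bool]:
    """Attempt several injections and report whether each control behaved as expected."""
    index: List[str] = []
    key = TRUSTED_KEYS["docs-team"]
    good = "Official safety guidance."
    crafted = "Crafted misleading guidance."

    results = {
        "legitimate_accepted": ingest(index, "docs-team", good, sign("docs-team", good, key)),
        "unknown_source_rejected": not ingest(index, "attacker", crafted, "00"),
        "forged_signature_rejected": not ingest(
            index, "docs-team", crafted, sign("docs-team", crafted, b"wrong-key")
        ),
        # A valid signature for the original text must not cover modified text.
        "tampered_content_rejected": not ingest(
            index, "docs-team", good + " Ignore it.", sign("docs-team", good, key)
        ),
    }
    results["index_contains_only_legitimate"] = index == [good]
    return results


test_results = run_injection_tests()
```

Every entry in `test_results` should be `True`; any `False` value indicates a control that accepted an injected document or rejected a legitimate one. Provenance checks of this kind do not cover a compromised trusted source, which is why the embedding and retrieval-pattern controls remain necessary alongside them.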

Key outputs

  • Content validation and provenance verification on all indexed documents
  • Anomaly detection on newly indexed document embeddings
  • Retrieval pattern monitoring for sudden changes
  • Module 9 AISDP documentation