Vector Database Security — Write/Read Separation

v2.4.0


Systems using retrieval-augmented generation, semantic search, or embedding-based matching store dense vector embeddings in specialised vector stores (for example Pinecone, Weaviate, Qdrant, Milvus, Chroma, or the pgvector extension for PostgreSQL). Access control must enforce separation between write access (used during knowledge base indexing) and read access (used during inference-time retrieval).

The indexing pipeline authenticates as a dedicated service identity with write permissions; the inference service authenticates as a separate identity with read-only permissions. Administrative operations (index deletion, schema changes, bulk exports) require elevated privileges and produce audit log entries. This separation limits the impact of a compromise in either direction: a compromised inference service holds no write credentials and so cannot modify the knowledge base, and a compromised indexing pipeline holds no read credentials on the query path and so cannot exfiltrate query patterns.
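The permission model described above can be sketched as a small policy check. This is a minimal illustration only, not the API of any particular vector database: the identity names (`indexing-pipeline`, `inference-service`, `vector-db-admin`), the `Operation` enum, and the `AccessPolicy` class are all hypothetical. The sketch also assumes that denied attempts at administrative operations are audited alongside permitted ones, which the text above does not specify.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum, auto


class Operation(Enum):
    UPSERT = auto()        # write: used by the indexing pipeline
    QUERY = auto()         # read: used at inference time
    DROP_INDEX = auto()    # administrative
    ALTER_SCHEMA = auto()  # administrative
    BULK_EXPORT = auto()   # administrative


ADMIN_OPERATIONS = frozenset(
    {Operation.DROP_INDEX, Operation.ALTER_SCHEMA, Operation.BULK_EXPORT}
)

# Hypothetical service identities, each with the narrowest useful grant.
ROLE_PERMISSIONS = {
    "indexing-pipeline": frozenset({Operation.UPSERT}),
    "inference-service": frozenset({Operation.QUERY}),
    "vector-db-admin": ADMIN_OPERATIONS,
}


@dataclass
class AccessPolicy:
    """Deny-by-default permission check with an audit trail for admin operations."""

    audit_log: list = field(default_factory=list)

    def authorize(self, identity: str, operation: Operation) -> bool:
        # Unknown identities receive an empty permission set, so they are denied.
        allowed = operation in ROLE_PERMISSIONS.get(identity, frozenset())
        if operation in ADMIN_OPERATIONS:
            # Assumption: every administrative attempt is logged, allowed or not.
            self.audit_log.append(
                {
                    "time": datetime.now(timezone.utc).isoformat(),
                    "identity": identity,
                    "operation": operation.name,
                    "allowed": allowed,
                }
            )
        return allowed


policy = AccessPolicy()
can_read = policy.authorize("inference-service", Operation.QUERY)
can_write = policy.authorize("inference-service", Operation.UPSERT)
```

In a real deployment the same mapping would normally be expressed through the database's own role or API-key mechanism rather than in application code; the sketch only shows the shape of the policy, in which neither service identity holds the other's permission and administrative operations sit behind a third identity.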

Encryption at rest protects stored embeddings. As discussed in Article 97 (GDPR status of stored embeddings), embeddings derived from documents containing personal data may themselves constitute personal data under GDPR. The encryption, retention, and deletion requirements that apply to the source documents therefore extend to the embeddings. The vector database security configuration is documented in Module 9.
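One practical consequence of extending deletion requirements to embeddings is that each stored vector needs a link back to its source document, so that erasing the document can also erase every embedding derived from it. The following is a rough sketch of that idea using an in-memory stand-in for a vector database; `EmbeddingStore`, `upsert`, `delete_by_source`, and the ID format are hypothetical names chosen for illustration and do not correspond to any specific product's API.

```python
from dataclasses import dataclass, field


@dataclass
class EmbeddingStore:
    """In-memory stand-in for a vector database (illustrative only)."""

    # Maps vector ID -> (source document ID, embedding values).
    vectors: dict = field(default_factory=dict)

    def upsert(self, vector_id: str, source_doc_id: str, embedding: list) -> None:
        # Record the source document with every vector so deletions can be traced.
        self.vectors[vector_id] = (source_doc_id, list(embedding))

    def delete_by_source(self, source_doc_id: str) -> list:
        """Remove every embedding derived from one source document.

        Returns the IDs of the removed vectors so the caller can record them.
        """
        doomed = [
            vector_id
            for vector_id, (doc_id, _) in self.vectors.items()
            if doc_id == source_doc_id
        ]
        for vector_id in doomed:
            del self.vectors[vector_id]
        return doomed


store = EmbeddingStore()
store.upsert("doc-17#0", "doc-17", [0.12, -0.40])
store.upsert("doc-17#1", "doc-17", [0.33, 0.08])
store.upsert("doc-42#0", "doc-42", [-0.21, 0.57])

# When the source document "doc-17" is deleted, its embeddings go with it.
removed = store.delete_by_source("doc-17")
```

Most vector databases support storing metadata alongside each vector, which is where a source document identifier of this kind would typically live; whether deletion by metadata filter is available depends on the product and should be checked against its documentation.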

Key outputs

Write/read access separation with dedicated service identities
Elevated privilege requirements for administrative operations
Encryption at rest for stored embeddings
Module 9 AISDP documentation