Cloud Deployment (Multi-AZ, Multi-Region)
For cloud-hosted high-risk AI systems, the AISDP must specify the cloud provider, the deployment region, the specific services used, and the instance types and resource allocations for inference workloads. Where the system processes personal data, deployment should take place within the EU/EEA unless a valid GDPR Chapter V transfer mechanism is in place; adequacy decisions under GDPR Article 45 (including the EU-US Data Privacy Framework for certified US organisations), standard contractual clauses under GDPR Article 46(2)(c), and binding corporate rules under GDPR Article 47 may each provide a lawful basis for third-country processing. The chosen mechanism, along with an assessment of its suitability for the specific data categories and processing activities, must be documented in the AISDP.
Multi-availability-zone deployment ensures that the system survives the failure of a single data centre within a region. Multi-region deployment provides resilience against the failure of an entire cloud region, though it introduces additional complexity around data consistency and latency. The AISDP must document the resilience architecture, including the failover mechanism (active-active or active-passive), the expected failover time, and the data consistency model during failover.
Cloud provider data processing agreements must be in place and referenced in the AISDP. These agreements confirm that the provider processes data only on documented instructions, that appropriate security measures are in place, and that the provider assists with GDPR compliance obligations. For systems using managed AI services (SageMaker, Vertex AI, Azure Machine Learning, Databricks), the AISDP must additionally document which managed services are used, what data flows through them, the provider’s data handling practices, availability SLAs, and fallback strategies.
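The residency rule described above lends itself to an automated check over the deployment specification. The following is a minimal sketch, assuming a hypothetical AISDP record structure; the region identifiers, category labels, and field names are illustrative, not prescribed by any framework.

```python
# Illustrative check: a deployment record that processes personal data must
# either sit in an EU/EEA region or document a valid GDPR Chapter V
# transfer mechanism. All names below are assumptions for the sketch.

EEA_REGIONS = {"eu-west-1", "eu-central-1", "europe-west4"}  # example region IDs
VALID_TRANSFER_MECHANISMS = {
    "adequacy_decision",             # GDPR Art. 45 (e.g. EU-US Data Privacy Framework)
    "standard_contractual_clauses",  # GDPR Art. 46(2)(c)
    "binding_corporate_rules",       # GDPR Art. 47
}

def validate_deployment(spec: dict) -> list[str]:
    """Return a list of compliance findings for one deployment record."""
    findings = []
    if spec.get("processes_personal_data") and spec["region"] not in EEA_REGIONS:
        if spec.get("transfer_mechanism") not in VALID_TRANSFER_MECHANISMS:
            findings.append(
                f"Region {spec['region']} is outside the EU/EEA and no "
                "valid GDPR Chapter V transfer mechanism is documented."
            )
    # The AISDP must also carry the basic deployment specification fields.
    for field in ("provider", "region", "services", "instance_types"):
        if not spec.get(field):
            findings.append(f"AISDP field missing: {field}")
    return findings
```

A check like this can run in CI against the machine-readable portion of the AISDP, so that an undocumented third-country deployment is caught before release rather than at audit time.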
Key outputs
- Cloud deployment specification (provider, region, services, instance types)
- Resilience architecture documentation (multi-AZ, multi-region, failover)
- Cloud provider data processing agreements
- Managed AI service dependency documentation
Containerisation
Containerisation with Docker and orchestration with Kubernetes provide the infrastructure for reproducible, versioned deployment environments. A Docker container image is immutable once built: it captures the exact operating system, libraries, framework versions, and application code that constitute the runtime environment. This immutability is valuable for compliance because the container image tested during conformity assessment is exactly the image that runs in production.
Module 3 of the AISDP captures the container image build process (Dockerfile, base images, build arguments), the container registry (a private registry with access controls and image signing), the orchestration configuration (Kubernetes manifests, Helm charts, deployment strategies), and resource limits and scaling policies. Each container image is tagged with the corresponding code and model version and stored in the private registry with access logging.
The supply chain risk in containerisation is the base image. A container built from python:3.11-slim inherits whatever is in that base image at build time. The mitigation is to pin the base image to a specific digest (a SHA-256 hash), scan the built image for vulnerabilities with Trivy, Grype, or Snyk Container, sign it with Docker Content Trust or Sigstore cosign, and store it in a private registry such as Harbor, AWS ECR, Azure ACR, or Google Artifact Registry. The CI pipeline should fail if the container scan reveals critical or high-severity vulnerabilities.
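The CI gate described above can be implemented as a small script over the scanner's report. This sketch parses a Trivy JSON report (`Results[].Vulnerabilities[].Severity` in Trivy's `--format json` output); Grype and Snyk use different report shapes, so the field names would need adjusting for those tools.

```python
# Sketch of a CI gate that fails the pipeline stage when a container scan
# reports critical or high-severity vulnerabilities. The JSON shape follows
# Trivy's JSON report format; adjust field names for other scanners.
import json

BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}

def blocking_vulns(report: dict) -> list[dict]:
    """Collect vulnerabilities at blocking severity from a Trivy report."""
    found = []
    for result in report.get("Results", []):
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in BLOCKING_SEVERITIES:
                found.append(vuln)
    return found

def gate(report_path: str) -> int:
    """Return a process exit code: non-zero fails the CI stage."""
    with open(report_path) as fh:
        report = json.load(fh)
    vulns = blocking_vulns(report)
    for v in vulns:
        print(f"{v.get('VulnerabilityID')}: {v.get('PkgName')} ({v.get('Severity')})")
    return 1 if vulns else 0
```

The report itself, together with the gate's pass/fail record, is the artefact retained as Module 3 evidence.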
Key outputs
- Dockerfiles with pinned base image digests
- Private container registry with access controls and image signing
- Kubernetes manifests or Helm charts for orchestration
- Container vulnerability scan results as Module 3 evidence
Edge Deployment Considerations
Systems embedded in physical products, deployed in air-gapped environments, or operating at the network edge present distinct infrastructure documentation requirements. The AISDP must specify the target hardware platform, the model optimisation techniques used (quantisation to INT8 or FP16, structured pruning, knowledge distillation, framework-specific compilation with TensorRT, ONNX Runtime, TensorFlow Lite, or Core ML), and their measured impact on accuracy and fairness.
Each optimisation technique alters the model’s behaviour to some degree. The AISDP must document which techniques were applied, the performance and fairness evaluation results for the optimised model (not merely the original), and any subgroups for which the optimisation disproportionately affects accuracy. The evaluation of the optimised model, not the pre-optimisation model, is the compliance-relevant evidence.
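The subgroup comparison described above reduces to a per-group delta between the original and optimised evaluations. A minimal sketch, in which the subgroup names and the 2-percentage-point tolerance are assumptions to be replaced by the organisation's own thresholds:

```python
# Illustrative comparison of original vs. optimised model accuracy per
# subgroup, flagging groups where optimisation cost exceeds a tolerance.
# Subgroup labels and the max_drop threshold are example values.

def optimisation_impact(original: dict, optimised: dict,
                        max_drop: float = 0.02) -> dict:
    """Per-subgroup accuracy delta; flags subgroups exceeding max_drop."""
    report = {}
    for subgroup, acc in original.items():
        delta = acc - optimised[subgroup]
        report[subgroup] = {"delta": round(delta, 4), "flagged": delta > max_drop}
    return report
```

A flagged subgroup does not automatically block deployment, but it does require documented analysis in the AISDP, since the optimised model's evaluation is the compliance-relevant evidence.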
Edge-deployed models require an over-the-air update mechanism supporting four capabilities: version verification through cryptographic signatures, rollback capability if on-device validation fails, staged rollout to a subset of devices before full fleet deployment, and update logging recording which version each device runs. Logging and monitoring in disconnected environments require on-device log buffering, log integrity during buffering, batch upload when connectivity is restored, and graceful degradation of monitoring capabilities. Physical security measures (tamper-evident enclosures, secure boot, encrypted storage) must also be documented.
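Three of the four OTA capabilities (verification, rollback, staged rollout, logging) can be sketched compactly. Note the loud assumption: HMAC is used below purely as a stand-in for a real asymmetric signature scheme (in practice, something like Ed25519 via a cryptography library, so devices hold only the public key); device IDs, versions, and the rollout bucketing are illustrative.

```python
# Minimal OTA sketch covering signature verification, staged rollout,
# rollback on failed on-device validation, and update logging.
# HMAC stands in for an asymmetric signature scheme; do not ship a shared
# secret to edge devices in a real deployment.
import hashlib
import hmac

SIGNING_KEY = b"example-shared-key"  # placeholder for a real signing key

def verify_update(payload: bytes, signature: bytes) -> bool:
    """Version verification: reject payloads whose signature does not match."""
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)

def staged_cohort(device_id: str, rollout_fraction: float) -> bool:
    """Staged rollout: deterministically select a subset of the fleet."""
    bucket = int(hashlib.sha256(device_id.encode()).hexdigest(), 16) % 100
    return bucket < rollout_fraction * 100

update_log: dict[str, str] = {}  # update logging: device_id -> running version

def apply_update(device_id: str, version: str, payload: bytes,
                 signature: bytes, validate) -> str:
    """Apply an update, rolling back if on-device validation fails."""
    previous = update_log.get(device_id, "v0")
    if not verify_update(payload, signature):
        return previous                   # reject unverifiable payload
    update_log[device_id] = version
    if not validate():                    # on-device validation hook
        update_log[device_id] = previous  # rollback
    return update_log[device_id]
```

The `update_log` here is the in-memory analogue of the fleet-wide version record the AISDP requires; in production it would be a durable, auditable store.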
Key outputs
- Target hardware and optimisation technique documentation
- Performance and fairness evaluation of the optimised model
- Over-the-air update mechanism specification
- Disconnected monitoring and physical security documentation
Data Sovereignty & Residency
Organisations deploying across multiple EU member states must map each data category to its residency constraints and document how the infrastructure enforces those constraints. Different member states may impose data residency requirements through national legislation, sector-specific regulation, or competent authority guidance. Health data in certain jurisdictions must remain within the member state’s borders; financial data may be subject to sector-specific localisation requirements.
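The mapping described above is most useful when it is machine-checkable: each data category maps to the jurisdictions where it may reside, and every deployed resource is validated against that mapping. A sketch, with illustrative category names and region-to-jurisdiction assignments:

```python
# Illustrative data-residency mapping and enforcement check. Categories,
# jurisdictions, and region assignments are examples, not legal advice.

RESIDENCY_MAP = {
    "health": {"DE"},                 # example: must remain in-country
    "financial": {"DE", "FR"},        # example: sector-specific localisation
    "telemetry": {"DE", "FR", "IE"},  # example: no special constraint
}
REGION_JURISDICTION = {"eu-central-1": "DE", "eu-west-3": "FR", "eu-west-1": "IE"}

def residency_violations(resources: list[dict]) -> list[str]:
    """Flag resources whose region falls outside their category's allowed set."""
    violations = []
    for res in resources:
        jurisdiction = REGION_JURISDICTION[res["region"]]
        allowed = RESIDENCY_MAP[res["data_category"]]
        if jurisdiction not in allowed:
            violations.append(
                f"{res['name']}: {res['data_category']} data in {jurisdiction}, "
                f"allowed: {sorted(allowed)}"
            )
    return violations
```

The same mapping doubles as AISDP documentation of the enforcement mechanism: the table is the policy, and the check is the evidence that infrastructure conforms to it.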
The data sovereignty analysis must distinguish between training data flows and inference data flows. Training typically occurs in a central location; inference occurs wherever the system is deployed. For systems where inference inputs contain personal data, the inference infrastructure must process the data within the jurisdiction where the data subject resides, or the organisation must have a valid GDPR Chapter V transfer mechanism.
Deploying the same model across multiple jurisdictions also raises the question of whether the model itself constitutes a data transfer. If the model was trained on personal data from one jurisdiction and deployed in another, regulators may consider the model's learned parameters to be derived personal data. The Article 29 Working Party's guidance on anonymisation techniques (Opinion 05/2014) and the Recital 26 "means reasonably likely to be used" test provide the current analytical framework for assessing whether learned parameters retain personal data character. The organisation should document its position on this question and the analysis supporting it, referencing the applicable framework. Systems deployed across multiple regions must also validate inference consistency, running a standardised test suite across all deployment regions to verify that outputs are identical or within defined tolerance bounds.
Key outputs
- Data residency mapping per data category and jurisdiction
- Infrastructure enforcement mechanisms (region-locked storage policies)
- Cross-border model deployment analysis
- Multi-region inference consistency validation results
Disaster Recovery Planning (RTO, DR Region)
High-risk AI systems must be resilient to infrastructure failures. Module 3 captures the recovery point objective (RPO) and recovery time objective (RTO), the backup strategy for model artefacts, configuration, and critical data, the failover architecture, the disaster recovery testing schedule and results, and the degraded-mode behaviour.
For systems where the AI component is safety-critical, the failsafe behaviour must be explicitly documented: when the AI system fails, what default behaviour takes over? For a recruitment screening system, the failsafe might be routing all applications to human review. For a medical diagnostic system, the failsafe might be displaying a warning that the AI assessment is unavailable. The choice of failsafe behaviour is a design decision with compliance implications, as it determines how the system behaves when it cannot fulfil its intended purpose.
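The failsafe pattern described above is structurally simple: the AI path is attempted, and on failure the documented default behaviour takes over rather than the system failing silently. A sketch using the recruitment-screening example from the paragraph; the function and field names are hypothetical.

```python
# Illustrative failsafe wrapper for the recruitment-screening example:
# when AI inference fails, the application is routed to human review
# instead of receiving no decision. Names are illustrative.

def screen_application(application: dict, model_predict) -> dict:
    """Return the AI assessment, or route to human review on failure."""
    try:
        score = model_predict(application)
        return {"decision": "ai_scored", "score": score}
    except Exception:
        # Failsafe: no automated decision is made; a human reviews instead.
        return {"decision": "human_review", "score": None}
```

The compliance-relevant point is that the `except` branch is a documented design decision, not an accident of error handling: the AISDP records that this is what the system does when it cannot fulfil its intended purpose.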
The disaster recovery plan is tested periodically, and the test results are retained as Module 3 evidence. The test should verify that the system can be restored within the declared RTO, that the restored system serves the correct model version, that no data is lost beyond the declared RPO, and that the failsafe behaviour activates correctly when the primary system is unavailable.
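The four verification points above can be expressed as a simple pass/fail harness over the measured results of a DR exercise. The declared values and field names below are illustrative placeholders for the organisation's own RTO/RPO declarations:

```python
# Sketch of a DR test check: compare measured restore metrics against the
# declared RTO/RPO, expected model version, and failsafe activation.
# Declared values and field names are example placeholders.

DECLARED = {"rto_seconds": 900, "rpo_seconds": 300, "model_version": "2.4.1"}

def dr_test_passed(measured: dict) -> tuple[bool, list[str]]:
    """Return (passed, failure reasons) for one DR test run."""
    failures = []
    if measured["restore_seconds"] > DECLARED["rto_seconds"]:
        failures.append("restore exceeded declared RTO")
    if measured["data_loss_seconds"] > DECLARED["rpo_seconds"]:
        failures.append("data loss exceeded declared RPO")
    if measured["restored_model_version"] != DECLARED["model_version"]:
        failures.append("restored system serves wrong model version")
    if not measured["failsafe_activated"]:
        failures.append("failsafe did not activate during outage")
    return (not failures, failures)
```

The structured failure reasons, retained alongside the raw test logs, form the Module 3 evidence that the declared recovery objectives were actually met in testing.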
Key outputs
- RPO and RTO specifications
- Failover architecture and failsafe behaviour documentation
- Disaster recovery test schedule and results
- Module 3 and Module 9 AISDP evidence