Data Governance

v2.4.0

Data governance addresses the EU AI Act’s Article 10 requirements for training, validation, and testing data. This section covers 43 articles across nine subsections, spanning the full data lifecycle from initial documentation through bias assessment, mitigation, lineage tracking, and specialised governance for RAG architectures.

The subsections are organised to mirror the compliance workflow. Dataset documentation establishes provenance and composition. Completeness assessment evaluates representativeness. Pre-training bias assessment examines data for distributional imbalances, label bias, and proxy variables before any model is trained. Post-training bias evaluation applies five fairness metrics to the trained model’s outputs. Bias mitigation documents the techniques applied and their effectiveness.

Data lineage and version control ensure every transformation is traceable and every dataset version retrievable for ten years. Special category data handling addresses the Article 10(5) provisions for processing sensitive personal data in bias detection. RAG-specific governance extends Article 10’s requirements to knowledge bases, embeddings, and multilingual performance. The section concludes with the artefacts produced.

Note: This section corresponds to the Data Governance section and feeds primarily into AISDP Module 4 (Data Governance and Dataset Documentation).