Reference Guide

AI Governance Frameworks

A practical reference for building AI systems that are safe, compliant, explainable, and trustworthy — covering the NIST AI RMF, the EU AI Act, Responsible AI principles, and the governance controls that matter specifically for agentic systems.

01 Why AI governance matters

The shift from generative AI to agentic AI changes the stakes — autonomous systems take real actions on real systems, and "the model said so" is no longer an acceptable audit trail.

Trust

Users and regulators need confidence that AI behaves predictably and within stated bounds.

Compliance

Laws like the EU AI Act now carry penalties of up to 7% of global revenue for non-compliance.

Risk

Bias, hallucinations, prompt injection, data leakage, and unsafe tool calls are real production risks.

Governance is product work, not paperwork.
Effective AI governance shows up in code: policy engines, audit logs, RBAC, output validators, and human-in-the-loop checkpoints — not just a PDF policy document.

02 NIST AI RMF

NIST AI Risk Management Framework — issued by the U.S. National Institute of Standards and Technology. A voluntary, sector-agnostic framework for managing AI risks systematically across the AI lifecycle.

Purpose

Provide organizations with a structured way to identify, measure, and reduce risks from AI systems — covering bias, hallucinations, privacy violations, lack of explainability, robustness failures, security weaknesses, and inadequate human oversight.

Four core functions

Govern
Establish policies, accountability, oversight roles, and risk tolerance across the organization.
Map
Identify the context, intended use, stakeholders, and potential risks of each AI system.
Measure
Evaluate the system using qualitative and quantitative metrics — accuracy, fairness, robustness, drift.
Manage
Prioritize, treat, and continuously monitor risks — incident response, decommissioning, retraining.
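
One way to make the four functions concrete in engineering terms is a per-system risk register. The sketch below is purely illustrative; its field names are assumptions for this guide, not official NIST AI RMF terminology.

# Illustrative risk-register entry tying one AI system to the four RMF functions.
# Field names are assumptions for illustration, not official NIST AI RMF terms.
from dataclasses import dataclass, field

@dataclass
class RiskEntry:
    system: str                                       # which AI system this entry covers
    context: str                                      # Map: intended use, stakeholders, deployment context
    risk: str                                         # Map: the identified risk
    metrics: dict = field(default_factory=dict)       # Measure: quantitative/qualitative measurements
    mitigations: list = field(default_factory=list)   # Manage: treatments and monitoring actions
    owner: str = "unassigned"                         # Govern: accountable role defined by org policy
    risk_tolerance: str = "medium"                    # Govern: tolerance set at the organizational level

entry = RiskEntry(
    system="resume-screening-agent",
    context="Ranks job applications for human recruiters in the EU",
    risk="Disparate impact across protected groups",
    metrics={"selection_rate_ratio": 0.74},
    mitigations=["re-balance training data", "quarterly fairness audit"],
    owner="ml-governance-team",
)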

Focus areas

Bias · Hallucinations · Privacy · Explainability · Robustness · Security · Human oversight · Validity & reliability · Accountability · Transparency

When to use NIST AI RMF

Reach for the RMF when you need a shared risk vocabulary and an internal governance structure rather than a legal mandate: it is voluntary, sector-agnostic, and applicable to any AI system, and its functions map cleanly onto EU AI Act obligations when you later need compliance evidence (see the framework comparison in section 05).

03 EU AI Act

The European Union's landmark AI law — the world's first comprehensive horizontal regulation of AI systems. It classifies AI by risk level and imposes obligations proportional to the risk.

Risk-based classification

Unacceptable
Banned outright. Social scoring by governments, real-time biometric ID in public (with narrow exceptions), manipulative or exploitative AI targeting vulnerable groups.
High Risk
Heavy obligations. AI in healthcare diagnostics, hiring, credit scoring, education admissions, law enforcement, biometric identification, critical infrastructure, judicial decisions.
Limited Risk
Transparency obligations. Chatbots, emotion recognition, deepfakes — users must be informed they are interacting with AI or viewing AI-generated content.
Minimal Risk
No obligations. Spam filters, AI in video games, basic recommender systems. Voluntary codes of conduct encouraged.

Requirements for high-risk AI

Requirement | What it means in practice
Risk management system | Continuous identification, evaluation, and mitigation of foreseeable risks across the lifecycle.
Data governance | Training, validation, and test data must be relevant, representative, free of errors, and complete.
Technical documentation | Detailed dossier covering system design, training data, performance, and known limitations — kept up to date.
Logging & auditability | Automatic recording of events sufficient to trace system behavior throughout its lifetime.
Transparency | Users informed they are using AI; instructions for use must be clear, complete, and accessible.
Human oversight | Humans must be able to monitor, intervene, override, or shut down the system.
Accuracy, robustness, security | The system must perform consistently and resist adversarial inputs, errors, and unauthorized access.
Conformity assessment | Pre-market evaluation (often involving notified bodies) before placing the system on the EU market.
Post-market monitoring | Ongoing performance tracking and incident reporting after deployment.

General-purpose AI (GPAI)

Foundation models like GPT-class LLMs face additional obligations: technical documentation, copyright compliance for training data, and — for models with systemic risk — model evaluations, adversarial testing, incident reporting, and cybersecurity protections.

Penalties have teeth.
Up to €35M or 7% of worldwide annual turnover, whichever is higher, for prohibited AI violations — higher than GDPR's 4% ceiling.

04 Responsible AI (RAI)

Responsible AI is the broader philosophy and practice — adopted by Microsoft, Google, OpenAI, IBM, Anthropic, and most enterprise AI teams. While NIST is a framework and the EU AI Act is law, RAI is a set of principles operationalized into engineering practice.

Common RAI principles

Fairness

Avoid discrimination and bias across protected groups. Audit training data, evaluate disparate impact, and retrain when fairness metrics regress (a minimal disparate-impact check is sketched after this list of principles).

Transparency

Users and stakeholders understand how decisions are made — model cards, system cards, plain-language disclosures.

Accountability

A named human (or team) is responsible for the AI system's behavior, including incidents and remediation.

Privacy

Protect user data through minimization, encryption, retention limits, differential privacy where feasible.

Safety

Prevent harmful outputs and unsafe actions — content filters, refusal training, red-team evaluations.

Reliability

Stable and robust operation across distributions, languages, edge cases — and graceful degradation when uncertain.

Human oversight

Humans remain in the loop for consequential decisions — override, audit, escalate, and approve.
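
One way to evaluate disparate impact, referenced in the Fairness principle above, is the selection-rate ratio combined with the widely used four-fifths (80%) heuristic. The sketch below is illustrative; the threshold and data are examples, and real fairness policies define their own groups and cutoffs.

# Minimal disparate-impact check: compare selection rates across groups.
# The 0.8 threshold follows the common "four-fifths rule" heuristic; in practice
# thresholds and group definitions should come from your fairness policy.
def selection_rate(outcomes: list[bool]) -> float:
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

def disparate_impact_ratio(group_a: list[bool], group_b: list[bool]) -> float:
    """Ratio of the lower selection rate to the higher one (1.0 = parity)."""
    rates = sorted([selection_rate(group_a), selection_rate(group_b)])
    return rates[0] / rates[1] if rates[1] > 0 else 0.0

ratio = disparate_impact_ratio(
    group_a=[True, True, False, True],    # selected / not selected per applicant
    group_b=[True, False, False, False],
)
if ratio < 0.8:
    print(f"Fairness metric regressed: disparate impact ratio {ratio:.2f} < 0.8")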

How big tech operationalizes RAI

Org | Public artifact | What it covers
Microsoft | Responsible AI Standard | Six principles + impact assessment template + Office of Responsible AI review for high-risk uses.
Google | AI Principles + SAIF | Seven principles + Secure AI Framework for ML supply chain security.
OpenAI | Usage Policies + System Cards | Per-model evaluation reports covering safety, bias, refusals, jailbreak resistance.
Anthropic | Responsible Scaling Policy | AI Safety Levels (ASL) tied to capability thresholds and required mitigations.
IBM | watsonx.governance | Tooling for model lifecycle, bias detection, drift monitoring, explainability.

05 Framework comparison

The three are complementary, not competing. NIST gives you vocabulary, the EU AI Act gives you obligations, RAI gives you the principles your engineering culture lives by.

Dimension | NIST AI RMF | EU AI Act | Responsible AI
Type | Voluntary risk framework | Binding law & regulation | Principles / engineering practice
Issued by | NIST (US) | European Union | Industry / standards bodies
Main focus | Risk management lifecycle | Legal compliance & market access | Ethical & trustworthy AI
Enforcement | None (voluntary) | Fines up to 7% global revenue | Internal / reputational
Scope | All AI systems | AI placed on EU market or affecting EU persons | Org-wide policy
Key output | Risk profiles & mitigations | Conformity assessment & CE marking | Model cards, impact assessments
Best for | Risk vocabulary & governance | EU market / regulated industries | Day-to-day eng practice
In practice, you'll use all three.
Adopt NIST functions for your internal risk vocabulary, map them onto EU AI Act articles for compliance evidence, and operationalize them through your RAI principles in code and process.

06 Governance for agentic AI

Agents change the threat model. A chatbot returns text; an agent calls APIs, modifies databases, sends emails, deploys code, moves money. Governance must shift from "what did the model say" to "what did the system do."

What makes agents different

They take actions

Agents call tools, hit APIs, write to files, invoke other agents — every action has a real-world side effect that must be authorized, logged, and reversible where possible.

They have memory

Persistent memory introduces data-leakage and prompt-injection vectors that single-turn LLMs don't have. Memory must be governed like any other data store.

They make decisions

Agents choose which tools to call, in what order, with what arguments — those choices need observability, evaluation, and override paths.

They compose

Multi-agent systems amplify risks. One agent's hallucination becomes another agent's grounded "fact." Inter-agent trust boundaries must be explicit.

"The model decided to" is not a defense.
Regulators and courts will hold the deploying organization accountable for agent actions. Architect for accountability from day one.

07 Agent governance controls

Concrete controls you can put in production today. Each control answers a specific failure mode.

Control | What it protects against | Concrete example
Tool permissions | Agent calls dangerous tool unintentionally | Deny-by-default tool list per agent role; allow: ["read_email"], never send_email for a triage agent.
Human approval gates | Irreversible or high-stakes actions taken without review | Block before wire_transfer(), delete_resource(), publish_to_prod() — require human click-through.
Audit logs | No way to investigate incidents or prove compliance | Append-only log: prompt, tools called, arguments, results, timestamps, identity. Immutable storage.
Memory governance | Cross-tenant data leakage; memory poisoning | Per-user memory namespaces; PII redaction on write; signed memory entries; TTL.
Policy engine | Rule violations slip past the model | OPA/Rego-style rules evaluated on every tool call: "no DB writes after 6pm in prod."
Prompt-injection defense | Untrusted content hijacks the agent | Treat retrieved docs as data, not instructions; structural separators; sandboxed tool execution.
Identity & RBAC | Privilege escalation; impersonation | Agent acts on behalf of a specific user; tool calls inherit that user's permissions, not the agent's.
Output validation | Unsafe content reaches users or downstream systems | Content classifier on output; schema validation; PII detector; refusal on policy violation.
Rate limiting & quotas | Runaway loops; cost blowouts; abuse | Per-user, per-tool, per-cost-budget caps with hard kill at threshold.
Sandbox / blast radius | Bad action affects more than intended | Code execution in containers; DB writes in staging copies; dry-run mode for destructive ops.
Eval & red-teaming | Regressions and unknown failure modes | Pre-deploy benchmarks for safety, accuracy, refusal; ongoing adversarial probing.
Kill switch | Inability to stop a misbehaving agent fleet | Centralized feature flag that disables all agent actions instantly; documented runbook.
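
Sample tool-permission gate

A minimal sketch of the first two controls above, deny-by-default tool permissions plus a human approval gate for high-stakes tools. The roles, tool names, and helper function are illustrative, not any real framework's API.

# Illustrative deny-by-default tool permissions with a human approval gate.
# Role allowlists and the high-stakes set are examples, not a real framework's API.
ROLE_ALLOWLIST = {
    "triage_agent": {"read_email"},                       # never send_email for a triage agent
    "finance_agent": {"read_ledger", "wire_transfer"},
}
HIGH_STAKES_TOOLS = {"wire_transfer", "delete_resource", "publish_to_prod"}

class ToolDenied(Exception):
    pass

def authorize_tool_call(agent_role: str, tool: str, human_approved: bool = False) -> None:
    """Raise unless the tool is explicitly allowed and, if high-stakes, human-approved."""
    allowed = ROLE_ALLOWLIST.get(agent_role, set())       # unknown role -> empty set -> deny
    if tool not in allowed:
        raise ToolDenied(f"{agent_role} is not permitted to call {tool}")
    if tool in HIGH_STAKES_TOOLS and not human_approved:
        raise ToolDenied(f"{tool} requires human click-through before execution")

authorize_tool_call("triage_agent", "read_email")                             # allowed
authorize_tool_call("finance_agent", "wire_transfer", human_approved=True)    # allowed after review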

08 Reference architecture

A typical enterprise AI governance architecture for an agentic system. Each layer enforces a different control surface — bypass any one and you lose accountability.

Identity: User
Reasoning: AI Agent
Authorization: Policy Engine
Content / PII: Safety Layer
RBAC: Tool Access Control
Inference: LLM
Observability: Audit Logs & Monitoring

Layer responsibilities

Layer | Responsibility | Failure if absent
Identity | Authenticate the requesting user; bind the session to a verified principal. | Anonymous abuse; impersonation.
AI Agent | Plan, choose tools, draft responses — but only propose actions, never execute directly. | Runaway autonomy.
Policy Engine | Evaluate proposed actions against rules: who, what, when, where, how much. | Rule violations land in production.
Safety Layer | Filter harmful content, redact PII, block jailbreak patterns, enforce brand-safe output. | Toxic / leaking outputs reach users.
Tool Access Control | Enforce per-tool RBAC; mediate every external API call; redact secrets from logs. | Privilege escalation.
LLM | Generate text / structured output. Stateless from a governance standpoint.
Audit & Monitoring | Tamper-evident log of every prompt, decision, tool call, output, and user. | No way to investigate or prove compliance.
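
Sample tamper-evident audit log

The tamper-evident log in the last row is often implemented as a hash chain over append-only records. A minimal sketch with illustrative field names; a real deployment would also ship records to immutable storage.

# Minimal hash-chained, append-only audit log: each record commits to the previous
# record's hash, so any later edit or deletion breaks the chain and is detectable.
import hashlib, json, time

class AuditLog:
    def __init__(self):
        self._records = []
        self._last_hash = "0" * 64                    # genesis value

    def append(self, user: str, prompt: str, tool: str, args: dict, result: str) -> dict:
        record = {
            "ts": time.time(), "user": user, "prompt": prompt,
            "tool": tool, "args": args, "result": result,
            "prev_hash": self._last_hash,
        }
        payload = json.dumps(record, sort_keys=True).encode()
        record["hash"] = hashlib.sha256(payload).hexdigest()
        self._last_hash = record["hash"]
        self._records.append(record)
        return record

    def verify(self) -> bool:
        """Recompute the chain; False means the log was tampered with."""
        prev = "0" * 64
        for rec in self._records:
            body = {k: v for k, v in rec.items() if k != "hash"}
            if body["prev_hash"] != prev:
                return False
            if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != rec["hash"]:
                return False
            prev = rec["hash"]
        return True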

Sample policy rule

# Deny destructive DB operations in production
package agent.tools

import rego.v1

default allow := false

# Non-production: allow db_query as long as the SQL contains no DELETE or DROP
allow if {
  input.tool == "db_query"
  not contains(upper(input.args.sql), "DELETE")
  not contains(upper(input.args.sql), "DROP")
  input.env != "production"
}

# Production: read-only queries only, and only for analyst or engineer roles
allow if {
  input.tool == "db_query"
  input.env == "production"
  startswith(trim_space(upper(input.args.sql)), "SELECT")  # read-only in prod
  input.user.role in {"analyst", "engineer"}
}
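
At runtime, a rule like this is typically enforced by querying OPA on every proposed tool call. The sketch below assumes OPA is running locally with the policy loaded (for example via "opa run --server policy.rego") and uses its REST data API; it requires the requests package, and all values are illustrative.

# Sketch: the agent runtime asks OPA whether a proposed tool call is allowed.
# Assumes OPA serves the policy above at its default local port (8181).
import requests

def is_allowed(tool: str, sql: str, env: str, role: str) -> bool:
    payload = {"input": {"tool": tool, "args": {"sql": sql}, "env": env, "user": {"role": role}}}
    resp = requests.post("http://localhost:8181/v1/data/agent/tools/allow", json=payload, timeout=5)
    resp.raise_for_status()
    return resp.json().get("result", False)   # an undefined decision means deny

print(is_allowed("db_query", "DELETE FROM users", "production", "engineer"))   # False
print(is_allowed("db_query", "SELECT * FROM orders", "production", "analyst")) # True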

09 Modern hot topics

Where the field is moving. Each of these is a live area of standards, tooling, and research investment.

Agent governance

Standards bodies (NIST, ISO) are drafting agent-specific risk profiles. Expect new requirements around tool authorization, autonomy levels, and inter-agent communication audit trails.

Autonomous AI control

Capability thresholds (Anthropic's ASL, OpenAI's preparedness framework) trigger mandatory mitigations as models grow more capable — security, deployment gates, deprecation playbooks.

AI observability

LLM-native tracing (OpenTelemetry GenAI semconv, Langfuse, Arize) — span every prompt, tool call, retrieval, evaluation. Without traces you cannot debug or audit.
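
A sketch of what tracing one agent step can look like with the OpenTelemetry Python SDK. The span name and attribute keys below are simplified illustrations, not the exact GenAI semantic-convention names.

# Illustrative tracing of one agent step (pip install opentelemetry-sdk).
# Attribute keys are simplified examples, not the exact GenAI semconv names.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("agent-runtime")

with tracer.start_as_current_span("agent.tool_call") as span:
    span.set_attribute("agent.tool", "db_query")
    span.set_attribute("agent.user", "analyst-42")
    span.set_attribute("llm.model", "example-model")   # illustrative attribute key
    # ... call the tool here, then record the outcome ...
    span.set_attribute("agent.result.status", "ok")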

Hallucination detection

Self-consistency, retrieval grounding scores, NLI-based fact checking, classifier guards. Detection at runtime, not just eval time.
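
A deliberately crude illustration of a runtime grounding check: score each output sentence by word overlap with the retrieved context and flag low-overlap sentences. This toy heuristic only shows where such a check sits; production systems use the NLI-based or classifier approaches named above.

# Crude runtime grounding check: flag output sentences with little lexical overlap
# against the retrieved context. A toy heuristic, not a production detector.
import re

def grounding_score(sentence: str, context: str) -> float:
    sent_words = set(re.findall(r"\w+", sentence.lower()))
    ctx_words = set(re.findall(r"\w+", context.lower()))
    return len(sent_words & ctx_words) / len(sent_words) if sent_words else 1.0

def flag_ungrounded(answer: str, context: str, threshold: float = 0.5) -> list[str]:
    sentences = [s.strip() for s in re.split(r"[.!?]", answer) if s.strip()]
    return [s for s in sentences if grounding_score(s, context) < threshold]

context = "The invoice was paid on 3 March and the account balance is zero."
answer = "The invoice was paid on 3 March. The customer also ordered ten new licenses."
print(flag_ungrounded(answer, context))   # flags the second, unsupported sentence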

AI auditability

Immutable, tamper-evident logs of every model decision — prompt, version, weights hash, retrieval context, output, downstream action.

Model lineage

Track the full provenance of a model: training data, fine-tunes, evaluations, deployment versions. Critical for incident response and regulator queries.

Synthetic data governance

Synthetic training data introduces privacy, bias, and copyright questions of its own. Governance must cover the generator and the generated data.

MCP / tool security

Model Context Protocol and similar tool-calling standards need authentication, capability scoping, and audit hooks — treat tool servers as trust boundaries.

Memory poisoning defense

Adversarial inputs that corrupt persistent agent memory. Mitigations: signed memory entries, anomaly detection, periodic memory audits, segregated namespaces.
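
A minimal sketch of the signed-memory-entries mitigation: entries are HMAC-signed on write and verified on read, so a poisoned or edited entry is rejected. Key management, per-tenant namespacing, and TTLs are deliberately omitted, and all names are illustrative.

# Signed memory entries: sign on write, verify on read, reject anything tampered with.
# A sketch only; real deployments need key rotation, per-tenant namespaces, and TTLs.
import hashlib, hmac, json

SIGNING_KEY = b"replace-with-a-managed-secret"   # assumption: fetched from a secrets manager

def write_memory(store: dict, namespace: str, key: str, value: str) -> None:
    entry = {"value": value}
    payload = json.dumps(entry, sort_keys=True).encode()
    entry["sig"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    store.setdefault(namespace, {})[key] = entry

def read_memory(store: dict, namespace: str, key: str) -> str:
    entry = dict(store.get(namespace, {}).get(key, {}))
    sig = entry.pop("sig", "")
    payload = json.dumps(entry, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError(f"memory entry {namespace}/{key} failed signature check")
    return entry["value"]

store: dict = {}
write_memory(store, "user-42", "preferred_language", "en")
print(read_memory(store, "user-42", "preferred_language"))   # "en"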

Compliance automation

Continuous controls monitoring — evidence collection, drift detection, automated audit reports — moving compliance from annual exercise to real-time signal.

10 Implementation checklist

A pragmatic starting point. Don't try to ship all of it on day one — sequence by risk.

Foundations (week 1–4)

Identity propagation for every request, a deny-by-default per-tool allowlist, an append-only audit log, and a centralized kill switch.

Hardening (month 2–3)

A policy engine evaluated on every tool call, human approval gates for irreversible actions, prompt-injection defenses, output validation, and rate limits with cost budgets.

Maturity (month 4+)

Memory governance, sandboxed execution for destructive operations, pre-deploy and ongoing red-team evals, model lineage tracking, and compliance automation.

Smallest possible MVP that's still credible:
identity propagation + per-tool allowlist + audit log + kill switch. Everything else is layered on top.
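
Of those four, the kill switch is the smallest to sketch: one centralized flag that every agent action checks before producing a side effect. The in-memory flag below stands in for whatever feature-flag service or config store you already run; all names are illustrative.

# Centralized kill switch: every agent action checks one flag before executing.
# The in-memory flag is a stand-in for a real feature-flag service or config store.
class KillSwitch:
    def __init__(self):
        self._disabled = False

    def disable_all_agents(self, reason: str) -> None:
        self._disabled = True
        print(f"KILL SWITCH ENGAGED: {reason}")   # in practice: page on-call, write audit log

    def check(self) -> None:
        if self._disabled:
            raise RuntimeError("agent actions are disabled by the kill switch")

kill_switch = KillSwitch()

def execute_tool(tool: str, args: dict) -> None:
    kill_switch.check()                           # hard gate before any side effect
    print(f"executing {tool} with {args}")

execute_tool("read_email", {"folder": "inbox"})
kill_switch.disable_all_agents("incident: runaway loop detected")
# execute_tool("send_email", {...})               # would now raise RuntimeError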

11 Glossary

Quick definitions for terms that recur across NIST, the EU AI Act, and RAI documentation.

Term | Definition
AI system | A machine-based system that, for explicit or implicit objectives, infers from input how to generate outputs (predictions, content, recommendations, decisions).
Conformity assessment | Process to verify that a high-risk AI system meets EU AI Act requirements before being placed on the market.
Disparate impact | When a seemingly neutral practice produces unequal outcomes across protected groups.
Foundation model / GPAI | Large pre-trained model adaptable to many downstream tasks. EU AI Act calls these General-Purpose AI.
Human-in-the-loop (HITL) | Architecture where a human reviews or approves AI outputs before they take effect.
Impact assessment | Structured analysis of potential harms a system may cause to individuals, groups, or society.
Model card | Standard documentation of a model's intended use, performance, limitations, and ethical considerations.
Prompt injection | Attack where adversarial content in inputs causes a model to ignore prior instructions or leak data.
Red-teaming | Adversarial testing of an AI system to find safety, security, and policy failures before attackers do.
System card | Documentation of a deployed system (model + scaffolding + policies), broader than a model card.
Tamper-evident log | An audit log designed so any modification or deletion is detectable (e.g. hash chains, append-only storage).

12 One-line summary

AI governance frameworks help ensure AI systems are safe, compliant, ethical, transparent, and controllable.
NIST gives you the vocabulary, the EU AI Act gives you the obligations, Responsible AI gives you the principles — and for agentic systems, governance must be enforced in code, not just on paper.