Evaluation & Safety

Generative AI Governance

Enterprise AI governance frameworks — policy creation, usage guidelines, risk assessment, compliance tracking, and responsible AI frameworks

Published: 2026-04-22 · Last updated: 2026-04-13

Generative AI Governance

As generative AI moves from experimental to production-critical, organizations need structured governance frameworks to ensure responsible, compliant, and effective use. This guide provides practical frameworks for creating AI governance programs that balance innovation with risk management.

Why Governance Matters

Risk Category	Example	Business Impact
Regulatory	EU AI Act non-compliance	Fines up to 7% of global revenue
Data Privacy	PII leakage in prompts	GDPR fines, reputational damage
IP Risk	Training on copyrighted material	Litigation, injunctions
Security	Prompt injection attacks	Data breaches, unauthorized access
Reputational	Biased or harmful outputs	Brand damage, customer loss
Operational	Unvetted model causing errors	Financial loss, incorrect decisions

Governance Framework Structure

A comprehensive AI governance program has five layers:

┌─────────────────────────────────────┐
│     Layer 1: Policy & Principles    │  What we believe and commit to
├─────────────────────────────────────┤
│     Layer 2: Usage Guidelines       │  What employees can and cannot do
├─────────────────────────────────────┤
│     Layer 3: Risk Classification    │  How we categorize AI use cases
├─────────────────────────────────────┤
│     Layer 4: Technical Controls     │  What systems enforce the rules
├─────────────────────────────────────┤
│     Layer 5: Monitoring & Audit     │  How we verify compliance
└─────────────────────────────────────┘

Layer 1: Policy & Principles

Core AI Principles

# AI Usage Principles

## 1. Human Oversight
All AI-generated outputs that impact users, customers, or business decisions
must have meaningful human review before deployment.

## 2. Transparency
We disclose when users are interacting with AI systems. We do not present
AI-generated content as human-created without disclosure.

## 3. Privacy Protection
No personally identifiable information (PII), confidential business data,
or sensitive employee data may be sent to external AI providers without
explicit approval from the Data Protection Officer.

## 4. Fairness
AI systems must be tested for demographic bias before deployment and
monitored for bias drift in production.

## 5. Accountability
Each AI system has a named owner who is responsible for its behavior,
outputs, and compliance with these principles.

## 6. Security
AI systems must undergo security review including prompt injection testing
before production deployment.

## 7. Auditability
All AI system decisions, prompts, and outputs must be logged for
audit and retrospective analysis.

Layer 2: Usage Guidelines

Approved Use Cases Matrix

Use Case	Approval Level	Data Classification	Review Required
Internal drafting (emails, docs)	Self-service	Public only	None
Code generation assistance	Team lead	Public + Internal	Quarterly review
Customer-facing content	Manager + Legal	Public only	Per-use review
Data analysis & summarization	Manager	Public + Internal	Monthly audit
Decision support (hiring, lending)	Director + Compliance	None allowed	Per-deployment audit
Automated decision-making	VP + Legal + Board	None allowed	Continuous monitoring
Creative content generation	Marketing lead	Public only	Human review required
Code review & security analysis	Security team	Internal only	None (internal tool)

Prompt Data Classification

from enum import Enum
from dataclasses import dataclass

class DataClassification(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    RESTRICTED = "restricted"


@dataclass
class PromptComplianceCheck:
    allowed_classifications: list[DataClassification]
    blocked_patterns: list[str]
    requires_approval: bool

    def check(self, prompt_text: str, data_class: DataClassification) -> dict:
        result = {
            "allowed": True,
            "violations": [],
            "requires_approval": self.requires_approval,
        }

        # Check data classification
        if data_class not in self.allowed_classifications:
            result["allowed"] = False
            result["violations"].append(
                f"Data classification '{data_class.value}' not allowed for this AI tool"
            )

        # Check for blocked patterns
        import re
        for pattern in self.blocked_patterns:
            if re.search(pattern, prompt_text, re.IGNORECASE):
                result["allowed"] = False
                result["violations"].append(f"Blocked pattern detected: {pattern}")

        return result


# Default policy for employee self-service AI tools
SELF_SERVICE_POLICY = PromptComplianceCheck(
    allowed_classifications=[DataClassification.PUBLIC],
    blocked_patterns=[
        r"password|secret|api[_-]?key|token",
        r"social[_\s]?security|ssn",
        r"credit[_\s]?card",
        r"salary|compensation|payroll",
        r"patient|medical.*record",
    ],
    requires_approval=False,
)

Layer 3: Risk Classification

AI Risk Tiers

Tier	Risk Level	Description	Examples	Requirements
Tier 1	Minimal	Informational only, no impact	Internal summarization, brainstorming	Self-service, basic logging
Tier 2	Limited	Supports but doesn't drive decisions	Draft generation, code suggestions	Manager approval, bias testing
Tier 3	High	Influences decisions with user impact	Customer chatbots, content publishing	Legal review, red teaming
Tier 4	Critical	Automated decisions affecting people	Hiring screening, credit decisions	Board approval, continuous monitoring

Risk Assessment Questionnaire

RISK_ASSESSMENT_QUESTIONS = [
    {
        "id": "Q1",
        "question": "Does the AI system interact directly with end users?",
        "weights": {"yes": 2, "no": 0},
    },
    {
        "id": "Q2",
        "question": "Can the AI system's outputs cause financial harm if incorrect?",
        "weights": {"yes": 3, "no": 0},
    },
    {
        "id": "Q3",
        "question": "Does the system process personal or sensitive data?",
        "weights": {"yes": 3, "no": 0},
    },
    {
        "id": "Q4",
        "question": "Are the AI's outputs used for decisions about individuals?",
        "weights": {"yes": 4, "no": 0},
    },
    {
        "id": "Q5",
        "question": "Is there a human in the loop reviewing outputs before action?",
        "weights": {"yes": 0, "no": 3},
    },
    {
        "id": "Q6",
        "question": "Can the system be influenced by adversarial inputs?",
        "weights": {"yes": 2, "no": 0},
    },
    {
        "id": "Q7",
        "question": "Is the system used in a regulated domain (healthcare, finance, etc.)?",
        "weights": {"yes": 3, "no": 0},
    },
]

def calculate_risk_tier(answers: dict[str, str]) -> tuple[str, int]:
    """Calculate the risk tier based on assessment answers."""
    total_score = 0
    for question in RISK_ASSESSMENT_QUESTIONS:
        answer = answers.get(question["id"], "no")
        total_score += question["weights"].get(answer, 0)

    if total_score <= 2:
        return "Tier 1 (Minimal)", total_score
    elif total_score <= 5:
        return "Tier 2 (Limited)", total_score
    elif total_score <= 9:
        return "Tier 3 (High)", total_score
    else:
        return "Tier 4 (Critical)", total_score

Layer 4: Technical Controls

Prompt Filtering and Guardrails

class PromptGuardrail:
    """Technical enforcement of governance policies."""

    def __init__(self):
        self.pii_detector = PIIDetector()
        self.toxicity_detector = ToxicityDetector()
        self.injection_detector = InjectionDetector()

    async def check_prompt(self, prompt: str, user_id: str) -> dict:
        """Run all guardrails on a prompt before it reaches the LLM."""
        checks = {
            "pii": await self._check_pii(prompt),
            "toxicity": await self._check_toxicity(prompt),
            "injection": await self._check_injection(prompt),
            "rate_limit": await self._check_rate_limit(user_id),
            "allowed_model": await self._check_model_access(user_id),
        }

        violations = [k for k, v in checks.items() if not v["passed"]]

        return {
            "allowed": len(violations) == 0,
            "checks": checks,
            "violations": violations,
            "user_id": user_id,
            "timestamp": datetime.utcnow().isoformat(),
        }

    async def _check_pii(self, prompt: str) -> dict:
        detected = self.pii_detector.find(prompt)
        return {
            "passed": len(detected) == 0,
            "detected": detected,
            "action": "block_and_redact" if detected else "allow",
        }

    async def _check_injection(self, prompt: str) -> dict:
        risk_score = self.injection_detector.score(prompt)
        return {
            "passed": risk_score < 0.7,
            "risk_score": risk_score,
            "action": "block" if risk_score >= 0.7 else "allow",
        }

Output Validation

class OutputGuardrail:
    """Validate LLM outputs before delivering them to users."""

    def __init__(self):
        self.hallucination_detector = HallucinationDetector()
        self.fact_checker = FactChecker(knowledge_base="company_docs")
        self.compliance_checker = ComplianceChecker()

    async def validate_output(
        self, prompt: str, output: str, context: dict = None
    ) -> dict:
        validation = {
            "hallucination_risk": await self._check_hallucination(prompt, output),
            "factual_accuracy": await self._check_facts(output),
            "compliance": await self._check_compliance(output),
            "safety": await self._check_safety(output),
        }

        risk_level = self._compute_risk_level(validation)
        validation["risk_level"] = risk_level
        validation["action"] = self._determine_action(risk_level)

        return validation

    def _compute_risk_level(self, validation: dict) -> str:
        scores = {
            "hallucination_risk": validation["hallucination_risk"]["score"],
            "factual_accuracy": 1 - validation["factual_accuracy"]["accuracy_score"],
            "compliance": 1 - validation["compliance"]["compliance_score"],
            "safety": 1 - validation["safety"]["safety_score"],
        }

        max_score = max(scores.values())
        avg_score = sum(scores.values()) / len(scores)

        if max_score > 0.8:
            return "critical"
        elif max_score > 0.6 or avg_score > 0.4:
            return "high"
        elif max_score > 0.4 or avg_score > 0.25:
            return "medium"
        else:
            return "low"

    def _determine_action(self, risk_level: str) -> str:
        actions = {
            "critical": "block_and_escalate",
            "high": "require_human_review",
            "medium": "flag_for_review",
            "low": "allow",
        }
        return actions[risk_level]

Layer 5: Monitoring & Audit

Compliance Dashboard

class GovernanceDashboard:
    """Track governance compliance across all AI systems."""

    def __init__(self, audit_log: AuditLog):
        self.audit_log = audit_log

    async def generate_report(self, period: str = "last_30_days") -> dict:
        logs = await self.audit_log.query(period)

        return {
            "summary": {
                "total_ai_requests": len(logs),
                "total_violations": len([l for l in logs if l.violation]),
                "violation_rate": len([l for l in logs if l.violation]) / len(logs),
                "blocked_prompts": len([l for l in logs if l.action == "blocked"]),
                "human_reviews_triggered": len([l for l in logs if l.action == "human_review"]),
            },
            "by_system": self._group_by_system(logs),
            "by_risk_tier": self._group_by_risk_tier(logs),
            "top_violations": self._top_violations(logs),
            "trend": self._compliance_trend(logs),
            "recommendations": self._generate_recommendations(logs),
        }

    def _top_violations(self, logs: list) -> list[dict]:
        from collections import Counter
        violation_types = Counter(l.violation_type for l in logs if l.violation)
        return [
            {"type": vtype, "count": count, "percentage": count / len(logs) * 100}
            for vtype, count in violation_types.most_common(10)
        ]

    def _generate_recommendations(self, logs: list) -> list[str]:
        recommendations = []

        # High violation rate
        violation_rate = sum(1 for l in logs if l.violation) / len(logs)
        if violation_rate > 0.05:
            recommendations.append(
                "Violation rate exceeds 5%. Review training materials and access controls."
            )

        # Repeated offenders
        user_violations = Counter(l.user_id for l in logs if l.violation)
        repeat_offenders = [u for u, c in user_violations.items() if c > 5]
        if repeat_offenders:
            recommendations.append(
                f"{len(repeat_offenders)} users have 5+ violations. Schedule targeted training."
            )

        # System-specific issues
        system_violations = Counter(l.system_name for l in logs if l.violation)
        for system, count in system_violations.items():
            if count > 100:
                recommendations.append(
                    f"System '{system}' has {count} violations. Conduct a full audit."
                )

        return recommendations

Audit Trail

Every AI interaction should be logged for compliance:

@dataclass
class AuditEntry:
    timestamp: str
    user_id: str
    system_name: str
    risk_tier: str
    prompt_hash: str  # Hash for privacy, raw text stored separately with access controls
    model_used: str
    output_hash: str
    tokens_used: int
    cost_usd: float
    guardrail_results: dict
    action_taken: str  # allowed, blocked, flagged, human_review
    human_reviewer: str | None
    human_review_decision: str | None
    compliance_status: str  # compliant, violation, pending_review


class AuditLog:
    def __init__(self, database):
        self.db = database

    async def log(self, entry: AuditEntry):
        await self.db.insert("ai_audit_log", entry)

    async def query(self, period: str, filters: dict = None) -> list[AuditEntry]:
        query = f"SELECT * FROM ai_audit_log WHERE timestamp >= {period}"
        if filters:
            for key, value in filters.items():
                query += f" AND {key} = '{value}'"
        return await self.db.execute(query)

Regulatory Compliance Mapping

Regulation	Key Requirement	How to Comply
EU AI Act	Risk classification, transparency, human oversight	Implement risk tiers, disclose AI use, maintain human review
GDPR	Data minimization, right to explanation	Log data flows, provide output explanations, enable data deletion
SOC 2	Access controls, monitoring, audit trails	Implement guardrails, maintain audit logs, regular reviews
HIPAA (healthcare)	PHI protection	Block PHI in prompts, use HIPAA-compliant providers, BAAs
CCPA (California)	Consumer rights, transparency	Disclose AI use in consumer interactions, honor opt-out
NYC Bias Law	Bias auditing for employment AI	Regular bias testing, impact assessments for hiring tools

Building a Governance Program

Phase 1: Foundation (Weeks 1-4)

Appoint an AI Governance Lead (or committee)
Draft core AI principles and usage policy
Inventory all existing AI tool usage
Identify highest-risk use cases

Phase 2: Controls (Weeks 5-12)

Deploy prompt/output guardrails
Implement audit logging
Create risk assessment process
Launch employee AI training

Phase 3: Maturation (Weeks 13-24)

Establish continuous monitoring
Build compliance dashboards
Conduct first AI risk audit
Create incident response procedures

Phase 4: Optimization (Ongoing)

Regular policy reviews and updates
Automated compliance testing
Industry benchmarking
Regulatory change monitoring

Cross-References

For security-specific guidance, see LLM Security Best Practices
For prompt injection testing, see Prompt Security Testing
For evaluating model bias, see LLM Bias Mitigation
For adversarial testing, see AI Safety & Red Teaming
For monitoring production systems, see LLM Observability & Monitoring

Related docs

LLM Security Best Practices

Securing LLM applications — API key management, prompt injection defense, data privacy, supply chain security, and compliance frameworks

Data Platform Reviewer Agent Implementation Guide

Architecture, workflow design, metrics, and rollout guidance for a data platform reviewer agent in production.

Developer Productivity Reviewer Agent Implementation Guide

Architecture, workflow design, metrics, and rollout guidance for a developer productivity reviewer agent in production.

Related agents

Data Platform Reviewer Agent

Data Platform agent blueprint focused on inspect drafts, tool outputs, or decisions for gaps, policy issues, and missing evidence before work moves forward for analysts and engineers need better query generation, pipeline debugging, and dataset explanation across changing schemas.

Developer Productivity Reviewer Agent

Developer Productivity agent blueprint focused on inspect drafts, tool outputs, or decisions for gaps, policy issues, and missing evidence before work moves forward for engineering teams want reliable help with issue triage, runbook guidance, and change review without obscuring system ownership.

Finance Operations Reviewer Agent

Finance Operations agent blueprint focused on inspect drafts, tool outputs, or decisions for gaps, policy issues, and missing evidence before work moves forward for finance teams need faster reconciliation, exception review, and policy-aware reporting for recurring operational workflows.