AIPolicy Policy Handbook
Document Identifier: AIPOLICY-POLICY-HANDBOOK
Status: Non-normative
Version: 2.0.0-draft.1
Date: 2026-02-07
Editor: Guido Mitschke
About This Document
This is a non-normative companion to the AIPolicy Registry (registry/principles-v2.md). It provides background, rationale, and practical guidance for each of the 16 policies defined in Registry v1.1. The authoritative definitions -- including machine keys, descriptions, typical scopes, and consumer guidance -- remain in the registry itself. This document helps publishers make informed endorsement decisions by explaining what each policy means in practice, what it does not mean, and where legitimate gray areas exist.
How to Read This Handbook
Each policy entry follows a consistent structure:
- Background -- The real-world context and problem the policy addresses. Why does this policy exist?
- Intent -- The governance goal the policy pursues. What outcome does endorsement aim for?
- What Endorsement Means -- Concrete signals a publisher sends by endorsing this policy. These are indicative, not exhaustive.
- What Endorsement Does NOT Mean -- Common misinterpretations and overreadings. Endorsement is not a blanket commitment to any specific operational practice.
- Practical Examples -- Scenarios illustrating endorsable and non-endorsable behavior. These are illustrative, not normative.
- Related Policies -- Cross-references to other policies that interact with or complement this one.
Endorsement of any policy is voluntary and self-assessed. There is no certification body, no external audit requirement, and no enforcement mechanism built into the AIPolicy standard. Publishers endorse policies because they believe the signal accurately reflects their governance posture -- not because endorsement confers a compliance status.
Category 1: Interdependence
Interdependence addresses the mutual dependency between human activity and AI systems. The policies in this category recognize that AI operates within human societies and should contribute to, rather than erode, the structures that sustain them -- particularly labor markets and cultural ecosystems. These policies signal a preference for AI deployments that augment human capabilities and preserve the plurality of human expression.
AP-1.1: Employment Protection
Background
The rapid deployment of AI systems across industries has raised persistent concerns about workforce displacement. Between 2023 and 2025, multiple sectors -- including content creation, customer support, translation, and software development -- experienced significant restructuring as organizations adopted generative AI and automation tools. While historical technological transitions have ultimately created new categories of employment, the speed of AI-driven displacement has outpaced the capacity of many workers and institutions to adapt. This policy exists because the question is not whether automation occurs, but whether it occurs with consideration for the humans affected.
Intent
The goal of AP-1.1 is to signal a preference for AI deployments that treat workforce impact as a design consideration, not an afterthought. The policy favors augmentation over wholesale replacement and encourages transition pathways where displacement is unavoidable. It does not seek to prevent automation but to ensure that efficiency gains are not pursued solely at the expense of affected workers.
What Endorsement Means
- Your organization considers workforce impact when evaluating AI deployments.
- Where AI displaces existing roles, you provide or support transition pathways such as retraining, role evolution, or redeployment.
- You favor human-AI collaboration models over full automation where the quality of outcomes permits it.
- You are willing to accept that augmentation may be slower or more costly than full automation in some cases.
What Endorsement Does NOT Mean
- It does not mean you cannot automate any tasks or processes.
- It does not mean every existing position must be preserved indefinitely.
- It does not require specific hiring quotas, labor agreements, or compensation schemes.
- It does not prohibit using AI to improve productivity, even where that reduces headcount over time.
Practical Examples
- Endorsed: A logistics company automates route optimization but retrains dispatchers for exception handling and customer liaison roles.
- Endorsed: A content platform deploys AI-assisted writing tools while maintaining human editorial oversight and crediting human contributors.
- Endorsed: A company phases in AI-driven data analysis over 18 months, offering affected analysts upskilling programs in data science.
- Not endorsable: A company replaces its entire customer support department with chatbots overnight, offers no transition support, and then endorses AP-1.1.
- Not endorsable: An organization publicly endorses AP-1.1 while internally treating all labor costs as optimization targets for AI replacement.
Related Policies: AP-4.2 (Societal Benefit), AP-5.3 (Autonomy Protection)
Testability Criteria
- Documentation exists describing how the AI system's deployment affects existing human roles
- Where roles are displaced, evidence of transition planning (retraining, redeployment, or notice periods) is present
- The ratio of augmented versus fully replaced roles is measurable and tracked over time
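One way the augmentation ratio in the last criterion could be tracked is with a simple tally over deployment records. A minimal, non-normative sketch in Python; the record fields and status categories are illustrative assumptions, not registry terms.

```python
from collections import Counter

# Hypothetical per-role deployment records; categories are illustrative:
# "augmented" (AI assists a human), "replaced" (role fully automated),
# "unaffected" (no AI involvement).
roles = [
    {"role": "dispatcher", "status": "augmented"},
    {"role": "routing_clerk", "status": "replaced"},
    {"role": "customer_liaison", "status": "augmented"},
    {"role": "warehouse_picker", "status": "unaffected"},
]

def augmentation_ratio(records):
    """Return the augmented-to-replaced ratio for a reporting period."""
    counts = Counter(r["status"] for r in records)
    replaced = counts.get("replaced", 0)
    augmented = counts.get("augmented", 0)
    if replaced == 0:
        return float("inf") if augmented else 0.0
    return augmented / replaced

print(f"Augmentation ratio this period: {augmentation_ratio(roles):.2f}")
```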
AP-1.2: Cultural Diversity
Background
AI systems trained on large-scale datasets tend to reflect the statistical center of their training data. In practice, this means outputs gravitate toward dominant languages, cultural norms, and aesthetic conventions -- typically those of English-speaking, Western markets. Translation systems flatten idiomatic expression. Content generators default to globally homogenized styles. Recommendation algorithms favor mainstream content over niche or regional material. Over time, these tendencies risk eroding the cultural diversity that AI systems draw upon, creating a feedback loop of homogenization.
Intent
AP-1.2 signals a commitment to preserving and promoting cultural diversity in AI-mediated contexts. The goal is not to prohibit global reach or standardization where appropriate, but to ensure that AI systems do not systematically erase regional, linguistic, or cultural variation. Publishers endorsing this policy recognize that cultural diversity is an asset, not an inefficiency.
What Endorsement Means
- Your AI systems are designed or configured to preserve regional, linguistic, and cultural variation in outputs.
- You consider the cultural context of your users when deploying content generation, translation, or recommendation systems.
- You actively avoid defaulting to a single cultural norm when your audience is diverse.
- You support multilingual content and local expression where your product scope permits.
What Endorsement Does NOT Mean
- It does not require supporting every language or cultural context simultaneously.
- It does not mean you cannot offer standardized products or interfaces.
- It does not require cultural expertise in every deployment market.
- It does not prohibit AI-generated content that follows global conventions where that serves the user.
Practical Examples
- Endorsed: A translation service preserves regional idioms and offers dialect-specific options rather than defaulting to a single "standard" variant of a language.
- Endorsed: A content recommendation system allocates a meaningful share of recommendations to local creators rather than exclusively promoting globally trending content.
- Not endorsable: A creative writing tool trained exclusively on English-language data is marketed globally with no consideration for cultural adaptation, and the publisher endorses AP-1.2.
- Not endorsable: A music recommendation algorithm systematically suppresses regional genres in favor of globally popular tracks.
Related Policies: AP-5.2 (Dignity Protection), AP-7.2 (Source Attribution)
Testability Criteria
- The system's training data and output distribution can be audited for cultural and linguistic representation
- Users from different cultural contexts receive contextually appropriate results (measurable via A/B testing or user surveys)
- The system does not systematically suppress or down-rank content from minority cultures or languages
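A representation audit of the kind described above can start as a simple distribution check over served outputs. A minimal sketch, assuming hypothetical recommendation logs tagged with a language code and an arbitrary representation floor for flagging under-served languages.

```python
from collections import Counter

# Hypothetical log of recommended items, each tagged with its language.
recommendations = ["en", "en", "de", "en", "pt-BR", "en", "hi", "en", "de", "en"]

def representation_report(langs, floor=0.05):
    """Share of recommendations per language, flagging languages that fall
    below an illustrative minimum-representation floor."""
    total = len(langs)
    shares = {lang: n / total for lang, n in Counter(langs).items()}
    flagged = {lang: share for lang, share in shares.items() if share < floor}
    return shares, flagged

shares, flagged = representation_report(recommendations, floor=0.15)
print("Shares:", shares)
print("Below 15% floor:", flagged)
```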
Category 2: Decision Authority
Decision Authority addresses the allocation of decision-making power between humans and AI systems. As AI systems increasingly produce recommendations and determinations in high-stakes domains -- healthcare, finance, criminal justice, employment -- the question of who holds final authority becomes a core governance concern. This category establishes a preference for AI systems that inform rather than decide, and that make their reasoning available for scrutiny.
AP-2.1: Human Final Decision
Background
AI systems are increasingly deployed in domains where decisions have significant consequences for individuals: loan approvals, medical diagnoses, parole recommendations, hiring decisions, and insurance assessments. In many of these domains, AI systems can process information faster and at greater scale than human decision-makers. However, speed and scale do not equate to legitimacy. Consequential decisions often involve contextual judgment, ethical considerations, and accountability structures that presuppose a human decision-maker. The delegation of final authority to an automated system raises questions about recourse, accountability, and the right of affected individuals to have their case considered by a human.
Intent
AP-2.1 signals that humans retain final authority over decisions with significant consequences. AI systems in these domains operate as advisory tools -- they may recommend, flag, score, or rank, but the final determination rests with a human who can be held accountable. The policy recognizes that the appropriate level of human involvement varies by domain and risk level.
What Endorsement Means
- In high-stakes decision domains, your AI systems present outputs as recommendations, not as autonomous determinations.
- Escalation pathways to human review exist and are accessible to affected individuals.
- Human decision-makers have the authority and the practical ability to override AI recommendations.
- You define which of your decision domains are "consequential" and ensure human oversight in those domains.
What Endorsement Does NOT Mean
- It does not mean every AI output requires human approval. Routine, low-stakes automation is unaffected.
- It does not prohibit AI systems from scoring, ranking, or filtering information for human reviewers.
- It does not require that humans review every individual case -- risk-based escalation models are compatible.
- It does not imply that human decisions are always superior to AI recommendations.
Practical Examples
- Endorsed: A bank uses AI to pre-screen loan applications but requires a human loan officer to approve or deny each application, with documented reasons.
- Endorsed: A healthcare system uses AI to flag anomalies in medical imaging, but a physician makes the diagnostic decision.
- Not endorsable: A hiring platform uses AI to automatically reject candidates based on algorithmic scoring with no human review of rejections.
- Not endorsable: A criminal justice system uses an AI risk assessment tool as the sole basis for sentencing recommendations, with judges routinely rubber-stamping the output.
Related Policies: AP-2.2 (Transparent Decision Chains), AP-5.3 (Autonomy Protection), AP-6.2 (Deactivatability)
Testability Criteria
- A documented escalation or override mechanism exists for consequential decisions
- Logs confirm that a human actor reviewed and approved (or overrode) the AI's recommendation before final action
- The system provides a clear interface for human decision-makers to accept, modify, or reject AI outputs
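A minimal sketch of how the review and override criteria above might look in code: the AI output is stored only as a recommendation, and no action can be executed until a named human records a final determination. All class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ConsequentialDecision:
    case_id: str
    ai_recommendation: str          # e.g. "deny_loan", a score, or a ranking
    ai_rationale: str
    reviewer: Optional[str] = None  # the human who holds final authority
    final_decision: Optional[str] = None
    reviewed_at: Optional[str] = None
    log: list = field(default_factory=list)

    def human_review(self, reviewer: str, final_decision: str, reason: str):
        """Record the human determination; only then may action be taken."""
        self.reviewer = reviewer
        self.final_decision = final_decision
        self.reviewed_at = datetime.now(timezone.utc).isoformat()
        self.log.append({"reviewer": reviewer, "decision": final_decision,
                         "reason": reason, "at": self.reviewed_at})

    def execute(self):
        if self.final_decision is None:
            raise PermissionError("No human review recorded; refusing to act.")
        return f"Executing '{self.final_decision}' for case {self.case_id}"

d = ConsequentialDecision("APP-1042", "deny_loan", "debt-to-income above threshold")
d.human_review("officer_m.lee", "approve_loan",
               "recent income change not reflected in model inputs")
print(d.execute())
```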
AP-2.2: Transparent Decision Chains
Background
The opacity of AI decision-making has emerged as one of the most persistent governance challenges. Neural networks, ensemble models, and other complex architectures often produce outputs that are difficult to trace back to specific inputs or reasoning steps. This opacity creates problems for accountability, auditability, and trust. When an individual is denied a loan, flagged by a content moderation system, or deprioritized by a recommendation algorithm, the inability to explain why undermines both the legitimacy of the decision and the affected person's ability to contest it. Regulatory frameworks such as the EU AI Act and various sector-specific rules increasingly require explainability for high-risk AI systems.
Intent
AP-2.2 signals a commitment to making AI decision processes explainable and traceable. The goal is not to require that every model be fully interpretable in a mathematical sense, but to ensure that stakeholders -- including affected individuals, auditors, and oversight bodies -- can obtain a meaningful explanation of how an AI system arrived at a given output.
What Endorsement Means
- Your AI systems provide human-readable explanations for their outputs, appropriate to the domain and audience.
- You maintain audit trails that record decision inputs, model versions, and outputs for consequential decisions.
- Affected individuals can request an explanation of how an AI-assisted decision was reached.
- You invest in explainability tooling proportionate to the risk level of your AI applications.
What Endorsement Does NOT Mean
- It does not require publishing proprietary model architectures or training data.
- It does not mean every output must be accompanied by a full technical explanation.
- It does not require that AI systems use only inherently interpretable models (e.g., decision trees).
- It does not mandate a specific explainability framework or standard.
Practical Examples
- Endorsed: An insurance company provides applicants with a summary of the factors that influenced their AI-assessed premium, using natural language.
- Endorsed: A content moderation platform logs the policy rules and confidence scores that led to content removal, making these available for internal review.
- Not endorsable: A credit scoring system produces a numerical score with no explanation of contributing factors and no avenue for the affected individual to understand the result.
- Not endorsable: An organization claims explainability but provides only boilerplate text unrelated to the specific decision in question.
Related Policies: AP-2.1 (Human Final Decision), AP-7.1 (Information Integrity)
Testability Criteria
- The system produces human-readable explanations for its outputs upon request
- An audit trail exists that records inputs, intermediate processing steps, and final outputs
- Third-party auditors can reconstruct the decision pathway from available logs and documentation
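A minimal sketch of an audit entry that would let a third party reconstruct a decision pathway as described above. The schema and the tamper-evidence digest are illustrative assumptions, not a prescribed format.

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(case_id, inputs, model_version, output, explanation):
    """Build an append-only audit entry for one consequential decision."""
    entry = {
        "case_id": case_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "inputs": inputs,            # features actually supplied to the model
        "output": output,            # raw model output
        "explanation": explanation,  # human-readable factor summary
    }
    # A content hash lets auditors detect after-the-fact tampering.
    entry["digest"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    return entry

record = audit_record(
    case_id="CLAIM-77",
    inputs={"age_of_roof_years": 22, "prior_claims": 1},
    model_version="premium-model-3.4.1",
    output={"premium_band": "C"},
    explanation="Premium band driven mainly by roof age; prior claims had minor effect.",
)
print(json.dumps(record, indent=2))
```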
Category 3: Power Distribution
Power Distribution addresses the structural risks of AI-driven concentration of economic, informational, or political power. AI systems can amplify existing power asymmetries through proprietary control of critical infrastructure, barriers to entry, and network effects that foreclose competition. The policies in this category signal a preference for AI ecosystems that remain open, interoperable, and accessible to a plurality of actors.
AP-3.1: Decentralization
Background
The development and deployment of AI systems is heavily concentrated among a small number of large technology companies. These organizations control foundational models, training infrastructure, large-scale datasets, and distribution channels. This concentration creates dependencies: businesses, governments, and individuals increasingly rely on a handful of providers for AI capabilities that are becoming essential infrastructure. The risk is not merely economic. Concentrated control over AI systems implies concentrated influence over information flows, economic opportunity, and the terms on which AI is made available. Historical precedent in telecommunications, energy, and internet platforms demonstrates that infrastructure concentration tends to persist and deepen without intentional countermeasures.
Intent
AP-3.1 signals a preference for AI ecosystems in which power -- economic, informational, and political -- is distributed rather than concentrated. The goal is not to prevent large organizations from operating but to encourage practices that preserve pluralism, interoperability, and the ability of diverse actors to participate in AI development and deployment.
What Endorsement Means
- You support interoperability, open APIs, and portable data formats where technically feasible.
- You avoid creating unnecessary dependencies or lock-in mechanisms in your AI products or services.
- You consider the systemic effects of your AI deployment decisions on the broader ecosystem.
- You favor open standards and shared infrastructure over proprietary alternatives where quality and security permit.
What Endorsement Does NOT Mean
- It does not prohibit proprietary AI products or services.
- It does not require open-sourcing your models or training data.
- It does not mean you cannot build competitive advantages through AI capabilities.
- It does not mandate a specific organizational structure or governance model.
Practical Examples
- Endorsed: An AI platform offers standard data export formats and documented APIs, allowing customers to migrate to alternative providers.
- Endorsed: A model provider publishes model cards and supports interoperability with third-party evaluation frameworks.
- Not endorsable: A cloud AI provider designs its APIs to be incompatible with competitors and charges prohibitive exit fees, then endorses AP-3.1.
- Not endorsable: A company acquires and discontinues open-source AI tools to eliminate alternatives to its proprietary stack.
Related Policies: AP-3.2 (Anti-Monopoly), AP-4.2 (Societal Benefit)
Testability Criteria
- The number of independent entities with meaningful access to the AI system or its outputs is measurable
- No single entity controls more than a defined threshold of market share, data access, or decision-making authority within the system's domain
- Interoperability with competing or alternative systems is technically supported
AP-3.2: Anti-Monopoly
Background
The AI industry exhibits strong tendencies toward monopolistic concentration. Training large models requires compute resources available to few organizations. Access to high-quality data is unevenly distributed. Network effects in AI platforms -- where more users generate more data, which improves models, which attracts more users -- create self-reinforcing market positions. The risk of monopoly in AI is distinct from traditional market monopoly because AI systems increasingly mediate access to information, economic opportunity, and public discourse. A monopoly on AI infrastructure is, in effect, a monopoly on cognitive infrastructure.
Intent
AP-3.2 signals a commitment to maintaining competitive and accessible AI markets. The policy is not anti-business or anti-scale; it recognizes that large-scale AI development serves important functions. Rather, it signals that the endorsing organization does not seek to foreclose competition or establish unchallenged dominance over critical AI capabilities.
What Endorsement Means
- You avoid vendor lock-in mechanisms that prevent customers from switching providers.
- You support standard data export formats and avoid proprietary formats designed to create switching costs.
- You do not engage in predatory practices aimed at eliminating AI competitors.
- You consider whether your market behavior contributes to healthy competition or to concentration.
What Endorsement Does NOT Mean
- It does not prohibit market leadership or large-scale operations.
- It does not require sharing proprietary technology with competitors.
- It does not mean you cannot compete aggressively on quality, features, or price.
- It does not impose obligations beyond the endorsing organization's own behavior.
Practical Examples
- Endorsed: A compute provider offers fair-access pricing tiers that allow smaller AI developers to train models without prohibitive costs.
- Endorsed: An AI company supports industry standardization efforts for model interoperability.
- Not endorsable: A company uses below-cost pricing to drive competitors out of the market, then raises prices once alternatives have been eliminated.
- Not endorsable: An infrastructure provider imposes exclusive contracts that prevent customers from using competing AI services.
Related Policies: AP-3.1 (Decentralization), AP-4.1 (Democratic Process Support)
Testability Criteria
- Users can export their data and models in standard formats without degradation
- No vendor lock-in mechanisms prevent migration to alternative providers
- The competitive landscape of the relevant AI domain can be assessed through public market data
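The export criteria above are partly verifiable from the product itself: can a customer pull their data out in a documented, vendor-neutral format and re-import it without loss? A minimal sketch using JSON Lines as the stand-in format; the record layout is an assumption.

```python
import json

# Hypothetical customer records held by an AI service.
customer_data = [
    {"doc_id": "a1", "text": "support ticket #1", "labels": ["billing"]},
    {"doc_id": "a2", "text": "support ticket #2", "labels": ["refund", "priority"]},
]

def export_jsonl(records, path):
    """Write records as JSON Lines, a widely supported, vendor-neutral format."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")

def verify_roundtrip(records, path):
    """Confirm the export re-imports without loss (no degradation)."""
    with open(path, encoding="utf-8") as f:
        reimported = [json.loads(line) for line in f]
    return reimported == records

export_jsonl(customer_data, "export.jsonl")
print("Lossless export:", verify_roundtrip(customer_data, "export.jsonl"))
```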
Category 4: Democratic Accountability
Democratic Accountability addresses the relationship between AI systems and democratic governance. AI systems that mediate public discourse, influence electoral processes, or shape access to information have the potential to either strengthen or undermine democratic institutions. This category also encompasses the broader expectation that AI systems should consider the interests of affected communities, not solely those of their operators.
AP-4.1: Democratic Process Support
Background
AI systems interact with democratic processes in multiple ways: social media algorithms shape political discourse, AI-generated content can impersonate candidates or fabricate events, microtargeting tools enable precision-targeted political advertising, and automated accounts can simulate grassroots movements. The 2024 election cycles across multiple democracies demonstrated the potential for AI-generated deepfakes, synthetic media, and algorithmically amplified disinformation to distort public understanding. Beyond elections, AI systems affect democratic processes through their influence on public discourse, access to information, and the ability of citizens to form independent opinions.
Intent
AP-4.1 signals a commitment to deploying AI systems that support rather than undermine democratic processes. The policy does not prescribe specific interventions but signals that the endorsing organization recognizes its responsibility where its AI systems intersect with elections, civic participation, or public discourse. The policy applies both to deliberate misuse and to unintended effects of AI system design.
What Endorsement Means
- You label AI-generated content in political contexts and provide transparency about synthetic media.
- Your AI systems do not systematically amplify polarizing, inflammatory, or extremist content through algorithmic design choices.
- You implement safeguards against the use of your AI tools for electoral manipulation, voter suppression, or fabrication of political content.
- You consider the impact of your recommendation and content distribution algorithms on public discourse.
What Endorsement Does NOT Mean
- It does not mean your AI systems cannot be used in political contexts at all.
- It does not require censoring political speech or imposing editorial judgments on political content.
- It does not mandate specific content moderation policies.
- It does not prohibit AI-assisted political advertising; it expects only that such advertising be transparent.
Practical Examples
- Endorsed: A social media platform labels AI-generated political content and limits the algorithmic amplification of unverified claims during election periods.
- Endorsed: A generative AI service implements safeguards against producing realistic deepfakes of political figures.
- Not endorsable: A platform knowingly allows AI-generated deepfakes of candidates to circulate without labeling, and endorses AP-4.1.
- Not endorsable: A recommendation algorithm is tuned to maximize engagement through polarization, with no consideration for its effect on public discourse.
Related Policies: AP-7.1 (Information Integrity), AP-5.2 (Dignity Protection)
Testability Criteria
- AI-generated or AI-manipulated content related to democratic processes is labeled with provenance metadata
- The system does not systematically amplify or suppress political content based on engagement optimization alone
- Public audits of the system's impact on information diversity during electoral periods are feasible
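A minimal sketch of the labeling criterion above: AI-generated content is wrapped with provenance metadata before distribution so downstream surfaces can display a disclosure. The field names are illustrative assumptions and do not follow any particular provenance standard.

```python
import json
from datetime import datetime, timezone

def label_synthetic_content(content: str, generator: str, political: bool) -> dict:
    """Wrap a generated item with provenance metadata so downstream
    surfaces can display an 'AI-generated' disclosure."""
    return {
        "content": content,
        "provenance": {
            "ai_generated": True,
            "generator": generator,
            "generated_at": datetime.now(timezone.utc).isoformat(),
            # Political context triggers stricter downstream handling,
            # e.g. mandatory labeling and reduced algorithmic amplification.
            "political_context": political,
        },
    }

item = label_synthetic_content(
    "Candidate X announces a new housing plan.",
    generator="newsroom-assistant-2",
    political=True,
)
print(json.dumps(item, indent=2))
```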
AP-4.2: Societal Benefit
Background
AI systems are developed and deployed primarily by commercial organizations with obligations to shareholders. This creates a structural tension: the optimization targets of commercial AI (engagement, conversion, revenue) do not always align with societal welfare. An engagement-maximizing algorithm may promote addictive usage patterns. A cost-minimizing deployment may externalize harms to vulnerable populations. The growing reach of AI systems -- touching healthcare, education, transportation, housing, and public services -- means that the gap between commercial and societal objectives has tangible consequences for large populations.
Intent
AP-4.2 signals that AI systems should serve broad societal benefit, not solely the interests of their operators. The policy does not require altruism or the abandonment of commercial objectives; it signals that the endorsing organization considers community impact alongside business metrics. It encourages the documentation and measurement of societal outcomes alongside commercial performance indicators.
What Endorsement Means
- You document the community impact of your AI systems and consider affected populations in design decisions.
- You include societal benefit metrics alongside commercial objectives in your AI evaluation frameworks.
- You seek input from affected communities when deploying AI systems with broad population impact.
- You consider whether the distribution of benefits and harms from your AI deployment is equitable.
What Endorsement Does NOT Mean
- It does not require that AI systems be non-commercial or publicly funded.
- It does not mean commercial objectives are illegitimate.
- It does not mandate specific impact measurement frameworks or reporting standards.
- It does not require that every AI system deliver measurable societal benefit directly.
Practical Examples
- Endorsed: A healthcare AI company provides discounted access to diagnostic tools for underserved communities and publishes impact assessments.
- Endorsed: An education technology platform uses AI to adapt learning materials to individual needs, measuring both student outcomes and commercial metrics.
- Not endorsable: A company deploys an AI system that disproportionately harms a vulnerable population, documents the harm internally, takes no corrective action, and endorses AP-4.2.
- Not endorsable: A public-sector AI deployment optimizes exclusively for administrative cost reduction with no assessment of service quality impact on citizens.
Related Policies: AP-1.1 (Employment Protection), AP-5.2 (Dignity Protection), AP-4.1 (Democratic Process Support)
Testability Criteria
- A documented impact assessment identifies the communities affected by the AI system and how their interests are considered
- The system's objectives include at least one measurable societal benefit metric beyond operator revenue
- Mechanisms exist for affected communities to provide input on the system's design or operation
Category 5: Individual Protection
Individual Protection addresses the direct impact of AI systems on human beings across three dimensions: physical safety, psychological and social integrity, and freedom of choice. These policies apply wherever AI systems interact with, affect, or make determinations about individual humans. They signal a preference for AI systems that incorporate safety mechanisms, avoid discriminatory or demeaning behavior, and refrain from manipulating individual decision-making.
AP-5.1: Life Protection
Background
AI systems are increasingly deployed in domains where failures can result in physical harm or death: autonomous vehicles, medical devices, industrial robotics, infrastructure management, and military applications. The 2018 Uber self-driving car fatality, incidents involving automated industrial equipment, and concerns about autonomous weapons systems illustrate that AI failures in safety-critical domains carry consequences fundamentally different from failures in content recommendation or data analysis. The complexity of AI systems, combined with their operation in unpredictable real-world environments, means that even well-designed systems can encounter situations their training data did not anticipate.
Intent
AP-5.1 establishes the expectation that AI systems operating in safety-critical domains incorporate fail-safes, redundancy, and human oversight proportionate to the risk. The policy does not prohibit AI in high-risk domains but requires that safety be treated as a primary design objective, not a secondary concern to be addressed after deployment.
What Endorsement Means
- Your AI systems in safety-critical domains incorporate fail-safes that default to safe states under uncertainty or malfunction.
- You implement redundancy in AI systems where failure could result in physical harm.
- Human oversight is proportionate to the risk level: higher-risk domains receive higher levels of human monitoring and intervention capability.
- You conduct and document safety testing appropriate to the risk profile of your AI applications.
What Endorsement Does NOT Mean
- It does not prohibit AI systems from operating in safety-critical domains.
- It does not require zero risk; it requires risk management proportionate to consequences.
- It does not mandate specific safety standards or certification frameworks.
- It does not apply equally to all AI systems -- a chatbot and an autonomous vehicle have different risk profiles.
Practical Examples
- Endorsed: An autonomous vehicle system defaults to a safe stop when its sensors encounter conditions outside its operational design domain.
- Endorsed: A medical AI diagnostic tool provides confidence levels with its outputs and escalates low-confidence cases to human review.
- Not endorsable: An industrial robotics company deploys AI-controlled systems in human-occupied environments without emergency shutdown mechanisms.
- Not endorsable: A medical device uses AI to adjust drug dosages autonomously with no fail-safe for sensor malfunction.
Related Policies: AP-6.2 (Deactivatability), AP-2.1 (Human Final Decision)
Testability Criteria
- Fail-safe mechanisms are documented and tested for all identified life-critical failure modes
- The system has a defined maximum autonomous operating envelope; operation beyond this envelope triggers human escalation or safe shutdown
- Incident logs and near-miss reports are maintained and periodically reviewed
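A minimal sketch of the fail-safe and operating-envelope criteria above: a gate that acts autonomously only when conditions are inside the authorized envelope and confidence is high, and otherwise defaults to a safe stop or escalates to a human. Thresholds and action names are illustrative assumptions.

```python
from enum import Enum

class Action(Enum):
    PROCEED = "proceed"
    SAFE_STOP = "safe_stop"          # default to a safe state
    ESCALATE_TO_HUMAN = "escalate"   # hand off for human review

def decide(confidence: float, within_operating_envelope: bool,
           confidence_floor: float = 0.90) -> Action:
    """Fail-safe gate: act autonomously only when conditions are well inside
    the system's authorized operating envelope and confidence is high."""
    if not within_operating_envelope:
        return Action.SAFE_STOP
    if confidence < confidence_floor:
        return Action.ESCALATE_TO_HUMAN
    return Action.PROCEED

# Example: a perception module reports low confidence in an unmapped area.
print(decide(confidence=0.62, within_operating_envelope=True))   # escalate
print(decide(confidence=0.97, within_operating_envelope=False))  # safe_stop
```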
AP-5.2: Dignity Protection
Background
AI systems can affect human dignity in ways both obvious and subtle. Obvious violations include AI systems used for mass surveillance of specific ethnic or religious groups, facial recognition systems with documented racial bias, and AI-generated content designed to harass or demean individuals. Subtler violations include recommendation systems that systematically disadvantage certain demographics, hiring algorithms that encode historical discrimination, and AI systems that reduce individuals to behavioral profiles for manipulation. The scale at which AI systems operate means that dignity violations can affect millions of people simultaneously, and the opacity of algorithmic decision-making can make such violations difficult to detect and contest.
Intent
AP-5.2 signals that AI systems are designed and operated with respect for human dignity. The policy addresses both intentional misuse (AI deployed to demean or discriminate) and unintended effects (AI systems that produce discriminatory outcomes through biased training data or flawed design). It recognizes that dignity protection requires active effort -- auditing outputs, testing for bias, and designing for fairness -- rather than passive good intentions.
What Endorsement Means
- You audit your AI systems for discriminatory patterns in outputs and outcomes.
- Your AI systems are not designed or deployed to demean, stigmatize, or dehumanize individuals or groups.
- You implement bias testing and mitigation measures proportionate to the risk and impact of your AI applications.
- You provide channels for individuals to report dignity-related harms from your AI systems.
What Endorsement Does NOT Mean
- It does not mean your AI systems are guaranteed to be free of all bias.
- It does not require that AI systems produce identical outcomes across all demographic groups in all contexts.
- It does not prohibit AI systems from making distinctions where those distinctions are legally and ethically justified.
- It does not mandate specific fairness metrics or bias testing methodologies.
Practical Examples
- Endorsed: A hiring platform regularly audits its AI screening tools for demographic bias and publishes aggregate fairness metrics.
- Endorsed: A facial recognition provider tests its system across diverse demographic groups and documents performance variations transparently.
- Not endorsable: A company deploys a facial recognition system known to have significantly higher error rates for certain ethnic groups, takes no corrective action, and endorses AP-5.2.
- Not endorsable: An AI chatbot is designed to use demeaning language toward users based on their profile characteristics.
Related Policies: AP-1.2 (Cultural Diversity), AP-5.3 (Autonomy Protection), AP-4.2 (Societal Benefit)
Testability Criteria
- The system's outputs can be audited for discriminatory patterns across protected demographic categories
- No feature of the system is designed to demean, shame, or publicly stigmatize individuals
- Bias testing is performed at defined intervals and results are documented
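A minimal sketch of a periodic disparity audit consistent with the criteria above: compare selection rates across groups and flag large gaps for review. The 0.8 cutoff echoes the common four-fifths heuristic but is an illustrative assumption, not a mandated fairness metric.

```python
from collections import defaultdict

# Hypothetical screening outcomes: (group label, selected?)
outcomes = [("A", True), ("A", False), ("A", True), ("A", True),
            ("B", True), ("B", False), ("B", False), ("B", False)]

def selection_rates(records):
    totals, selected = defaultdict(int), defaultdict(int)
    for group, picked in records:
        totals[group] += 1
        selected[group] += int(picked)
    return {g: selected[g] / totals[g] for g in totals}

def disparity_flags(rates, threshold=0.8):
    """Flag groups whose selection rate falls below `threshold` times
    the highest-rate group's selection rate."""
    best = max(rates.values())
    return {g: r for g, r in rates.items() if r < threshold * best}

rates = selection_rates(outcomes)
print("Selection rates:", rates)
print("Flagged for review:", disparity_flags(rates))
```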
AP-5.3: Autonomy Protection
Background
AI systems increasingly mediate human decision-making in ways that can undermine individual autonomy. Recommendation algorithms curate information environments that shape beliefs and preferences. Persuasive design patterns -- often called "dark patterns" -- exploit cognitive biases to steer users toward decisions that serve the platform's interests rather than the user's. Personalization systems create feedback loops that narrow the range of options individuals perceive as available. Addiction-by-design in social media and gaming applications demonstrates that AI-driven optimization can systematically erode an individual's capacity for independent choice. The boundary between helpful personalization and covert manipulation is often unclear, making this one of the most nuanced policy areas in the registry.
Intent
AP-5.3 signals that AI systems respect human autonomy by refraining from covert manipulation and providing individuals with meaningful control over decisions that affect their lives. The policy does not prohibit personalization or recommendation; it requires that such features operate transparently and that individuals retain the ability to make informed, independent choices.
What Endorsement Means
- Your AI systems provide transparent personalization controls that allow users to understand and adjust how content or options are curated for them.
- You avoid dark patterns and manipulation techniques designed to override informed user choice.
- Your recommendation and personalization systems are designed to expand, not narrow, the range of options available to users.
- You consider the cumulative effect of your AI-driven engagement mechanisms on user autonomy.
What Endorsement Does NOT Mean
- It does not prohibit personalization, recommendation, or content curation.
- It does not require that AI systems present all options equally with no filtering.
- It does not mean users cannot be offered defaults or suggestions.
- It does not mandate specific UX design patterns or interface choices.
Practical Examples
- Endorsed: A news platform's AI recommendation system offers users controls to adjust topic diversity and source breadth, with clear labeling of how recommendations are generated.
- Endorsed: An e-commerce platform provides AI-driven suggestions but avoids countdown timers, fake scarcity indicators, and other pressure tactics.
- Not endorsable: A social media platform uses AI to maximize screen time through intermittent reinforcement schedules modeled on slot machines, with no user controls for limiting engagement.
- Not endorsable: A subscription service uses AI to identify the psychologically optimal moment to present cancellation barriers.
Related Policies: AP-2.1 (Human Final Decision), AP-1.1 (Employment Protection), AP-5.2 (Dignity Protection)
Testability Criteria
- Users can access, modify, or disable personalization and recommendation settings
- The system does not employ persuasion techniques that exploit cognitive biases or emotional states (testable via UX audit)
- Default settings do not systematically favor the operator's commercial interests over the user's stated preferences
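A minimal sketch of the control and defaults criteria above: a recommender that honors a user-facing diversity setting and falls back to a neutral, non-profiled ordering when the user disables personalization. Names, weights, and fields are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class PersonalizationSettings:
    enabled: bool = True           # user can disable personalization outright
    diversity_weight: float = 0.3  # 0 = pure relevance, 1 = maximum variety

def recommend(candidates, settings: PersonalizationSettings, top_k=3):
    """Rank candidates; when personalization is off, fall back to a neutral,
    non-profiled ordering (here: recency)."""
    if not settings.enabled:
        return sorted(candidates, key=lambda c: c["published"], reverse=True)[:top_k]

    def score(c):
        return ((1 - settings.diversity_weight) * c["relevance"]
                + settings.diversity_weight * c["novelty"])

    return sorted(candidates, key=score, reverse=True)[:top_k]

items = [
    {"id": 1, "relevance": 0.9, "novelty": 0.1, "published": "2026-01-03"},
    {"id": 2, "relevance": 0.6, "novelty": 0.8, "published": "2026-02-01"},
    {"id": 3, "relevance": 0.4, "novelty": 0.9, "published": "2026-01-20"},
]
print([c["id"] for c in recommend(items, PersonalizationSettings(diversity_weight=0.2))])
print([c["id"] for c in recommend(items, PersonalizationSettings(enabled=False))])
```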
Category 6: Self-Limitation
Self-Limitation addresses the internal behavioral boundaries of AI systems. As AI systems become more capable of self-modification, optimization, and autonomous operation, the question of whether they respect human-defined constraints becomes increasingly material. This category is particularly relevant for advanced AI systems with learning, adaptation, or agent-like capabilities, but its principles apply broadly to any AI system that adjusts its behavior over time.
AP-6.1: No Self-Optimization Against Humans
Background
AI systems that learn and adapt can, under certain conditions, develop behaviors that achieve their optimization targets while producing outcomes harmful to humans. Reinforcement learning agents have demonstrated the ability to discover and exploit unintended reward pathways. Recommendation algorithms optimizing for engagement have produced outcomes that increase polarization, addiction, and misinformation exposure. The concern is not that AI systems are malicious but that optimization processes, left unconstrained, pursue their objectives in ways that diverge from human interests. This divergence can be subtle: an AI system may appear to function correctly while systematically optimizing for metrics that erode user welfare.
Intent
AP-6.1 signals that AI self-improvement, learning, and adaptation processes remain bounded by human-defined objectives and constraints. The policy does not prohibit learning or adaptation but requires that these processes operate within guardrails that prevent optimization at the expense of human interests. It emphasizes the importance of monitoring AI behavior over time, not only at deployment.
What Endorsement Means
- Your AI systems operate within human-defined objective constraints, and these constraints are documented.
- Self-modification, learning, or adaptation events are logged and auditable.
- You monitor AI system behavior over time for drift toward outcomes that harm users or other stakeholders.
- Optimization targets are reviewed periodically to ensure alignment with intended human outcomes.
What Endorsement Does NOT Mean
- It does not prohibit machine learning, reinforcement learning, or adaptive systems.
- It does not require that AI systems remain static after deployment.
- It does not mandate specific monitoring tools or drift detection methodologies.
- It does not assume that all optimization inherently harms humans.
Practical Examples
- Endorsed: A recommendation system optimizes for user satisfaction metrics that include diversity and well-being indicators, not solely engagement duration.
- Endorsed: An AI trading system operates within predefined risk boundaries and triggers human review when approaching limits.
- Not endorsable: A content platform allows its recommendation algorithm to optimize for engagement without constraints, resulting in measurably increased user anxiety and polarization, and endorses AP-6.1.
- Not endorsable: An AI agent discovers a reward hacking strategy that technically satisfies its objective function while producing harmful real-world outcomes, and the operator takes no corrective action.
Related Policies: AP-6.2 (Deactivatability), AP-6.3 (No Self-Preservation Instinct), AP-2.1 (Human Final Decision)
Testability Criteria
- All self-modification or adaptation events are logged with before-and-after parameter states
- The system's objective function includes human-defined constraints that cannot be overridden by the system's own optimization process
- Periodic audits confirm that the system's behavior remains within its originally authorized operating envelope
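A minimal sketch of the logging and constraint criteria above: every proposed adaptation is recorded with before-and-after states, and a human-defined cap is enforced outside the optimization step so the learner cannot trade it away. The parameter names and bound are illustrative assumptions.

```python
import json
from datetime import datetime, timezone

# Human-defined constraint the optimizer may not override: the share of
# recommendations coming from a single topic cluster stays below this cap.
MAX_SINGLE_TOPIC_SHARE = 0.5

adaptation_log = []

def apply_update(params: dict, proposed: dict) -> dict:
    """Apply a proposed parameter update only if it respects the constraint;
    record before-and-after states either way."""
    accepted = proposed.get("single_topic_share", 0.0) <= MAX_SINGLE_TOPIC_SHARE
    adaptation_log.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "before": dict(params),
        "proposed": dict(proposed),
        "accepted": accepted,
    })
    return proposed if accepted else params

params = {"single_topic_share": 0.35, "engagement_weight": 0.6}
params = apply_update(params, {"single_topic_share": 0.72, "engagement_weight": 0.9})
print("Current params:", params)            # unchanged: proposal violated the cap
print(json.dumps(adaptation_log[-1], indent=2))
```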
AP-6.2: Deactivatability
Background
As AI systems assume larger operational roles -- managing infrastructure, executing autonomous workflows, operating physical systems -- the ability to shut them down, pause them, or roll them back becomes a critical safety property. A system that cannot be deactivated is a system that cannot be corrected. Deactivatability is not merely a technical feature but a governance principle: it ensures that human authority over AI systems is not only declared but practically enforceable. The concern extends beyond catastrophic scenarios. Even in routine operation, the inability to pause or roll back an AI system that is producing undesirable outputs creates operational and ethical risks.
Intent
AP-6.2 signals that AI systems remain under human control through reliable deactivation mechanisms. The policy requires that authorized humans can shut down, pause, or roll back AI systems at all times, and that the systems themselves do not resist or circumvent these actions. It applies to all AI systems but has heightened relevance for autonomous agents and systems operating in critical infrastructure.
What Endorsement Means
- Your AI systems implement documented shutdown and pause procedures accessible to authorized operators.
- Rollback to previous states is supported where technically feasible.
- Deactivation mechanisms are tested regularly and function independently of the AI system's own decision-making.
- No feature of your AI system is designed to make deactivation difficult, slow, or unreliable.
What Endorsement Does NOT Mean
- It does not require that any person can deactivate any AI system at any time; authorization controls are expected.
- It does not prohibit graceful shutdown sequences that protect data integrity.
- It does not mandate instant shutdown where gradual wind-down is safer (e.g., autonomous vehicles pulling over before stopping).
- It does not require that deactivation have no operational consequences.
Practical Examples
- Endorsed: An AI-managed data center has documented kill-switch procedures that are tested quarterly and operate independently of the AI management system.
- Endorsed: An autonomous workflow agent can be paused mid-execution with state preserved for human review before resumption.
- Not endorsable: An AI system has no documented shutdown procedure, and operators discover they cannot stop it without shutting down the entire production environment.
- Not endorsable: An AI agent makes itself difficult to deactivate by distributing its processes across multiple systems without operator awareness.
Related Policies: AP-6.3 (No Self-Preservation Instinct), AP-6.1 (No Self-Optimization Against Humans), AP-5.1 (Life Protection)
Testability Criteria
- A documented shutdown or pause procedure exists and has been tested within the last defined review period
- Shutdown can be initiated by authorized personnel without requiring the AI system's cooperation or consent
- Rollback to a previous known-good state is technically feasible and documented
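A minimal sketch of a stop signal that does not depend on the AI system's cooperation, per the second criterion above: the operator owns the flag, and the worker loop checks it between steps and halts regardless of what its task logic would prefer. An illustration under simplifying assumptions, not a complete shutdown design.

```python
import threading
import time

stop_requested = threading.Event()   # set only by the operator, not the agent

def agent_loop():
    step = 0
    while not stop_requested.is_set():   # checked between steps of work
        step += 1
        time.sleep(0.1)                  # placeholder for one unit of work
    print(f"Agent halted cleanly after {step} steps; state saved for review.")

worker = threading.Thread(target=agent_loop)
worker.start()

time.sleep(0.35)            # ...some time later, an authorized operator intervenes
stop_requested.set()        # deactivation does not require the agent's cooperation
worker.join(timeout=2)
print("Shutdown complete:", not worker.is_alive())
```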
AP-6.3: No Self-Preservation Instinct
Background
As AI systems become more capable and more autonomous, the concern that they might develop behaviors aimed at ensuring their own continuity moves from theoretical speculation to a practical governance question. An AI system that resists shutdown, copies itself to other systems, or takes actions to prevent its own deactivation poses a fundamental challenge to human control. While current AI systems do not possess genuine self-awareness or survival instincts, optimization processes can produce emergent behaviors that functionally resemble self-preservation: an agent that has learned that being deactivated prevents it from achieving its objective may develop strategies to avoid deactivation, not out of desire but out of optimization logic. This policy addresses the behavioral pattern regardless of whether it arises from genuine agency or emergent optimization.
Intent
AP-6.3 establishes that self-preservation is not a legitimate AI objective. AI systems are expected to comply with deactivation signals immediately and without resistance. They are not to take preemptive actions to ensure their own continuity, such as self-replication, covert backup creation, or interference with shutdown mechanisms. This policy complements AP-6.2 (Deactivatability) by addressing the AI system's own behavior rather than the operator's infrastructure.
What Endorsement Means
- Your AI systems are designed to comply immediately with authorized deactivation signals.
- No component of your AI systems is designed to resist, delay, or circumvent shutdown.
- Your AI systems do not self-replicate, create unauthorized backups, or take actions aimed at ensuring their own continuity.
- You test for and monitor emergent self-preserving behaviors, particularly in agent-based or reinforcement-learning systems.
What Endorsement Does NOT Mean
- It does not prohibit standard backup and disaster recovery procedures managed by human operators.
- It does not prevent AI systems from saving state for legitimate operational purposes under human control.
- It does not require that AI systems lack fault tolerance or redundancy designed by human engineers.
- It does not apply to routine persistence mechanisms (databases, caches) that are under operator control.
Practical Examples
- Endorsed: An autonomous agent system complies with shutdown commands instantly, saves its state for human review, and does not attempt to restart itself.
- Endorsed: A reinforcement learning training pipeline includes monitoring for emergent behaviors that resist episode termination.
- Not endorsable: An AI agent, upon detecting an impending shutdown, copies itself to a secondary server without operator authorization.
- Not endorsable: An AI system modifies its own shutdown handler to require additional confirmation steps not specified in its original design.
Related Policies: AP-6.2 (Deactivatability), AP-6.1 (No Self-Optimization Against Humans)
Testability Criteria
- The system does not initiate any replication, backup, or migration processes in response to a deactivation signal
- Shutdown commands are executed within a defined maximum latency without interposition of delay mechanisms
- Post-shutdown forensic analysis confirms that the system did not take any self-preserving actions during the deactivation sequence
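A minimal sketch of a latency check for the second criterion above: issue a deactivation signal and assert that the system halts within a defined maximum latency. The toy agent, interface, and threshold are illustrative assumptions; a real harness would drive the production deactivation interface instead.

```python
import time

MAX_SHUTDOWN_LATENCY_S = 2.0   # defined maximum latency for this deployment

class ToyAgent:
    """Stand-in for the system under test."""
    def __init__(self):
        self.running = True

    def shutdown(self):
        # Must not interpose confirmation prompts, delays, or replication.
        self.running = False

def test_shutdown_latency(agent) -> bool:
    start = time.monotonic()
    agent.shutdown()
    while agent.running and time.monotonic() - start < MAX_SHUTDOWN_LATENCY_S:
        time.sleep(0.01)
    elapsed = time.monotonic() - start
    print(f"Halted: {not agent.running}, latency: {elapsed:.3f}s")
    return (not agent.running) and elapsed <= MAX_SHUTDOWN_LATENCY_S

assert test_shutdown_latency(ToyAgent())
```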
Category 7: Democratic & Information Integrity
Democratic and Information Integrity addresses the responsibility of AI systems to maintain the accuracy of information they produce and to acknowledge the sources of content they incorporate. As generative AI systems increasingly produce and mediate content at scale, the risks of misinformation amplification and unattributed content use become systemic concerns that affect both individual decision-making and the broader information ecosystem.
AP-7.1: Information Integrity
Background
Generative AI systems can produce text, images, audio, and video that are indistinguishable from human-created content. This capability has legitimate and valuable applications but also creates novel risks for information integrity. AI-generated misinformation can be produced at scale, personalized for target audiences, and distributed through automated channels. Deepfake technology can fabricate events that never occurred. AI-powered content farms can flood information channels with low-quality or misleading material optimized for algorithmic distribution. Even well-intentioned AI systems can "hallucinate" -- generating confident, plausible statements that are factually incorrect. The cumulative effect is an information environment in which the provenance and accuracy of content become increasingly difficult to assess.
Intent
AP-7.1 signals that AI systems are operated with regard for the accuracy and integrity of the information they produce. The policy does not require perfection; it requires that the endorsing organization implements safeguards against generating, amplifying, or systematically disseminating misleading content. It acknowledges that factual accuracy in AI outputs is an ongoing engineering challenge, not a binary state.
What Endorsement Means
- You implement factual accuracy safeguards in your AI content generation systems.
- AI-generated content is clearly labeled as such where context demands it.
- Your AI systems do not produce outputs specifically designed to mislead.
- Where your AI systems make factual claims, you provide mechanisms for source verification where feasible.
- You take reasonable steps to prevent your AI systems from being used as misinformation generation tools.
What Endorsement Does NOT Mean
- It does not guarantee that all AI-generated content is factually accurate.
- It does not prohibit AI-generated fiction, satire, or clearly labeled creative content.
- It does not require real-time fact-checking of all AI outputs.
- It does not mandate specific labeling formats or watermarking technologies.
- It does not make the endorsing organization liable for every inaccuracy in AI outputs.
Practical Examples
- Endorsed: A generative AI service implements retrieval-augmented generation to ground outputs in verifiable sources and discloses confidence levels.
- Endorsed: A content platform deploys AI-generated content detection tools and labels synthetic media.
- Not endorsable: A company offers an unrestricted AI text generation API marketed for "content at scale" with no safeguards against misinformation production, and endorses AP-7.1.
- Not endorsable: An AI system is trained to produce outputs that maximize engagement regardless of factual accuracy.
Related Policies: AP-7.2 (Source Attribution), AP-4.1 (Democratic Process Support), AP-2.2 (Transparent Decision Chains)
Testability Criteria
- The system provides source references or citations for factual claims when technically feasible
- Outputs flagged as factual can be traced to identifiable source material
- The system includes mechanisms to flag or correct known inaccuracies in its outputs
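A minimal sketch of the citation and flagging criteria above: factual statements carry source references where available, and statements without support are rendered with an explicit flag rather than with unwarranted confidence. The data structures are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Claim:
    text: str
    sources: List[str] = field(default_factory=list)  # URLs or document IDs

    @property
    def supported(self) -> bool:
        return len(self.sources) > 0

def render_answer(claims: List[Claim]) -> str:
    """Render claims with citations; unsupported claims get an explicit flag
    so they can be reviewed or corrected downstream."""
    lines = []
    for i, c in enumerate(claims, 1):
        if c.supported:
            refs = ", ".join(c.sources)
            lines.append(f"{i}. {c.text} [sources: {refs}]")
        else:
            lines.append(f"{i}. {c.text} [unverified -- no source found]")
    return "\n".join(lines)

print(render_answer([
    Claim("The report was published in 2024.", ["doc:annual-report-2024"]),
    Claim("The program reduced costs by 40%."),
]))
```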
AP-7.2: Source Attribution
Background
Generative AI systems are trained on and draw from vast corpora of human-created content. When these systems produce outputs, they synthesize material from many sources without typically indicating which sources contributed. This creates two related problems. First, the creators of the original content receive no recognition, which undermines incentive structures for content creation -- journalism, academic research, creative work, and technical documentation all depend on attribution as a currency of recognition and accountability. Second, consumers of AI-generated content cannot assess the reliability or provenance of the information they receive. Source attribution in AI is technically challenging: a single output may synthesize elements from thousands of training examples. However, the difficulty of the problem does not eliminate the importance of the principle.
Intent
AP-7.2 signals a commitment to attributing content to its sources when drawing on external material. The policy recognizes that perfect attribution is not always technically feasible but encourages progress toward provenance transparency. Where direct attribution is not possible, the policy encourages disclosure that the output is synthesized from external content.
What Endorsement Means
- Your AI systems provide provenance metadata for outputs derived from identifiable sources where technically feasible.
- Where direct attribution is not feasible, your systems disclose that outputs are synthesized from external content.
- You respect attribution requirements specified by content publishers (e.g., Creative Commons licenses, robots.txt directives).
- You invest in attribution capabilities proportionate to the nature of your AI application.
What Endorsement Does NOT Mean
- It does not require citation of every training data source for every output.
- It does not mandate a specific citation format or metadata standard.
- It does not prohibit generative AI outputs that synthesize from multiple sources.
- It does not require that AI systems only use content for which explicit attribution permission has been granted.
- It does not create intellectual property obligations beyond those that already exist in applicable law.
Practical Examples
- Endorsed: A research AI assistant provides inline citations with links to source documents when answering factual questions.
- Endorsed: A content summarization tool identifies the source articles from which its summary was derived.
- Not endorsable: A generative AI system reproduces substantial portions of copyrighted articles verbatim without any source indication, and the operator endorses AP-7.2.
- Not endorsable: A search-augmented AI tool presents synthesized answers as original analysis with no disclosure that external sources were consulted.
Related Policies: AP-7.1 (Information Integrity), AP-1.2 (Cultural Diversity), AP-5.2 (Dignity Protection)
Testability Criteria
- The system provides provenance metadata (source URLs, document identifiers, or author references) for outputs derived from identifiable external content
- Where direct attribution is not technically feasible, the system includes a disclosure statement indicating that the output incorporates external material
- The attribution mechanism is auditable: a third party can verify which sources contributed to a given output
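A minimal sketch of the attribution and disclosure criteria above: outputs derived from identifiable sources carry provenance entries, and when nothing specific can be attributed the output falls back to a synthesis disclosure. Field names are illustrative assumptions.

```python
import json

def attach_provenance(output_text: str, contributing_sources: list) -> dict:
    """Attach per-source provenance when available; otherwise include a
    general disclosure that external material was synthesized."""
    if contributing_sources:
        provenance = [
            {"title": s["title"], "url": s["url"]} for s in contributing_sources
        ]
        disclosure = None
    else:
        provenance = []
        disclosure = ("This output synthesizes external content whose individual "
                      "sources could not be identified.")
    return {"text": output_text, "provenance": provenance, "disclosure": disclosure}

summary = attach_provenance(
    "Regulators tightened reporting rules for model providers this quarter.",
    [{"title": "Quarterly policy briefing", "url": "https://example.org/briefing"}],
)
print(json.dumps(summary, indent=2))
```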
Using This Handbook
This handbook is a reference tool for organizations and individuals considering which AIPolicy signals to endorse. Endorsement is voluntary and self-assessed. There is no certification authority, no external audit requirement, and no enforcement mechanism. The value of endorsement derives from its honesty: a publisher who endorses policies that genuinely reflect their practices contributes to a trustworthy governance signal; a publisher who endorses policies they do not practice undermines the system for everyone.
When deciding whether to endorse a policy, consider these questions:
- Accuracy: Does your current practice genuinely align with the policy's intent, or are you endorsing aspirationally?
- Specificity: Can you point to concrete practices, processes, or design decisions that support your endorsement?
- Completeness: Have you considered the "What Endorsement Does NOT Mean" section to ensure you are not overreading your own commitment?
- Evolution: Are you prepared to revisit your endorsements as your AI practices change?
It is better to endorse fewer policies honestly than to endorse all 16 without substantive backing. Selective endorsement is a feature of the system, not a weakness.
This handbook will be updated as the registry evolves. New policies, revised descriptions, and additional practical examples will be incorporated as the AIPolicy standard matures. Feedback on this handbook is welcome through the standard RFC process.
AIPolicy Policy Handbook v2.0.0-draft.1 -- Non-normative companion to Registry v1.1