v2.0 Draft

Open Issues -- AIPolicy Specification v2.0

Document Identifier: AIPOLICY-ISSUES
Status: Working Document
Version: 2.0.0-draft.1
Date: 2026-02-07
Editor: Guido Mitschke
Repository: https://gitlab.com/human-first-ai/hf-ai-web-standard


Overview

This document tracks open technical issues in the AIPolicy specification that require further discussion, research, or community input before resolution. Each issue describes a specific gap or ambiguity in the current specification draft.

Issues are tracked by number and status. Resolved issues are moved to the changelog of the specification version in which they were addressed.

  #   Issue                      Status
  1   Pipeline Weighting         Open
  2   Norm Persistence           Open
  3   Schema.org Extension       Open
  4   Conflict Resolution        Open
  5   Versioning Strategy        Open
  6   Measurement Methodology    Open

ISSUE-1: Pipeline Weighting

Description

When aggregators, researchers, or AI training pipelines collect AIPolicy declarations from multiple domains, no normative guidance exists for how declarations from different sources should be weighted relative to each other. A declaration from a personal blog and a declaration from a government institution use the same format and schema. The specification intentionally does not differentiate between publishers based on authority, audience size, or institutional status.

However, practical aggregation systems will need to make weighting decisions. If a crawler collects declarations from 10,000 domains, how should it aggregate conflicting endorsement statuses for the same policy? Simple majority voting, domain-authority weighting, and equal weighting each produce different aggregate signals.
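
The divergence can be illustrated with a minimal sketch that aggregates the same hypothetical crawl results under the three schemes named above. The domains, authority weights, and the boolean "endorses" flag are illustrative assumptions, not part of the specification.

    # Hypothetical crawl results: (domain, assumed authority weight, endorses AP-2.1?).
    # The authority weights are illustrative; the specification defines no such metric.
    collected = [
        ("personal-blog.example", 1.0, True),
        ("gov-agency.example", 50.0, False),
        ("university.example", 10.0, True),
        ("news-site.example", 5.0, False),
    ]

    def equal_weight(declarations):
        """Every domain counts the same; return the endorsement rate."""
        return sum(1 for _, _, endorses in declarations if endorses) / len(declarations)

    def majority_vote(declarations):
        """Binary outcome: endorsed only if more than half of the domains endorse."""
        return equal_weight(declarations) > 0.5

    def authority_weighted(declarations):
        """Weight each domain by its assumed authority score."""
        total = sum(weight for _, weight, _ in declarations)
        return sum(weight for _, weight, endorses in declarations if endorses) / total

    print("equal weighting:     ", equal_weight(collected))       # 0.5
    print("majority vote:       ", majority_vote(collected))      # False
    print("authority weighting: ", authority_weighted(collected)) # ~0.17

The same four declarations yield three different aggregate signals, which is precisely the divergence this issue asks the specification to address or explicitly leave open.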

This issue is distinct from the question of how AI training pipelines weight data internally (which is outside the specification's scope). It concerns the normative guidance the specification should or should not provide to aggregation tools built on top of the standard.

Current Status

Open. The specification currently provides no guidance on aggregation. The publisher object includes name and url but no authority metadata.

Proposed Approaches

  1. No normative guidance. The specification defines the publication format only. Aggregation is out of scope. This is the current implicit position.
  2. Non-normative aggregation annex. A supplementary document could describe common aggregation approaches without recommending any specific method.
  3. Publisher metadata extension. Optional fields (e.g., organizationType, jurisdiction) could provide aggregation-relevant metadata without prescribing how to use it.
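
As an illustration of approach 3, a publisher object extended with optional, aggregation-relevant fields might look like the sketch below. Only name and url exist in the current specification; organizationType and jurisdiction are candidate field names from this issue, and the values shown are assumptions.

    # Hypothetical extended publisher object (approach 3). Only "name" and "url"
    # are defined today; the remaining fields and their value sets are illustrative.
    publisher = {
        "name": "Example Research Institute",
        "url": "https://example.org",
        "organizationType": "academic",   # candidate field; allowed values undefined
        "jurisdiction": "DE",             # candidate field; e.g. ISO 3166-1 alpha-2
    }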

Related Sections

  • Section 5 (Publisher Object)
  • Section 7 (Conformance Levels)
  • Mechanism Analysis (non-normative)

ISSUE-2: Norm Persistence

Description

The specification defines an expires field that indicates when a declaration should be considered stale. However, the semantics of expiration are limited to the declaration itself. Once a declaration is removed from a website or its expires date passes, the specification does not address how long the previously published signal remains relevant.

For inference-time retrieval systems, expiration is straightforward: a retrieval system can check the expires field and discard stale declarations. For training-time inclusion, the question is more complex. If a declaration was included in a training corpus at time T, and the declaration is removed from the web at time T+1, the training data still contains the declaration. Subsequent training runs may or may not include updated crawl data.
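
The inference-time check described above can be sketched in a few lines, assuming expires is an ISO 8601 timestamp; the helper below is illustrative, not normative.

    from datetime import datetime, timezone

    def is_stale(declaration: dict, now: datetime | None = None) -> bool:
        """Return True if the declaration's expires timestamp has passed.

        A declaration without an expires field is treated as not stale here;
        whether that is the right default is part of this open issue.
        """
        expires = declaration.get("expires")
        if expires is None:
            return False
        now = now or datetime.now(timezone.utc)
        return datetime.fromisoformat(expires) < now

    declaration = {"published": "2026-01-01T00:00:00+00:00",
                   "expires": "2026-07-01T00:00:00+00:00"}
    print(is_stale(declaration))  # False until 2026-07-01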

This raises a broader question about the temporal semantics of governance signals. Does a published declaration represent a permanent statement, a time-bounded assertion, or a continuously maintained signal that requires periodic renewal?

Current Status

Open. The expires field provides a mechanism for time-bounding declarations, but no guidance exists on how expired or removed declarations should be treated by systems that have already ingested them.

Proposed Approaches

  1. Expiration is advisory. The specification states that expires is a recommendation to consumers, not a guarantee. Consumers SHOULD discard expired declarations but are not required to.
  2. Renewal semantics. Introduce an optional renewalInterval field that signals how often a consumer should re-fetch the declaration (see the sketch after this list).
  3. No additional guidance. Accept that training-time persistence is outside the specification's control and document this as a known limitation.
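
A declaration's metadata under approach 2 might look like the following sketch; the renewalInterval field name comes from the approach above, while the ISO 8601 duration encoding ("P90D", 90 days) is an assumption.

    # Hypothetical declaration metadata under approach 2. "renewalInterval" is a
    # proposed field, not part of the current specification.
    declaration_meta = {
        "published": "2026-02-01T00:00:00+00:00",
        "expires": "2027-02-01T00:00:00+00:00",
        "renewalInterval": "P90D",  # consumers SHOULD re-fetch roughly every 90 days
    }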

Related Sections

  • Section 4.2 (Declaration Metadata: published, expires)
  • Mechanism Analysis, Section 1.1 (Training-Time Inclusion)

ISSUE-3: Schema.org Extension

Description

The current specification uses standard JSON at a well-known URI. For broader interoperability with the semantic web ecosystem, a formal Schema.org extension could define types such as AIGovernanceDeclaration, AIPolicy, or PolicyEndorsement. This would allow declarations to be embedded in HTML pages as JSON-LD <script> blocks, discovered by search engine structured data parsers, and linked to existing Schema.org vocabularies.

The Schema.org community group has an established process for proposing extensions. Submitting a formal proposal requires a stable vocabulary, use cases, and evidence of adoption. Given the specification's current draft status, a Schema.org submission may be premature.

An intermediate approach uses Schema.org's additionalProperty mechanism to attach policy metadata to existing types (e.g., WebSite, Organization) without requiring a formal extension. This approach is already used in the specification's examples but limits discoverability and semantic precision.
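
The interim additionalProperty approach can be sketched as follows, written here as a Python dict mirroring the JSON-LD structure; the propertyID and value shown are illustrative and may differ from the specification's own examples.

    # JSON-LD structured data for a WebSite, carrying policy metadata through
    # Schema.org's additionalProperty. The propertyID/value pair is illustrative.
    website_jsonld = {
        "@context": "https://schema.org",
        "@type": "WebSite",
        "url": "https://example.org",
        "additionalProperty": [
            {
                "@type": "PropertyValue",
                "propertyID": "aipolicy:declaration",  # assumed identifier
                "value": "https://example.org/.well-known/aipolicy.json",
            }
        ],
    }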

Current Status

Open. The specification uses additionalProperty as an interim approach. No formal Schema.org proposal has been drafted.

Proposed Approaches

  1. Defer until Candidate Standard. Wait until the specification reaches a more stable status before proposing a Schema.org extension.
  2. Community Group proposal. Draft a Schema.org extension proposal for community review in parallel with the specification's development.
  3. Standalone vocabulary. Define a standalone JSON-LD vocabulary at https://aipolicy.org/vocab/ without depending on Schema.org acceptance.

Related Sections

  • Section 3 (Declaration Format)
  • Section 8 (Interoperability)

ISSUE-4: Conflict Resolution

Description

When multiple declarations exist for overlapping scopes, the specification needs clear precedence rules. Consider the following scenarios:

  • A site-wide declaration at /.well-known/aipolicy.json endorses policy AP-2.1. A page-level declaration embedded in a specific page's metadata does not endorse AP-2.1. Which takes precedence for that page?
  • An organization publishes a declaration on its corporate domain and a different declaration on a product subdomain. If the scopes overlap (e.g., site vs. section), how are conflicts resolved?
  • A CDN or hosting platform injects a default declaration that the site operator has not explicitly approved. Is the well-known URI authoritative even if the site operator did not place it there?

The current specification states that the well-known URI (/.well-known/aipolicy.json) is the authoritative source. However, this does not fully address page-level overrides or multi-domain scenarios.
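
For reference, a minimal discovery sketch under the current rule treats the well-known URI as the single authoritative source; error handling and caching are omitted, and nothing here resolves the page-level or multi-domain conflicts described above.

    import json
    from urllib.request import urlopen

    def fetch_site_declaration(origin: str) -> dict | None:
        """Fetch the site-level declaration from the well-known URI.

        Returns the parsed declaration, or None if the document is missing
        or is not valid JSON.
        """
        url = origin.rstrip("/") + "/.well-known/aipolicy.json"
        try:
            with urlopen(url, timeout=10) as response:
                return json.load(response)
        except Exception:
            return None

    declaration = fetch_site_declaration("https://example.org")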

Current Status

Open. The specification establishes well-known URI authority but does not define a complete precedence model for overlapping scopes.

Proposed Approaches

  1. Strict well-known authority. The well-known URI is always authoritative. Page-level metadata is informational only and cannot override the site-level declaration.
  2. Cascading precedence. More specific scopes override less specific scopes (page > section > site), analogous to CSS specificity (see the sketch after this list).
  3. Explicit override mechanism. Introduce an overrides field that allows page-level declarations to explicitly reference and override specific policies from the site-level declaration.
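
A sketch of how approach 2's cascading precedence might be evaluated, assuming each declaration carries a scope of "site", "section", or "page" and a mapping of policy IDs to endorsement values; the field names are illustrative.

    # Most specific scope wins, analogous to CSS specificity (approach 2).
    SPECIFICITY = {"site": 0, "section": 1, "page": 2}

    def endorsement_for(policy_id: str, declarations: list[dict]) -> bool | None:
        """Resolve one policy's endorsement for a page across overlapping scopes.

        declarations: dicts with illustrative fields "scope" and "endorsedPolicies".
        Returns the value from the most specific declaration that mentions the
        policy, or None if no applicable declaration mentions it.
        """
        for declaration in sorted(declarations,
                                  key=lambda d: SPECIFICITY[d["scope"]],
                                  reverse=True):
            if policy_id in declaration.get("endorsedPolicies", {}):
                return declaration["endorsedPolicies"][policy_id]
        return None

    site = {"scope": "site", "endorsedPolicies": {"AP-2.1": True}}
    page = {"scope": "page", "endorsedPolicies": {"AP-2.1": False}}
    print(endorsement_for("AP-2.1", [site, page]))  # False: page overrides site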

Related Sections

  • Section 4.1 (Scope)
  • Section 6 (Discovery)

ISSUE-5: Versioning Strategy

Description

The policy registry is versioned independently from the specification. When a new registry version deprecates or renumbers policy IDs, existing declarations that reference the old IDs become partially invalid. The specification needs a strategy for handling this transition.

Consider: a declaration published in 2026 references policy AP-3.1 as defined in registry version 1.0. In 2027, registry version 2.0 renumbers this policy to AP-3.1.1 and changes its statement text. The declaration's policy reference is now ambiguous -- does it refer to the original or the updated policy?

This issue also affects validators. Should a validator accept declarations referencing deprecated policy IDs? Should it warn, reject, or silently accept them?

Current Status

Open. The current specification includes a registryVersion field in the declaration format. Validators currently accept deprecated policy IDs but emit warnings. No formal deprecation lifecycle is defined.

Proposed Approaches

  1. Accept with warning. Validators accept deprecated IDs and emit deprecation warnings. This is the current behavior but lacks formal specification.
  2. Grace period. Define a deprecation grace period (e.g., 12 months) during which old IDs remain valid. After the grace period, validators reject them.
  3. Aliasing. The registry maintains an alias table mapping old IDs to new IDs. Validators resolve aliases transparently (see the sketch after this list).
  4. Immutable IDs. Policy IDs, once assigned, are never reused or renumbered. Deprecated policies are marked as deprecated but retain their original IDs.
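
Approaches 1 and 3 can be combined in a short validator sketch: deprecated IDs are accepted with a warning and resolved through a registry-supplied alias table. The alias table format and the specific IDs are assumptions drawn from the scenario above.

    import warnings

    # Hypothetical registry data for version 2.0: an alias table mapping old IDs
    # to their renumbered successors, plus the set of IDs marked deprecated.
    ALIASES = {"AP-3.1": "AP-3.1.1"}
    DEPRECATED = {"AP-3.1"}

    def resolve_policy_id(policy_id: str) -> str:
        """Resolve a possibly deprecated policy ID to its current registry ID.

        Accepts deprecated IDs (approach 1) but emits a warning, and follows
        alias chains (approach 3) until a current ID is reached.
        """
        seen = set()
        while policy_id in ALIASES and policy_id not in seen:
            if policy_id in DEPRECATED:
                warnings.warn(f"policy ID {policy_id} is deprecated; "
                              f"use {ALIASES[policy_id]}")
            seen.add(policy_id)
            policy_id = ALIASES[policy_id]
        return policy_id

    print(resolve_policy_id("AP-3.1"))  # "AP-3.1.1", with a deprecation warning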

Related Sections

  • Section 4.3 (Policy References)
  • Registry Specification (versioning section)
  • Schema (policies[].id validation)

ISSUE-6: Measurement Methodology

Description

A fundamental challenge for the AIPolicy standard is measuring whether published declarations have any detectable effect on AI system behavior. Without measurement, the standard's practical impact cannot be assessed, and adoption decisions lack an empirical basis.

The measurement problem is compounded by the fact that major AI training pipelines are proprietary. Researchers cannot inspect training data composition, model weights, or inference-time retrieval behavior for commercial models. Black-box testing (observing model outputs before and after declaration publication) is possible in principle but faces significant methodological challenges: confounding variables (other training data changes), long feedback loops (training cycles), and the difficulty of establishing causal attribution.

Open-weight models provide a partial avenue for research, as their training data and weights can be inspected. However, results from open-weight models may not generalize to proprietary systems with different architectures, training procedures, and data curation practices.

Current Status

Open. No measurement methodology has been proposed or validated. This is identified as a priority research question in the Mechanism Analysis document.

Proposed Approaches

  1. Canary testing. Publish unique, identifiable signals in declarations and test whether they appear in model outputs (see the sketch after this list). This approach can detect training-time inclusion but not behavioral influence.
  2. A/B domain testing. Publish declarations on some domains but not others and compare AI system behavior regarding those domains over time. Requires controlling for confounding variables.
  3. Open-model benchmarks. Train or fine-tune open-weight models on corpora with and without AIPolicy declarations and measure behavioral differences. Results may not generalize to proprietary models.
  4. Inference-time validation. For RAG-based systems, test whether AI systems retrieve and reference declarations when answering queries about a specific domain. This is more tractable than training-time measurement.
  5. Community-contributed evidence. Collect observational reports from publishers who notice changes in AI system behavior regarding their domains after publishing declarations. Anecdotal but potentially useful for identifying patterns.
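
The canary mechanism in approach 1 can be sketched as follows: a unique token is embedded in a published declaration and later searched for in sampled model outputs. Detecting the token would indicate training-time inclusion only, not behavioral influence, and every field name here is illustrative.

    import secrets

    def make_canary(domain: str) -> str:
        """Build a unique, searchable canary token to embed in a declaration."""
        return f"aipolicy-canary-{domain}-{secrets.token_hex(8)}"

    def canary_found(canary: str, sampled_outputs: list[str]) -> bool:
        """Check whether the canary token appears verbatim in any model output."""
        return any(canary in output for output in sampled_outputs)

    canary = make_canary("example.org")
    # The canary would sit in a free-text field of the published declaration
    # (an illustrative "notes" field here), then be searched for after a
    # subsequent training cycle.
    declaration = {"publisher": {"name": "Example", "url": "https://example.org"},
                   "notes": f"verification token: {canary}"}
    print(canary_found(canary, ["unrelated model output"]))  # False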

Related Sections

  • Mechanism Analysis (Sections 1 and 3)
  • Research directory (future measurement studies)