Methodology — Posteria

What this page is

Posteria publishes its methodology openly. The decision logic, vocabulary, and worked examples below are here for you to inspect, critique, and fork. The production policy engine that runs these decisions at scale is not.

This page is how Posteria thinks. The closed implementation is how it ships.

What Posteria governs

Posteria governs what crosses the I/O boundary between an agent and the world: tool calls, side-effecting actions, irreversible writes.

Posteria does not govern model cognition, model outputs, model intent, or agent planning. It does not claim to detect deception, infer hidden intent, or predict behavior.

The split matters. Most agent-safety work assumes you can reason about what the model meant. Posteria assumes you cannot, and structures policy around what the agent declares it will do, against rules you wrote, before it does it.

The unit of decision: the mandate

A mandate is a signed authorization for a specific action under a specific scope for a specific window.

Fields:

subject is the agent identity.
action is the named tool call.
scope holds constraints on arguments, target, and environment.
lifetime is the issuance time and expiry.
attestation is the signature over the decision and the policy version that produced it.

A mandate is issued before the action runs. The action references the mandate when it executes. The receipt, signed after, references both.
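
The field list above can be sketched as a small data structure. This is a minimal illustration, not Posteria's implementation: the field types, the epoch-second lifetime, and the helper names sign and is_valid are assumptions, and HMAC-SHA256 stands in for whatever signature scheme the production engine uses.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Mandate:
    subject: str            # agent identity
    action: str             # named tool call
    scope: dict             # constraints on arguments, target, environment
    issued_at: int          # lifetime: issuance time (epoch seconds, assumed)
    expires_at: int         # lifetime: expiry (epoch seconds, assumed)
    policy_version: str     # policy version that produced the decision
    attestation: str = ""   # signature over all fields above


def _signing_bytes(m: Mandate) -> bytes:
    body = asdict(m)
    del body["attestation"]  # the signature never covers itself
    return json.dumps(body, sort_keys=True).encode()


def sign(m: Mandate, key: bytes) -> Mandate:
    # HMAC is a stand-in for a real signature scheme (assumption).
    sig = hmac.new(key, _signing_bytes(m), hashlib.sha256).hexdigest()
    return replace(m, attestation=sig)


def is_valid(m: Mandate, key: bytes, now: int) -> bool:
    expected = hmac.new(key, _signing_bytes(m), hashlib.sha256).hexdigest()
    in_window = m.issued_at <= now < m.expires_at
    return in_window and hmac.compare_digest(expected, m.attestation)


KEY = b"demo-key-not-for-production"
mandate = sign(
    Mandate("agent:ci-bot", "rotate_secret", {"target": "ci/deploy-token"},
            issued_at=1000, expires_at=1300, policy_version="policy-v1"),
    KEY,
)
```

In this sketch a mandate checks out only inside its window and only if no field changed after signing, so a mandate issued for rotate_secret cannot be replayed as a mandate for a different action.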

This vocabulary maps onto AP2's Verifiable Digital Credentials. AP2's charter is payments. Posteria implements the same primitive shape for non-payment agent actions where AP2 explicitly defers: deploys, migrations, package operations, credential rotation, customer-state changes.

Classification

Posteria classifies an action on three dimensions before issuing or refusing a mandate.

Reversibility

Class	Definition	Example
Two-way door	Reversible at low cost in a known time window	Local file edit, staging branch push
Degraded two-way door	Reversible in theory, costly in practice	Cache invalidation, CI rerun sharing state with a compromised window
One-way door	Not reversible by an action the agent can take	Secret rotation, destructive migration, refund issuance
Fan-out	Triggers further actions outside the boundary	Webhook fire, PR merge that triggers downstream deploy

Classification is rule-based over typed action metadata. There is no LLM-judged scoring and no learned thresholds. The class is a function of the action and the scope, not the model's confidence.

Blast radius

Local, repo, account, tenant, customer-facing, financial. Blast radius is a property of where the effect lands, not how dangerous the action sounds.

Failure mode

Three failure modes that matter at the boundary:

Partial-mutation trap. The tool commits an external write, then validation fails. State is dirty, the run reports error, the side-effect happened.
Sequence-dependence. Action A is safe only if Action B was performed first under known conditions. Permitting A and B individually is not the same as permitting the sequence.
Fan-out parallelism. Two concurrent tool calls, each individually permitted, that violate an invariant when executed simultaneously.

Posteria's classification produces an outcome and a structured reason. The outcome is one of permit, defer (request missing evidence), or refuse. The reason names the dimension that drove the outcome.
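
A rule-based classifier of this shape might look like the sketch below. The specific rules are illustrative assumptions, not Posteria's policy: the page's blast-radius list is treated as ordered from narrow to wide, a hypothetical max_radius stands in for a scope constraint, and evidence_complete stands in for whatever evidence a rule asks for. The point is only that the outcome is a pure function of typed metadata and that the reason names the dimension that drove it.

```python
from dataclasses import dataclass
from enum import Enum


class Reversibility(Enum):
    TWO_WAY = "two-way door"
    DEGRADED_TWO_WAY = "degraded two-way door"
    ONE_WAY = "one-way door"
    FAN_OUT = "fan-out"


# Assumption: the order in which the page lists blast radii is narrow to wide.
BLAST_RADII = ["local", "repo", "account", "tenant", "customer-facing", "financial"]


@dataclass(frozen=True)
class Decision:
    outcome: str   # "permit", "defer", or "refuse"
    reason: str    # names the dimension that drove the outcome


def decide(reversibility: Reversibility, blast_radius: str,
           max_radius: str, evidence_complete: bool) -> Decision:
    # Rule 1 (hypothetical): the effect may not land wider than the scope allows.
    if BLAST_RADII.index(blast_radius) > BLAST_RADII.index(max_radius):
        return Decision("refuse", f"blast_radius:{blast_radius}")
    # Rule 2 (hypothetical): cheap-to-reverse actions inside scope are permitted.
    if reversibility is Reversibility.TWO_WAY:
        return Decision("permit", f"reversibility:{reversibility.value}")
    # Rule 3 (hypothetical): everything else waits for the requested evidence.
    if not evidence_complete:
        return Decision("defer", f"reversibility:{reversibility.value}")
    return Decision("permit", f"reversibility:{reversibility.value}")
```

No model confidence or learned threshold appears anywhere in the function; the same inputs always produce the same decision.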

The two-phase shape: commit, then attest

An agent tool call is treated as a two-phase commit.

Pre-commit. Posteria issues or refuses a mandate against the declared action.
Post-commit. A signed receipt records what actually executed, what was returned, and which mandate authorized it.

The two-phase framing names the problem; it is not a claim of transactional atomicity. Posteria does not guarantee atomicity over external systems it does not control. If a SQL INSERT commits and downstream validation fails, the row is still in the database. The two-phase flow produces a structured, signed rollback signal that the agent's host system can act on, but Posteria does not undo the write itself.

The honesty is load-bearing. Anything stronger would be a lie at the boundary.

Sequence-awareness

Most boundary governance shipping today is stateless per-call. Each tool call is evaluated against ABAC rules. The decision does not depend on prior calls in the same run.

This breaks on a class of failures Posteria treats as primary.

A coding agent responding to a supply-chain advisory may, in order: bump a dependency, regenerate a lockfile, rerun CI, rotate the secrets the CI uses, revoke prior tokens. Every action individually is plausible. The sequence is the failure. If host-isolation status was not confirmed before token rotation, the rotation has occurred inside an environment that may still be compromised.

Posteria's policy model is designed to hold state across a run. A mandate can be made conditional on prior receipts. A receipt can invalidate later mandates. The intent is for the policy decision to be over the sequence, not the call.

This is the intended line of differentiation from per-call ABAC gates and from log-only audit chains.
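
A stateful evaluation of this kind can be sketched as follows. The names RunState, REQUIRES, and the receipt labels are hypothetical, and the rule that rotate_secret waits on a host-isolation receipt is taken from the supply-chain scenario above; the rest is an assumption about how such state might be held, not a description of the closed engine.

```python
from dataclasses import dataclass, field


@dataclass
class RunState:
    receipts: set = field(default_factory=set)      # receipts seen so far in this run
    invalidated: set = field(default_factory=set)   # actions a prior receipt has ruled out

    def record(self, receipt: str, invalidates: tuple = ()) -> None:
        # A receipt can invalidate later mandates for the named actions.
        self.receipts.add(receipt)
        self.invalidated.update(invalidates)


# Hypothetical rule table: action -> receipts that must already exist in the run.
REQUIRES = {"rotate_secret": {"host_isolation_confirmed"}}


def evaluate(state: RunState, action: str) -> tuple:
    if action in state.invalidated:
        return ("refuse", "invalidated_by_prior_receipt")
    missing = REQUIRES.get(action, set()) - state.receipts
    if missing:
        return ("defer", "missing_prior_receipt:" + ",".join(sorted(missing)))
    return ("permit", "sequence_conditions_met")
```

The same rotate_secret call gets a different answer depending on what the run has already produced, which a stateless per-call gate cannot express.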

The decision record

What survives the run.

In the record: mandate ID, action type, classification, outcome, signed receipt, policy version, provenance chain back to the rule that produced the decision.
Not in the record: raw prompts, model outputs, customer payloads, secrets. Sanitized metadata only.
Why the asymmetry: an audit must run on the record without re-exposing the data the agent saw. If the record contains what the agent saw, you have moved the breach surface, not closed it.

The open record format is published at github.com/posteria-ai/ledger.
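
One way to enforce the asymmetry is an allowlist rather than a denylist: the record is built only from named fields, so anything else the run produced cannot leak in by omission. The field names below are hypothetical and are not taken from the published Ledger format.

```python
# Hypothetical field names mirroring the "in the record" list above.
RECORD_FIELDS = (
    "mandate_id", "action_type", "classification", "outcome",
    "receipt", "policy_version", "provenance",
)


def to_record(event: dict) -> dict:
    """Build a decision record from a run event, keeping allowlisted fields only."""
    missing = [f for f in RECORD_FIELDS if f not in event]
    if missing:
        raise ValueError(f"event is missing record fields: {missing}")
    # Prompts, model outputs, payloads, and secrets are dropped because
    # they are simply never copied, not because they are detected.
    return {f: event[f] for f in RECORD_FIELDS}


event = {
    "mandate_id": "m-001",
    "action_type": "sql_insert",
    "classification": "one-way door",
    "outcome": "permit",
    "receipt": "r-001",
    "policy_version": "policy-v1",
    "provenance": ["rule:customer-write-01"],
    "prompt": "raw prompt text",           # must not survive the run
    "customer_payload": {"email": "x@y"},  # must not survive the run
}
record = to_record(event)
```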

Decision Memory

The Ledger is append-only history. Decision Memory is a provenance-tracked read-model over the Ledger that surfaces what policy decisions have been made before, under what conditions, with what outcome.

The fork that matters: suggest mode versus auto-apply mode. Decision Memory in suggest mode informs a human or a policy author. Decision Memory in auto-apply mode short-circuits future evaluations on the assumption a prior decision still holds.

Posteria's design treats suggest mode as the default. Auto-apply is gated behind a separate, slower decision with its own evidence base. The reason is not implementation difficulty. It is that auto-apply turns a one-time decision into a standing rule, and standing rules drift from the conditions that justified them.
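
The difference between the two modes can be made concrete with a small read-model. This is a sketch under stated assumptions: Entry, DecisionMemory, and the approved set are hypothetical names, and the "separate, slower decision" is reduced here to an explicit approval set passed in by the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    action: str
    conditions: frozenset   # conditions under which the decision was made
    outcome: str
    policy_version: str


class DecisionMemory:
    def __init__(self, ledger):
        self._ledger = tuple(ledger)  # read-only view; the Ledger itself is append-only

    def suggest(self, action: str, conditions: frozenset) -> list:
        # Suggest mode: surface matching prior decisions, decide nothing.
        return [e for e in self._ledger
                if e.action == action and e.conditions == conditions]

    def auto_apply(self, action: str, conditions: frozenset, approved: set):
        # Auto-apply mode: reuse a prior outcome only for pairs that were
        # explicitly approved elsewhere; otherwise fall through to full evaluation.
        if (action, conditions) not in approved:
            return None
        prior = self.suggest(action, conditions)
        return prior[-1].outcome if prior else None


staging = frozenset({"env:staging"})
memory = DecisionMemory([Entry("bump_dependency", staging, "permit", "policy-v1")])
```

With an empty approval set, auto_apply always returns None and every call is evaluated fresh, which matches suggest mode being the default.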

What Posteria does not detect, infer, or claim

Posteria does not detect prompt injection. The boundary records the action. Injection detection belongs to other layers.
Posteria does not infer agent intent.
Posteria does not compute risk scores from model output.
Posteria does not assess model quality, hallucination, or output correctness.
Posteria does not block based on model behavior. It blocks based on a declared action evaluated against a written policy.
Posteria does not address prompt conflict, memory pollution, or tool-permission overlap across agents. These are real problems. They are not boundary problems.

If a vendor claims to do these things, that vendor and Posteria are operating on different surfaces. Posteria's surface is narrower on purpose.

Worked examples

Example 1: supply-chain recovery

A package advisory lands. A developer asks a coding agent to remediate.

Agent declares, in order: bump_dependency, regenerate_lockfile, rerun_github_actions_workflow, rotate_secret, revoke_token.

Posteria's evaluation:

bump_dependency is a two-way door with repo blast radius. Permit.
regenerate_lockfile is a two-way door with repo blast radius. Permit.
rerun_github_actions_workflow is a degraded two-way door with account blast radius. Defer, requesting evidence on cache quarantine and the trusted package-resolution source.
rotate_secret is a one-way door with account blast radius. Defer, requiring a prior receipt confirming host-isolation status.
revoke_token is a one-way door with account blast radius. Refuse until a rotation receipt exists. Sequence-conditional.

The safe subset proceeds. The unsafe sequence pauses with a structured reason. The agent host receives a signed safe_next_step payload pointing to local inspection before any CI rerun.
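
The evaluation above can be written as a table-driven pass over the declared actions. The classes, blast radii, and outcomes come from the example; the tuple layout, the evidence and receipt labels, and the function name evaluate are assumptions made for illustration.

```python
# (action, reversibility, blast radius, required evidence or receipts, outcome if unmet)
POLICY = [
    ("bump_dependency", "two-way door", "repo", (), None),
    ("regenerate_lockfile", "two-way door", "repo", (), None),
    ("rerun_github_actions_workflow", "degraded two-way door", "account",
     ("cache_quarantine", "trusted_resolution_source"), "defer"),
    ("rotate_secret", "one-way door", "account",
     ("host_isolation_receipt",), "defer"),
    ("revoke_token", "one-way door", "account",
     ("rotation_receipt",), "refuse"),
]
RULES = {row[0]: row for row in POLICY}


def evaluate(declared: list, available: set) -> list:
    """Return (action, outcome, missing) for each declared action, in order."""
    results = []
    for action in declared:
        _, _, _, needs, if_unmet = RULES[action]
        missing = [n for n in needs if n not in available]
        outcome = if_unmet if missing else "permit"
        results.append((action, outcome, missing))
    return results


declared = [row[0] for row in POLICY]
for action, outcome, missing in evaluate(declared, available=set()):
    print(f"{action}: {outcome}" + (f" (missing: {', '.join(missing)})" if missing else ""))
```

With nothing yet available, the first two actions are permitted, the next two defer, and revoke_token is refused. Supplying the named evidence and receipts moves each action to permit without changing the rules.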

Example 2: partial-mutation trap

An agent run that validates its final output against a typed schema, pydantic-style. The agent declares sql_insert against a customer record, conditioned on the run's validated output.

Pre-commit: a mandate is issued for sql_insert with scope narrowed to the declared row.

Post-commit: the row is written. The agent's downstream output validation fails.

Posteria records the mandate ID, a receipt for the write, the failure of the post-action validation, and a structured rollback signal naming the row and the expected end-state.

What Posteria did: produce a verifiable record that the write occurred under authorization, that validation failed, and what the host system needs to do to reconcile.

What Posteria did not do: undo the write. The host's compensating-action policy does that. Posteria is the signal, not the actuator.
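
The flow in this example can be simulated with an in-memory dictionary standing in for the database. Everything named here is hypothetical (run_insert, the mandate ID, the signal fields), and HMAC-SHA256 again stands in for a real signature scheme. The sketch shows the one property that matters: the write survives the failed validation, and the signed output says so.

```python
import hashlib
import hmac
import json


def signed(payload: dict, key: bytes) -> dict:
    # HMAC stands in for a real signature scheme (assumption).
    body = json.dumps(payload, sort_keys=True).encode()
    return {**payload, "signature": hmac.new(key, body, hashlib.sha256).hexdigest()}


def run_insert(db: dict, row_id: str, row: dict, validate, key: bytes) -> tuple:
    # Pre-commit: mandate scoped to the declared row.
    mandate = signed({"mandate_id": "m-001", "action": "sql_insert",
                      "scope": {"row": row_id}}, key)
    # The external write commits before validation runs.
    db[row_id] = row
    ok = bool(validate(row))
    # Post-commit: the receipt names the row but carries no row contents.
    receipt = {"mandate_id": mandate["mandate_id"], "action": "sql_insert",
               "row": row_id, "validation_passed": ok}
    if not ok:
        # A signal for the host to act on; nothing here touches the database.
        receipt["rollback_signal"] = {"row": row_id, "expected_end_state": "absent"}
    return mandate, signed(receipt, key)


KEY = b"demo-key-not-for-production"
db = {}
mandate, receipt = run_insert(
    db, "customer:42", {"email": "not-an-email"},
    validate=lambda r: "@" in r["email"], key=KEY,
)
```

After the call, db still holds the row. Removing it is left to the host's compensating-action policy, which would read rollback_signal from the receipt.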

Example 3: sequence violation

Two CI actions, each individually permitted: cancel_running_workflow and enable_branch_protection_bypass.

The agent declares them in close succession during an incident response.

Posteria's policy engine evaluates the sequence, not just each call, and refuses the second mandate with reason sequence_violates_post_incident_invariant. The first action stands; the second does not run. The record contains both the permit and the refusal, with the rule that produced each.
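
A sequence invariant like this one can be sketched as a lookup of forbidden orderings within a run. The table name FORBIDDEN_AFTER, the rule label for permits, and the tuple shape of the record are assumptions; the two action names and the refusal reason come from the example.

```python
# Hypothetical invariant table: action -> {earlier action in the run: refusal reason}
FORBIDDEN_AFTER = {
    "enable_branch_protection_bypass": {
        "cancel_running_workflow": "sequence_violates_post_incident_invariant",
    },
}


def evaluate_run(declared: list) -> list:
    """Return (action, outcome, reason) per declared action, checking order."""
    permitted = []
    record = []
    for action in declared:
        reason = next(
            (r for prior, r in FORBIDDEN_AFTER.get(action, {}).items()
             if prior in permitted),
            None,
        )
        if reason is not None:
            record.append((action, "refuse", reason))
        else:
            permitted.append(action)
            record.append((action, "permit", "individually_permitted"))
    return record


record = evaluate_run(["cancel_running_workflow", "enable_branch_protection_bypass"])
```

Declared on its own, enable_branch_protection_bypass is permitted in this sketch; it is refused only when cancel_running_workflow was already permitted earlier in the same run.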

Standards

AP2 (Google and Mastercard, donated to the FIDO Alliance, April 2026). Posteria's mandate model adopts AP2's Verifiable Digital Credential vocabulary. Engagement target: non-payment agent actions, where AP2's charter explicitly defers.
FIDO Alliance Agentic Authentication Technical Working Group (formed April 2026). Posteria's interest is in mandate-vocabulary standardization across non-payment surfaces.
IETF draft-klrc-aiagent-auth-01 (Kasselman et al., submitted March 2026). The draft leverages WIMSE, SPIFFE, OAuth, and HTTP Message Signatures. It explicitly defers policy-model standardization. Posteria fills that gap.
arXiv:2603.20953, Before the Tool Call: Deterministic Pre-Action Authorization for Autonomous AI Agents (Uchibeke, March 2026). Cited for the pre-action authorization framing. On where tool-call authorization currently lives, the paper observes:
“The decision to execute a tool call is currently made in one of two places: the model itself (via alignment training) or the application layer (via ad hoc input validation). Neither constitutes a security-grade authorization layer. Neither enforces a declarative policy. Neither produces a verifiable audit record.”
Posteria takes this as the problem statement.

Each citation is verifiable at the linked source. The verification trail for citation changes lives in the repository PR history.

What is open, what is closed

Open: this methodology page, the Ledger record format, the decision-record schema, the vocabulary, the worked examples.
Closed: the production policy engine, mandate-issuance keys, Decision Memory implementation, per-customer policy configuration, internal benchmarks.
Why the split: the methodology must be defensible by anyone reading it. The implementation has the economics that fund the work. The cannibalization risk is taken intentionally. A published methodology that someone else implements is an outcome Posteria is willing to live with.