Agent Experience First Design (AXFD): Designing Systems for a Non-Human User

Agent experience first design illustration — Source: theverge.com

For the last two decades, “user experience” meant human experience: interfaces, flows, copy, and conversion.

But in a growing class of products, the primary operator is no longer a person. It’s an agent: a loop that reads state, plans, calls tools, evaluates results, retries, and keeps going. Humans still set goals and approve outcomes, but the day-to-day “clicking” is done by a probabilistic caller.

This post proposes a framing — and a name — for the design discipline that follows:

Agent Experience First Design (AXFD): a product and systems design approach that treats the agent as a first-class user, designing around how it perceives state, takes actions, incorporates memory, learns from feedback, and recovers from failure.

The punchline is simple:

In agentic systems, reliability is mostly interface design plus recovery engineering.

The mental model: your new user is non-deterministic

Traditional software assumes a user who:

understands implied context,
can tolerate ambiguity,
can improvise around unclear errors,
can stop and ask for help.

Agents do none of that reliably. They are:

highly capable but highly literal,
prone to overconfident mistakes,
eager to retry,
willing to loop forever.

If you design your system like it’s being used by a careful human, an agent will eventually turn your UX into an outage.

So AXFD starts from a different premise:

Assume the operator will misunderstand you. Assume it will repeat itself. Assume it will run at scale.

Why AXFD is not just “API design” or “prompting”

AXFD sits in a real gap between disciplines:

It’s not prompt engineering

Prompting improves “thinking inside the model.” AXFD improves the world outside the model: tool contracts, state surfaces, error semantics, checkpoints, and observability.

It’s not traditional API-first

API-first is “design the interface before the UI.” AXFD is “design the interface for a caller that can hallucinate.”

That difference matters: API contracts designed for deterministic clients often collapse under non-deterministic behavior (retries, partial success, ambiguous state, untyped errors).

It’s not UX (but it changes UX)

You still need human UX — approvals, review, trust, affordances. But AXFD recognizes a second surface:

Human surface: UI, explanations, controls, approvals
Agent surface: schemas, typed failures, explicit state, checkpoints, traces

Modern products increasingly win or lose on the second surface.

Why now: two shifts made AXFD practical (not philosophical)

1) Outputs can finally behave like interfaces

OpenAI’s Structured Outputs were introduced specifically to ensure model outputs “exactly match JSON Schemas provided by developers.” (OpenAI) This is a crucial shift: “valid JSON” is not enough; agents need schema conformance.

Similarly, OpenAI’s function/tool calling is designed around tools “defined by a JSON schema.” (OpenAI) This means contracts can be enforced at the boundary, not wished into existence with prompt wording.
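To make the gap between "valid JSON" and schema conformance concrete, here is a minimal sketch in plain Python. The tool name (create_ticket), its fields, and the tiny hand-rolled checker are all hypothetical; a real system would more likely rely on a full JSON Schema validator or on the provider's own enforcement.

```python
import json

# Hypothetical contract for a "create_ticket" tool call: field name -> required type.
CREATE_TICKET_CONTRACT = {"title": str, "priority": int, "assignee": str}


def conformance_errors(raw: str, contract: dict) -> list:
    """Return a list of contract violations; an empty list means the payload conforms."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["top-level value must be an object"]

    errors = []
    for name, expected in contract.items():
        if name not in payload:
            errors.append(f"missing field: {name}")
        elif type(payload[name]) is not expected:
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    for name in payload:
        if name not in contract:
            errors.append(f"unexpected field: {name}")
    return errors


# Well-formed JSON, but the wrong shape: priority is a string and assignee is missing.
loose = '{"title": "Fix login", "priority": "high"}'
strict = '{"title": "Fix login", "priority": 2, "assignee": "agent-7"}'

loose_errors = conformance_errors(loose, CREATE_TICKET_CONTRACT)
strict_errors = conformance_errors(strict, CREATE_TICKET_CONTRACT)
```

Both payloads parse as JSON, but only the second one is something a downstream tool can act on without guessing.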

2) Tool ecosystems are being standardized

MCP (Model Context Protocol) explicitly positions itself as a standardized way to connect AI applications to external systems — “like USB-C.” (Model Context Protocol) And the Linux Foundation’s Agentic AI Foundation (AAIF) launched with founding contributions including MCP and OpenAI’s AGENTS.md, signaling that agent interoperability is becoming infrastructure. (linuxfoundation.org)

When connectivity becomes standardized, differentiation shifts from “can you connect tools?” to “can your agent operate safely and reliably?”

That’s AXFD.

Positioning: where AXFD comes from (and why it’s a new center)

AXFD is new as a name, but it’s a convergence of older ideas — recentered around a new “user.”

Unix philosophy: programs are users too

A common summary attributed to McIlroy is: write programs to work together, and handle text streams as a universal interface. (cscie2x.dce.harvard.edu) AXFD inherits the same spirit: predictable outputs, composability, and explicit failure semantics.

DevEx: experience drives throughput

DevEx research and practice often distill productivity into feedback loops, cognitive load, and flow state. (queue.acm.org) AXFD is DevEx for a different kind of developer: an agent that must maintain “flow” across tool calls, keep cognitive load low via legible state, and rely on tight feedback loops from typed errors and fast verification.

RL: the environment is half the algorithm

OpenAI Gym’s framing is straightforward: a “standard API to communicate between learning algorithms and environments.” (GitHub) AXFD applies the same lesson to LLM agents: if you improve the environment (tools, state, feedback), the same agent becomes dramatically more capable.

GitOps/ChatOps: long-running automation needs recovery + observability

The operational world already learned that automation must be observable, declarative, and recoverable. AAIF/MCP suggest agent systems are going the same way — into standardization and ops-grade expectations. (linuxfoundation.org)

AXFD is what you get when those traditions collide with non-deterministic tool callers.

A practical definition: the Agent Experience Stack

AXFD designs for five “experiences” that decide whether agents thrive or spiral.

1) Perception: can the agent see the world clearly?

Agents need state that is:

machine-readable (not vibes),
stable in format,
referenceable (IDs, versions, hashes),
diffable (artifacts, patches).

If the system’s truth is buried in prose, the agent’s plan becomes fiction.
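As one possible shape for such a state surface, the sketch below models a snapshot that carries a version, a content hash, and a key-level diff. The Snapshot class, the diff helper, and the order fields are illustrative assumptions, not a prescribed format.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """A hypothetical state surface: versioned, content-addressed, and diffable."""
    version: int
    data: dict

    @property
    def digest(self) -> str:
        # Canonical JSON (sorted keys) so the same state always hashes the same way.
        canonical = json.dumps(self.data, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff(old: Snapshot, new: Snapshot) -> dict:
    """Return top-level keys that were added, removed, or changed between snapshots."""
    added = {k: new.data[k] for k in new.data.keys() - old.data.keys()}
    removed = sorted(old.data.keys() - new.data.keys())
    changed = {
        k: {"from": old.data[k], "to": new.data[k]}
        for k in old.data.keys() & new.data.keys()
        if old.data[k] != new.data[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


before = Snapshot(version=1, data={"order_id": "ord_123", "status": "pending", "items": 2})
after = Snapshot(
    version=2,
    data={"order_id": "ord_123", "status": "paid", "items": 2, "receipt": "rcp_9"},
)

delta = diff(before, after)
```

An agent given delta plus the two digests can reason about exactly what changed, rather than re-reading a prose description of the order.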

2) Action: can the agent act safely and predictably?

Assume duplicates and retries.

HTTP gives a crisp anchor: a method is idempotent if multiple identical requests have the same intended effect as one request. (RFC 9110) AXFD generalizes this idea to all agent actions:

idempotency keys,
transactional boundaries,
compensating actions,
“plan then execute.”
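A minimal sketch of the first item, idempotency keys, might look like the following. PaymentTool, its charge method, and the key format are hypothetical names chosen for illustration; the point is only that a repeated key returns the stored result instead of repeating the side effect.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentTool:
    """Hypothetical side-effectful tool that deduplicates calls by idempotency key."""
    charges: list = field(default_factory=list)    # the real side effects
    _results: dict = field(default_factory=dict)   # idempotency key -> stored result

    def charge(self, idempotency_key: str, customer: str, cents: int) -> dict:
        # A repeated key returns the stored result instead of charging again.
        if idempotency_key in self._results:
            return {**self._results[idempotency_key], "replayed": True}

        self.charges.append((customer, cents))
        result = {"charge_id": f"ch_{len(self.charges)}", "cents": cents}
        self._results[idempotency_key] = result
        return {**result, "replayed": False}


tool = PaymentTool()
first = tool.charge("run-42/step-3", customer="cus_1", cents=500)
# The agent timed out waiting for the response and retried the same step.
retry = tool.charge("run-42/step-3", customer="cus_1", cents=500)
```

This sketch keeps results in memory and does not check that a reused key arrives with the same arguments; a production version would presumably persist the keys and reject mismatched replays.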

3) Memory: can the agent remember without poisoning itself?

AXFD treats memory like an OS problem:

short-term working set vs long-term retrieval,
provenance and citations,
expiration and invalidation,
compression and checkpoints.

A “chat log” is not memory. It’s entropy.
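To illustrate provenance, expiration, and invalidation together, here is a small sketch of a memory store. MemoryItem, MemoryStore, and the call IDs are assumptions made up for the example, and the clock is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MemoryItem:
    """Hypothetical long-term memory record with provenance and a time-to-live."""
    key: str
    value: str
    source: str        # where the fact came from, e.g. a tool call ID
    stored_at: float   # seconds on whatever clock the caller uses
    ttl: float         # seconds until the fact is considered stale


class MemoryStore:
    def __init__(self) -> None:
        self._items = {}

    def remember(self, item: MemoryItem) -> None:
        self._items[item.key] = item

    def recall(self, key: str, now: float) -> Optional[MemoryItem]:
        """Return the item only if it is still fresh; drop it once expired."""
        item = self._items.get(key)
        if item is None:
            return None
        if now - item.stored_at >= item.ttl:
            del self._items[key]
            return None
        return item

    def invalidate(self, source: str) -> int:
        """Remove everything derived from a source that turned out to be wrong."""
        stale = [k for k, item in self._items.items() if item.source == source]
        for k in stale:
            del self._items[k]
        return len(stale)


store = MemoryStore()
store.remember(MemoryItem("deploy_target", "staging", source="call_17", stored_at=0.0, ttl=60.0))
store.remember(MemoryItem("db_version", "15", source="call_18", stored_at=0.0, ttl=600.0))

fresh = store.recall("deploy_target", now=30.0)    # still within its TTL
expired = store.recall("deploy_target", now=90.0)  # past its TTL, so it is dropped
dropped = store.invalidate("call_18")              # call_18 was later found to be wrong
```

Every recalled fact carries its source, so the agent can cite it, and a bad source can be purged in one call instead of lingering in a transcript.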

4) Feedback: can the agent tell what happened and what to do next?

Human UX can say “Something went wrong.” Agent UX cannot.

Typed failures should encode:

error_type,
retryable,
hint (next best action),
partial_results (what is safe to reuse).
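The four fields above can be sketched as a small data type plus an agent-side policy that decides the next move from the failure alone. The ToolFailure class, the next_action function, the three outcome labels, and the sample errors are hypothetical; only the field names come from the list above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolFailure:
    """Typed failure carrying the four fields: error_type, retryable, hint, partial_results."""
    error_type: str
    retryable: bool
    hint: str
    partial_results: dict = field(default_factory=dict)


def next_action(failure: ToolFailure, attempts: int, max_attempts: int = 3) -> str:
    """Hypothetical agent-side policy: decide what to do from the failure alone."""
    if failure.retryable and attempts < max_attempts:
        return "retry"
    if failure.partial_results:
        return "resume_from_partial"
    return "escalate"


rate_limited = ToolFailure("rate_limited", retryable=True, hint="wait 30s, then retry")
bad_input = ToolFailure(
    "validation_error",
    retryable=False,
    hint="field 'email' is malformed; fix it and resubmit",
    partial_results={"rows_imported": 120},
)
denied = ToolFailure(
    "permission_denied", retryable=False, hint="request the 'billing:write' scope"
)
```

With this shape, a rate limit leads to a bounded retry, a validation error leads to resuming from the rows already imported, and a permission error goes to a human instead of into a loop.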

5) Recovery: can the agent resume and rewind?

Long-running agents will break mid-flight:

timeouts,
tool outages,
context drift,
partial side effects.

AXFD requires:

checkpoints,
resumable runs,
replayable tool calls,
safe re-entry.
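A rough sketch of checkpoints and safe re-entry, assuming steps are named and their results are JSON-serializable. The function name, the step names, and the checkpoint file layout are invented for the example.

```python
import json
import os
import tempfile


def run_with_checkpoints(steps, checkpoint_path):
    """Run named steps in order, skipping any already recorded in the checkpoint file."""
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)

    for name, fn in steps:
        if name in done:
            continue  # safe re-entry: completed work is not repeated
        done[name] = fn()
        # Persist after every step so a crash loses at most the step in flight.
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)
    return done


calls = []  # records which steps actually executed, across both attempts


def fetch():
    calls.append("fetch")
    return {"rows": 3}


def flaky_transform():
    calls.append("transform")
    if calls.count("transform") == 1:
        raise TimeoutError("tool outage")  # simulated mid-flight failure
    return {"rows": 3, "clean": True}


steps = [("fetch", fetch), ("transform", flaky_transform)]
path = os.path.join(tempfile.mkdtemp(), "run-42.json")

try:
    run_with_checkpoints(steps, path)       # first attempt dies during "transform"
except TimeoutError:
    pass
result = run_with_checkpoints(steps, path)  # resume: "fetch" is skipped
```

The second run repeats only the step that failed. This sketch overwrites the checkpoint file in place; a more careful version would write to a temporary file and rename it so a crash cannot leave a half-written checkpoint.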

AXFD’s core framing: “APIs are the new UI”

Here’s the sharp version:

APIs are the new UI. (Agents “click” endpoints and tools.)
Schemas are the new design system. (Consistency and constraints at scale.)
Errors are the new copywriting. (They shape behavior more than success.)
Checkpoints are the new Undo button. (They turn failure into progress.)

Or even shorter:

Design for the operator that never sleeps and never stops retrying.

The principles (positioning-friendly and actionable)

1) Contract first

Use JSON Schema / OpenAPI-level contracts wherever possible. Structured Outputs exist to guarantee schema conformance, not just well-formed JSON. (OpenAI)

Tool calling should be schema-defined rather than “best effort” parsing. (OpenAI)

2) Machine-readable by default

Human-friendly narratives are optional; structured returns are mandatory.

3) Typed failure is a feature

Vague errors create agent loops. Typed errors create recovery.

4) Idempotent and re-entrant operations

Assume the agent will call it twice. Design it so the second call is safe. (RFC 9110)

5) Checkpoint everything that matters

Make progress resumable; make state auditable.

6) Observability first

If you can’t debug the agent’s run deterministically, you can’t scale it.

7) Budgeted autonomy and least privilege

Agents should earn trust through constraints (scopes, approvals, rate limits, cost ceilings).
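One way to picture these constraints is a per-run budget object that every tool call must pass through. The Budget class, the scope strings, and the limits below are hypothetical; approvals are left out to keep the sketch short.

```python
from dataclasses import dataclass


class PolicyViolation(Exception):
    """Raised when a call falls outside what the agent is allowed to do."""


@dataclass
class Budget:
    """Hypothetical per-run guardrail: allowed scopes, a call cap, and a cost ceiling."""
    scopes: frozenset
    max_calls: int
    max_cost_cents: int
    calls: int = 0
    cost_cents: int = 0

    def authorize(self, scope: str, cost_cents: int) -> None:
        # Check everything before mutating, so a rejected call consumes no budget.
        if scope not in self.scopes:
            raise PolicyViolation(f"scope not granted: {scope}")
        if self.calls + 1 > self.max_calls:
            raise PolicyViolation("call limit reached")
        if self.cost_cents + cost_cents > self.max_cost_cents:
            raise PolicyViolation("cost ceiling reached")
        self.calls += 1
        self.cost_cents += cost_cents


budget = Budget(
    scopes=frozenset({"tickets:read", "tickets:write"}),
    max_calls=3,
    max_cost_cents=100,
)
budget.authorize("tickets:read", cost_cents=40)
budget.authorize("tickets:write", cost_cents=40)

rejections = []
for scope, cost in [("billing:write", 1), ("tickets:read", 40)]:
    try:
        budget.authorize(scope, cost)
    except PolicyViolation as exc:
        rejections.append(str(exc))
```

The first rejected call is outside the granted scopes and the second would exceed the cost ceiling; neither consumes any of the remaining budget.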

8) Standardize connectivity; differentiate on experience

MCP and AAIF signal that interoperability is moving toward standard infrastructure. (Model Context Protocol) Your moat shifts to how safely and reliably agents can operate in your system.

A maturity model for products (useful for positioning)

Level 0 (Vibes): free-form text everywhere, manual recovery.
Level 1 (Parseable): mostly stable output, brittle retries.
Level 2 (Contracted): schemas, validation, typed failures, explicit state surfaces. (OpenAI)
Level 3 (Recoverable): checkpoints, resumability, idempotency keys, replayable runs.
Level 4 (Operable at scale): policy gating, budgets, observability, standardized tool ecosystem (MCP-class). (Model Context Protocol)

AXFD is the discipline of climbing this ladder.

A copy-paste AXFD checklist (ship this with your repo)

Contracts

Every tool has a schema, examples, and strict validation. (OpenAI)
Outputs are versioned and backward compatible.

Failure semantics

Errors include error_type, retryable, hint, and partial_results.

State

State is explicit, snapshot-able, and diffable.

Safety

Side-effectful actions are idempotent or transactional. (RFC 9110)
Least-privilege tool scopes; high-risk actions are gated.

Recovery

Runs are resumable from checkpoints.
Supports “plan then execute.”
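"Plan then execute" can be read as: the agent emits its whole plan as data, the system reviews it, and only then does anything run. The sketch below assumes a hypothetical action registry and approval set, and the execution step is simulated by returning labels rather than touching real files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: str
    target: str


# Hypothetical registry: which actions exist and whether they have side effects.
ACTIONS = {"read_file": False, "write_file": True, "delete_file": True}


def review_plan(plan, approved_side_effects):
    """Validate the whole plan before anything runs; return a list of objections."""
    problems = []
    for i, step in enumerate(plan):
        if step.action not in ACTIONS:
            problems.append(f"step {i}: unknown action {step.action}")
        elif ACTIONS[step.action] and step.action not in approved_side_effects:
            problems.append(f"step {i}: {step.action} needs approval")
    return problems


def execute(plan, approved_side_effects):
    """Execute only if the plan passes review; otherwise nothing is touched."""
    problems = review_plan(plan, approved_side_effects)
    if problems:
        return {"executed": [], "problems": problems}
    # Simulated execution: a real system would dispatch each step to its tool here.
    return {"executed": [f"{s.action}:{s.target}" for s in plan], "problems": []}


plan = [
    Step("read_file", "config.yaml"),
    Step("write_file", "config.yaml"),
    Step("delete_file", "old.yaml"),
]
blocked = execute(plan, approved_side_effects={"write_file"})
allowed = execute(plan[:2], approved_side_effects={"write_file"})
```

Because review happens on the full plan, the unapproved delete in the third step blocks the first two steps as well, so a rejected plan leaves no partial side effects behind.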

Observability

Structured logs + trace IDs for every tool call.
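As a sketch of what that might look like, the wrapper below emits one JSON record per tool call, tagged with a run-level trace ID and a per-call span ID. The traced_call helper, the record fields, and the lookup_user tool are assumptions for illustration, not a specific logging standard.

```python
import json
import uuid


def traced_call(log, trace_id, tool_name, fn, **args):
    """Call a tool and append one structured record per call, success or failure."""
    record = {
        "trace_id": trace_id,
        "span_id": uuid.uuid4().hex,
        "tool": tool_name,
        "args": args,
    }
    try:
        record["result"] = fn(**args)
        record["status"] = "ok"
    except Exception as exc:
        record["status"] = "error"
        record["error_type"] = type(exc).__name__
        raise
    finally:
        # One JSON object per line keeps the log greppable and machine-parseable.
        log.append(json.dumps(record, sort_keys=True))
    return record["result"]


def lookup_user(user_id):
    """Hypothetical tool that knows exactly one user."""
    if user_id != "u_1":
        raise KeyError(user_id)
    return {"user_id": user_id, "plan": "pro"}


log = []
trace_id = uuid.uuid4().hex  # one ID for the whole agent run

traced_call(log, trace_id, "lookup_user", lookup_user, user_id="u_1")
try:
    traced_call(log, trace_id, "lookup_user", lookup_user, user_id="u_404")
except KeyError:
    pass

records = [json.loads(line) for line in log]
```

Failed calls are logged before the exception propagates, so the trace shows what the agent attempted even when the run itself aborts.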

Closing: the next moat after standards

If MCP-style protocols make “connecting tools” cheap and ubiquitous, the next moat is not the connector.

It’s the experience you provide to the operator that will:

misunderstand your system,
retry aggressively,
run at scale,
and keep going anyway.

That operator is the agent.

AXFD is simply the name for designing your system so the agent doesn’t go insane inside it.

[1] https://openai.com/index/introducing-structured-outputs-in-the-api/ Introducing Structured Outputs in the API

[2] https://platform.openai.com/docs/guides/function-calling Function calling | OpenAI API

[3] https://modelcontextprotocol.io/ What is the Model Context Protocol (MCP)? — Model Context …

[4] https://www.linuxfoundation.org/press/linux-foundation-announces-the-formation-of-the-agentic-ai-foundation Linux Foundation Announces the Formation of the Agentic …

[5] https://cscie2x.dce.harvard.edu/hw/ch01s06.html Basics of the Unix Philosophy

[6] https://queue.acm.org/detail.cfm?id=3595878 DevEx: What Actually Drives Productivity

[7] https://github.com/openai/gym openai/gym: A toolkit for developing and comparing …

[8] https://www.rfc-editor.org/rfc/rfc9110.html RFC 9110: HTTP Semantics