# ax - the agent experience layer

> ax is a local taste & telemetry graph for AI coding agents. It ingests
> transcripts from Claude Code, Codex, Pi, OpenCode, and Cursor into a local
> SurrealDB, then turns that history into evidence: which skills earn their
> keep, what work actually cost, where expensive models burn tokens on
> mechanical work, and which guardrails measurably help. Everything runs on
> the user's machine; nothing leaves it unless they share a session.

Site: https://ax.necmttn.com
Repo: https://github.com/Necmttn/ax
Install: `curl -fsSL https://ax.necmttn.com/install | bash`, then `npx skills add Necmttn/ax`

## What users can do

### Recall and reconstruct
- `ax recall <query>` - full-text search across turns, commits, and skills
- `ax sessions here | around <date> | near <sha> | show <id>` - windowed session queries; drill into any session turn by turn
- `ax:extract-workflow` skill - reconstruct the workflow behind a shipped artifact ("what made X work")
- `ax share <session-id>` - publish a sanitized session share via GitHub Gist, viewable in the hosted studio

### Skills intelligence
- `ax skills search | taste | unused | pairs | recovery | stats | weighted` - which installed skills actually get used, and what they pair with
- `ax skills classify / tag / lint / by-role / roles` - role-tag skills via review briefs; usage x role-weight ranking
- `ax skills config` - skill lifecycle (live / orphan / out-of-scope / parked)

### Cost and model routing
- `ax cost models | sessions | split` - spend by model; the split shows main-loop vs subagent spend so users see what their frontier model burns on mechanical work
- `ax dispatches [--candidates]` - subagent dispatches that inherited an expensive model + estimated savings at cheaper-model pricing; routed dispatches whose child continued on a different model (the harness drops the model override on continuations) are flagged with their dropped-leg cost
- `ax dispatches --economy [--days=N]` - spend-mode effectiveness lens: of inherit dispatches matching a route-down class, how many ran cheap (sonnet/haiku) vs expensive (fable/opus)? Shows overspend cost + est savings by routing class + count of route-dispatch Advise hook fires (unlinked - advice→outcome attribution deferred, no clean PreToolUse→spawn join available)
- `ax thinking` - reasoning-spend rollup: per-model Claude thinking volume (thinking-only turns' own output tokens), reasoning-effort distribution (codex turn_context effort + claude settings effortLevel), and codex reasoning tokens as a share of output
- `ax routing tune` - mine the user's own dispatch history for new routing classes; judgment work is never auto-routed (vetted via --emit-brief agent backtest)
- `ax routing compile` - regenerate the routing table that drives the route-dispatch hook and the efficient-dispatch skill, preserving user-mined classes (`ax dispatches compile-routing` is an alias)
- `ax routing show` - effective routing table with class origins
- `efficient-dispatch` skill - orchestration pattern: the main model keeps judgment and review, mechanical dispatches carry an explicit cheaper model
- `route-dispatch` hook - deterministic dispatch-time nudge when a mechanical dispatch forgets an explicit model
- `ax costs summary | for --session/--query/--commit/--branch` - what any session, feature, or branch cost in tokens and USD
- `ax quota [--statusline|--swiftbar|--json]` - live Claude plan usage (5h/7d windows) from the Anthropic usage endpoint, cached locally; one-line statusline output and a SwiftBar menubar plugin ship with it
- `ax dojo agenda [--json] [--spar] [--budget=N] [--until=HH:MM] [--force] [--days=N]` - training agenda for surplus-quota self-improvement: budget envelope (5h/7d window remaining minus reserve, deadline = window reset) + prioritized items (pending verdicts, unfilled briefs, routing backtests, proposal minting, churn experiments, optional sparring). Consumed lap-by-lap by the ax:dojo skill.
- `ax dojo report [--since=<iso>] [--notes-file=<path>] [--json]` - write the morning report for a completed dojo run: budget envelope, per-lap item log, proposals created, verdicts locked, outbox drafts awaiting review.
- `ax dojo draft [--title=<t>] [--kind=bug|improvement] [--body-file=<path>] [--session=<id>]` - stage an upstream finding (ax bug or improvement discovered during training) to `~/.ax/dojo/outbox/<slug>.md`; never publishes - user reviews and publishes via ax-repo skill / gh.
- `ax dojo outbox [--json]` - list staged upstream issue drafts in `~/.ax/dojo/outbox/`.
- `ax dojo spar-plan <sha> [--json]` - capture a landed task's baseline (prompt + cost/turns/churn) and emit a one-delta experiment brief to `~/.ax/dojo/spar/<id>.md`; prints the `git worktree add` command that pins the parent SHA.
- `ax dojo spar-score <id> [--variant-session=<id>] [--json]` - score the agent's variant session against the frozen baseline; auto-discovers the variant session from the worktree cwd or accepts an explicit id; writes the receipt to `~/.ax/dojo/spar/<id>-report.md`. Stamps the variant session `labels=["spar"]` so it is excluded from `ax skills weighted` and `ax thinking` (behavioral analytics); it remains in cost analytics.
- `ax digest [--json|--refresh]` - your ranked digest board (improve proposals, routing savings, repair-loops, quota), written to ~/.ax/digest.json every ingest and pushed into the agent's context at session start by the surface-digest hook so value arrives without being asked for
- `ax usage [--json] [--days=N]` - your ax utilization: total runs, active days, top commands by count, agent-vs-tty split, and the never-used surface (visible commands with zero invocations in the window)

### Profile
- `ax profile show [--window=N] [--no-cost]` - your local agent profile rendered from the graph: usage stats, rig (skills/hooks/rules), and evidence-grounded taste patterns
- `ax profile publish` - share it: consent-gated public gist (updated in place by the watcher) + one-time community registration PR; `ax profile unpublish` reverses it
- `ax wrapped generate / publish` - agent-authored Wrapped recap cards: generate emits a mining brief, publish replaces the card deck the dashboard landing serves

### Self-improvement loop
- `ax improve recommend / accept / lint / show` - evidence-backed proposals (guidance, skills, hooks, automations) derived from the graph; accept emits a task brief, lint reconciles applied guidance
- `ax retro emit / pending / brief` + `ax:retro` skill - guided experiment-loop retrospective: triage proposals, lock verdicts, review hook effectiveness
- `ax:dojo` skill - surplus-quota overnight training loop: budget-bounded self-improvement (verdicts, briefs, routing, proposals, experiments); install via `npx skills add Necmttn/ax`
- `ax hooks init / install / backtest / bench / cases` - author typed TypeScript hooks once, run them on Claude Code + Codex; replay history through a hook before installing it
- `ax hooks bench <file> [--days=N] [--runs=N] [--budget-ms=N] [--json]` - latency ledger: per-fire p50/p95 from real bun spawns, est fires/day from tool_call history, installed-chain budget vs --budget-ms default 250
- `ax hooks latency [--days=N] [--baseline=M] [--json]` - regression lens: compare recent (default 7d) vs baseline (default 21d) fire latency per hook event (hook_name is event-granular, e.g. PreToolUse:Bash) from hook_command_invocation telemetry; flags hook events where p95 regressed (factor 1.5, min 15ms delta, min 20 samples)

### Graph insights
- `ax insights <view>` - 31 read-only graph views
- `ax signals list | show` - relation-signal catalog (fragility cascades, etc.)
- `ax sessions metrics`, `ax evidence weekly` - session health and weekly evidence rollups
- `ax serve` - local dashboard with live ingest, session browser, and health views
- `ax tui` - terminal skills browser

### Setup and ingest
- `ax team sync` - activate the team's committed .ax/ rig (skills + agents) into your runtime, trust-gated (executable hooks gated)
- `ax team trust` - review + install executable team hooks from .ax/hooks/ (sha256 trust-on-change, default-branch-only, fail-safe)
- `ax team experiment <start|list|promote|drop> <kind> <name>` - isolate→iterate→promote loop: start copies/scaffolds an artifact into a gitignored .ax.local/ overlay, list shows active experiments, promote moves the overlay artifact into committed .ax/ + git adds it (open a PR after), drop discards the overlay; ax team sync activates the overlay version for you only (annotated as "(experiment)")
- `ax setup` / `ax install` - first-time install: CLI, local SurrealDB, transcript watcher
- `ax ingest [here] [--since=Nd]` - ingest transcripts, skills, and git history; `here` scopes to the current repo; the watcher does this automatically on new transcripts
- `ax roles` - list role labels used for skill classification
- `ax doctor` - validate the install (db, watcher, skills)

### For agents (MCP)
- `ax mcp` - stdio MCP server exposing read-only graph queries as tools: recall, sessions_around, session_show, skills_weighted, skills_by_role, skills_roles, roles, improve_recommend, improve_show, improve_list, session_metrics, signal_show, cost_models, cost_split, dispatches, dojo_agenda

## Q&A - "how do I..." mapped to the ax answer

**Q: What was I working on last week?**
A: `ax sessions here --days=7` - sessions scoped to the current repo. `ax sessions around 2026-06-01` for a date window anywhere.

**Q: How did we ship feature X? I need to reconstruct what actually happened.**
A: `ax sessions near <sha>` finds the sessions behind a commit; the `ax:extract-workflow` skill narrates the ordered steps that produced the artifact.

**Q: Where did we discuss/decide Y? I remember it but can't find it.**
A: `ax recall "Y"` - full-text search across every turn, commit, and skill, across all five harnesses.

**Q: What did that refactor / branch / PR cost?**
A: `ax costs for --branch <name>` or `--commit <sha>` or `--query "<text>"` - tokens and USD, priced per model with cache buckets.

**Q: Where does my model spend actually go?**
A: `ax cost split --days=7` - main loop vs subagents, by model. Usually reveals the frontier model burning cache reads on mechanical work.

**Q: My frontier-model bill is high. What can run on cheaper models?**
A: `ax dispatches --candidates` - every subagent dispatch that inherited an expensive model while doing mechanical work, with estimated savings repriced from real token buckets.

**Q: How do I make my agent pick cheaper models automatically?**
A: Three layers: the `efficient-dispatch` skill (orchestration pattern), the `route-dispatch` hook (`ax hooks install ~/.ax/hooks/route-dispatch.ts --providers=claude` - warns at dispatch time), and `ax dispatches compile-routing` (regenerates the routing table both read).

**Q: I installed 100 skills. Which ones actually earn their keep?**
A: `ax skills weighted` - usage x role-weight ranking. `ax skills unused` for the dead weight; `ax skills config` to park or retire.

**Q: Which skills work well together?**
A: `ax skills pairs` - co-occurrence across real sessions.

**Q: Did that hook / guardrail / guidance actually help, or does it just feel like it did?**
A: The experiment loop: `ax improve recommend` proposes changes with evidence, accepting stamps the moment, and `ax:retro` walks verdicts - worked, didn't, no-longer-needed. Hook effectiveness has its own signal.

**Q: Can I test a hook against history before trusting it?**
A: `ax hooks backtest <file> --days=14` - replays your real tool-call history through the hook in-process and reports would-block/would-warn rates.

**Q: How do I write one guard that runs on both Claude Code and Codex?**
A: `ax hooks init` scaffolds `~/.ax/hooks` (typed Effect TypeScript, `defineHook`); `ax hooks install <file> --providers=claude,codex` fans it out to both configs.

**Q: Can my agent query its own history mid-session, without me prompting it?**
A: `ax mcp` - a stdio MCP server exposing the read-only graph queries as 16 tools. The agent asks "where is the spend" or "what did we try before" itself.

**Q: How do I show a session to a teammate?**
A: `ax share <session-id>` - sanitized share via GitHub Gist, rendered in the hosted studio viewer.

**Q: I'd rather click than type.**
A: `ax serve` - local dashboard: live ingest, session browser, health views. `ax tui` for the terminal skills browser.

**Q: What should I improve in my setup right now?**
A: Say "let's do an ax retro" - proposal triage + verdict review against your recent work. `ax improve recommend` for the raw shortlist.

**Q: Is my install healthy / why is nothing ingesting?**
A: `ax doctor` - validates DB, watcher, and skills. The watcher ingests new transcripts automatically; `ax ingest here --since=1` forces a scoped pass.

## Q&A - multi-hop graph questions (where the graph beats grep)

These need relations chained across sessions, commits, files, and verdicts -
derived at ingest, queryable in one command.

**Q: Which of my past mistakes are still costing OTHER work?**
A: `ax signals show fragility_cascade` - sessions whose reverted commits' files later forced edits by *other* sessions. Three hops: origin session -> reverted commit -> touched files -> downstream sessions that had to fix them, weighted by distinct downstream fixers.

**Q: Which features needed fixes after shipping?**
A: `ax insights post-feature-fixes` - feature commits joined to the fix commits that followed them. Pair with `ax costs for --commit` to price the cleanup.

**Q: What do I correct my agent about most often?**
A: `ax insights friction` for the correction/interruption hotspots; `ax insights reaction-themes` clusters classified user reactions into recurring themes - the raw material the improve loop mines.

**Q: Where does my agent claim "done" without verifying?**
A: `ax insights verification-gaps` - completion claims with no verification evidence behind them.

**Q: Which sessions burned tokens without producing anything?**
A: `ax insights token-impact` and `ax insights cache-health` - token spend joined to outcome evidence, cache-read ratios flagging context-thrash sessions.

**Q: Which skills rescue sessions that go wrong?**
A: `ax skills recovery` - failure -> recovery arcs: what got invoked after things broke, and whether the session recovered.

**Q: How does ax decide WHAT to propose improving?**
A: The derive pipeline chains hops at ingest: classified reactions + correction contexts + skill candidates + dispatch analytics flow into `proposal` rows with evidence refs. `ax improve recommend` ranks them (confidence x recency x frequency); accepting stamps the moment; `ax:retro` verdicts close the loop. Improvement is graph-derived, not vibes.

**Q: My question isn't on this list - can I ask the graph directly?**
A: Yes. The whole graph is local SurrealDB (`127.0.0.1:8521`, schema in the repo); `ax insights <view>` covers 31 prebuilt traversals, and via `ax mcp` an agent can chain recall -> sessions_around -> session_show hops itself to answer novel multi-hop questions in-context.

## Pages
- [How it works](https://ax.necmttn.com/how-it-works): ingest -> graph -> evidence loop
- [Features](https://ax.necmttn.com/features): feature tour
- [CLI reference](https://ax.necmttn.com/docs/cli-reference): every command
- [Docs](https://ax.necmttn.com/docs): concepts and language
- [Manifesto](https://ax.necmttn.com/manifesto): why the agent experience layer exists
- [Changelog](https://ax.necmttn.com/changelog): release narratives
- [Registry](https://ax.necmttn.com/registry): community registry direction