Anatomy of the harness

Reliability is engineered, part by part

None of this lives inside the model. It’s the system around it, and the sum is an agent you can hand real work to.

eight systems

The harness, taken apart piece by piece

I

Orchestration

conductor · dispatch · executor

Roles decoupled from models. A conductor plans the DAG, deterministic dispatch drives it, executors run leaves in parallel. Detailed below.

II

DAG executor

atomic leaves · CAS · bounded concurrency

Work compiles to a directed graph of atomic leaves. Compare-and-swap transitions keep state honest under parallelism; concurrency is tuned per model so nothing thrashes.
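
A minimal sketch of how a compare-and-swap transition can keep leaf state honest when several workers race for the same leaf. The `LeafState` class, the `claim` helper, and the status names are hypothetical, and a lock stands in for whatever atomic primitive the harness really uses.

```python
import threading


class LeafState:
    """Holds one leaf's status; a transition succeeds only from the expected state."""

    def __init__(self, status="pending"):
        self._status = status
        self._lock = threading.Lock()  # assumption: stands in for an atomic CAS

    def compare_and_swap(self, expected, new):
        # Swap only if nobody changed the status since the caller last looked.
        with self._lock:
            if self._status != expected:
                return False
            self._status = new
            return True

    @property
    def status(self):
        return self._status


def claim(leaf):
    """Try to move a leaf from pending to running; only one racing worker wins."""
    return leaf.compare_and_swap("pending", "running")


# Eight workers race for the same leaf.
leaf = LeafState()
results = []
threads = [
    threading.Thread(target=lambda: results.append(claim(leaf))) for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Exactly one claim succeeds, so a leaf cannot be started twice however wide the fan-out is.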

III

Memory & Dream

temporal KG · consolidation · restraint gates

Raw events are distilled into layered memory by a “dream” pass. Every fact write passes restraint gates and a validator before it is allowed to persist.

IV

Skills substrate

skills · genes · facts on SQLite

One substrate unifies capability, behavioural fragments, and learned facts: searchable, migratable, and able to feed each other. Detailed below.

V

MCP router

one route · menu resident · schema hidden

All Model-Context-Protocol tools converge on a single router. Schemas load on demand instead of bloating every prompt, for large, sustained token savings.
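
One way to picture "menu resident, schema hidden": the prompt carries only a one-line menu, and a tool's full schema is fetched the first time the tool is actually needed. `ToolRouter`, its methods, and the two example tools are illustrative assumptions, not the router's real interface.

```python
class ToolRouter:
    """Keeps a short menu in memory and loads each tool schema on first use."""

    def __init__(self, loaders):
        # loaders: tool name -> (one-line summary, zero-argument schema loader)
        self._loaders = loaders
        self._schemas = {}

    def menu(self):
        # The only text that has to live in every prompt.
        return [f"{name}: {summary}" for name, (summary, _) in sorted(self._loaders.items())]

    def schema(self, name):
        # Load lazily, then cache so repeated calls cost nothing extra.
        if name not in self._schemas:
            _, load = self._loaders[name]
            self._schemas[name] = load()
        return self._schemas[name]


loaded = []  # records which schemas were actually fetched


def make_loader(name, schema):
    def load():
        loaded.append(name)
        return schema
    return load


router = ToolRouter({
    "search": ("web search", make_loader("search", {"query": "string"})),
    "fetch": ("fetch a page", make_loader("fetch", {"url": "string"})),
})
menu = router.menu()
first = router.schema("search")
second = router.schema("search")
```

After these calls only the `search` schema has been loaded, and only once; `fetch` costs nothing until it is used.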

VI

Plan mode

read-only deliberation cockpit

A pre-flight mode that can read and reason but not write: source-grounded planning, a tool-call gate, and a ledger re-examined each turn before any change is made.

VII

Web stack

search pool · clean extraction

A rotating pool of search providers plus content extraction that strips pages to signal. Research and retrieval without hand-feeding URLs.

VIII

Contracts & guardrails

types · validation · state machine · acceptance

Invariants are pinned as falsifiable contracts. Implementation that conflicts with a contract loses; a contract proven wrong by the code yields, on the record.

[Diagram: intent · conductor → DAG · dispatch drives · executors → leaves · memory]

Pain → solution

Six ways agents break, engineered shut

Each of these six failure modes of a naive agent loop has a structural answer in the harness: not a prompt politely asking the model to behave, but a mechanism that blocks the misbehaviour.

01 Agent flows are tangled spaghetti

A DAG slices intent into atomic leaves, each with a compare-and-swap guarantee and a postcondition it must assert before it counts.

One leaf, one fault boundary.

02 It invents symbols that don’t exist

Every named field, table, or symbol is checked against a live LSP and a code graph before it is written down. Evidence routing maps structure, then confirms ground truth — identifiers are never guessed.

No hallucinated identifiers.

03 It states facts without checking

A grounding gate watches claims that need authority and binds them to a source; a domain lexicon makes statutory or field facts impossible to invent.

Claims carry provenance.

04 Weak models introduce line-shift errors

Hashline edits verify each patch at the line level; an LSP feeds real-time symbols and types. A misplaced edit is caught and corrected, never committed.

Surgical patches, not rebuilds.
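
The exact Hashline format is not spelled out here, so the following is only a sketch of the general idea: an edit names a line number together with a short hash of the content it expects there, and it is refused if the file has shifted. `line_hash`, `apply_edit`, and the six-character SHA-256 prefix are assumptions made for illustration.

```python
import hashlib


def line_hash(text):
    """Short content hash used to tag a line (assumed: 6 hex chars of SHA-256)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:6]


def apply_edit(lines, line_no, expected_hash, new_text):
    """Replace 1-indexed line_no only if its current content matches expected_hash."""
    if not 1 <= line_no <= len(lines):
        raise ValueError(f"line {line_no} is out of range")
    if line_hash(lines[line_no - 1]) != expected_hash:
        # The file moved underneath the edit: refuse instead of patching the wrong line.
        raise ValueError(f"line {line_no} does not match the expected hash")
    return lines[: line_no - 1] + [new_text] + lines[line_no:]


source = ["a = 1", "b = 2", "c = 3"]
tag = line_hash("b = 2")
patched = apply_edit(source, 2, tag, "b = 20")
```

If a header line is later inserted above, the same edit aimed at line 2 fails the hash check and raises, so a line-shifted patch is rejected rather than committed.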

05 It spins in place, going nowhere

A drift detector watches for repetition, classifies the wander, and pulls the run back to the last good anchor before tokens burn on a loop.

Caught, classified, rewound.
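
A rough sketch of repetition-based drift detection, under stated assumptions: actions are compared as plain strings, "spinning" means the same action appears a set number of times inside a sliding window, and the anchor is the last step that introduced a new action. The `DriftDetector` class and its parameters are hypothetical, and the classification step is left out.

```python
from collections import deque


class DriftDetector:
    """Flags a run as spinning when one action repeats too often in a recent window."""

    def __init__(self, window=6, threshold=3):
        self.recent = deque(maxlen=window)
        self.threshold = threshold
        self.anchor = None  # index of the last step that was not a repeat

    def observe(self, step_index, action):
        repeats = sum(1 for seen in self.recent if seen == action)
        self.recent.append(action)
        if repeats + 1 >= self.threshold:
            # Same action seen threshold times: report it and name the rewind point.
            return {"spinning": True, "rewind_to": self.anchor}
        if repeats == 0:
            self.anchor = step_index
        return {"spinning": False, "rewind_to": None}


detector = DriftDetector()
actions = ["read a.py", "edit a.py", "run tests", "run tests", "run tests"]
verdicts = [detector.observe(i, action) for i, action in enumerate(actions)]
```

The third consecutive `run tests` trips the detector, and the verdict points back at step 2, the last step that did something new.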

06 Agents declare a success they never earned

Every leaf must assert its postcondition before it counts; high-risk seams face adversarial verifiers whose job is to refute the result. A failed check loops back, never silently past.

Green is not the same as right.
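
To make "assert its postcondition before it counts" concrete, here is a small sketch in which a leaf is only marked done when its own check passes; anything else is reported as failed so the caller can loop back. The `Leaf` dataclass, the `execute` function, and the status strings are illustrative names, not the harness API.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Leaf:
    name: str
    run: Callable[[], Any]
    postcondition: Callable[[Any], bool]  # what the leaf must prove


def execute(leaf):
    """Run a leaf once; it counts as done only if its postcondition holds."""
    result = leaf.run()
    if leaf.postcondition(result):
        return {"leaf": leaf.name, "status": "done", "result": result}
    # Never pass silently: hand the failure back for re-planning.
    return {"leaf": leaf.name, "status": "failed", "result": result}


good = execute(Leaf("sum", run=lambda: 2 + 2, postcondition=lambda r: r == 4))
bad = execute(Leaf("sum", run=lambda: 2 + 3, postcondition=lambda r: r == 4))
```

The second leaf ran without raising, which is the "green" case, yet it is still reported as failed because the result is not what it promised.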

Four powers, one task

Roles decoupled from models

Each of the four powers has its own charge. The powers that call a model are never bound to a single one; that binding is a runtime choice, made fresh per task. Dispatch is the exception, because it runs no model at all.

Conductor

Plans & decomposes

Breaks intent into an atomic DAG of leaves and decides what runs, in what order, and what each leaf must prove. Reasoning-heavy — the design brain. Swappable to whatever plans best: Opus, Codex, GLM.

Opus · Codex · GLM

Dispatch

Sequences & gates

Pulls ready leaves off the DAG, layers them topologically, and fans them out under bounded concurrency, with compare-and-swap keeping state honest the whole way. Deterministic harness code, no model in the loop. The runtime that refuses to let drift slip past.

deterministic · no model
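
The layering step can be sketched with a standard Kahn-style pass: every leaf whose dependencies are satisfied forms the next layer, and each layer is fanned out through a bounded worker pool. The function names, the `deps` shape (every leaf appears as a key), and the use of a thread pool are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def topological_layers(deps):
    """deps maps every leaf to the set of leaves it depends on; returns ordered layers."""
    remaining = {node: set(needs) for node, needs in deps.items()}
    layers = []
    while remaining:
        ready = sorted(node for node, needs in remaining.items() if not needs)
        if not ready:
            raise ValueError("cycle or unknown dependency in the graph")
        layers.append(ready)
        for node in ready:
            del remaining[node]
        for needs in remaining.values():
            needs.difference_update(ready)
    return layers


def dispatch(deps, run_leaf, max_workers=4):
    """Run layer by layer; within a layer at most max_workers leaves run at once."""
    results = {}
    for layer in topological_layers(deps):
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            for name, output in zip(layer, pool.map(run_leaf, layer)):
                results[name] = output
    return results


graph = {
    "parse": set(),
    "lint": {"parse"},
    "test": {"parse"},
    "report": {"lint", "test"},
}
layers = topological_layers(graph)
outputs = dispatch(graph, run_leaf=str.upper, max_workers=2)
```

Here `lint` and `test` share a layer and run side by side, while `report` waits for both. `max_workers` is the knob that would be tuned per model.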

Executor

Does the work

Runs a single leaf — the atom of work: one slice of code, edits, or tool calls, with its own fault boundary. Cheap and massively parallel: DeepSeek 256-wide, MiMo 8-wide for multimodal, concurrency tuned per model so nothing thrashes.

MiMo · DeepSeek

Verifier

Reviews & refutes

A cross-model skeptic (any provider you point it at, never bound to one) whose job is to attack the result, not bless it. It fires where green is not enough: high-risk seams, contract boundaries, phase ends. If it finds a real flaw, the conductor automatically escalates to a stronger model and re-plans; otherwise the leaf counts. Off by default and capped at one round: a lens, not a ritual.

model-agnostic · cross-model

One executor, three kinds of leaf

The conductor tags every node with a kind; dispatch fans a layer out side by side, and each leaf runs its own way — the cheapest reliable call is the one no model has to make.

inproc · default

A single model call — pure text generation, no tools.

In-process through the parallel primitive; never touches the database.

edits files: no · leafModel: DeepSeek 256-wide, MiMo

agent · tools + writes

A tool-bearing sub-agent — multi-turn, can call tools.

The configured agent runner drives a full sub-loop.

edits files: yes · agent sub-loop: MiMo, DeepSeek

command · zero LLM

A deterministic CLI — no model at all.

The command runner executes the node’s command; exit 0 means done.

edits files: yes · model: none, pure shell

Safe by construction: an agent leaf with no runner degrades to inproc — no tools, no writes — and warns; a command leaf with nothing to run fails outright, never silently.
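
The three leaf kinds and the two safety rules above can be sketched as a single dispatch function. The node shape (`kind`, `prompt`, `command`), the `call_model` and `agent_runner` callables, and the error types are all hypothetical stand-ins.

```python
import subprocess
import sys
import warnings


def run_leaf(node, call_model, agent_runner=None):
    """Run one leaf according to its kind: inproc, agent, or command."""
    kind = node.get("kind", "inproc")
    if kind == "agent" and agent_runner is None:
        # Degrade rather than guess: no runner means no tools and no writes.
        warnings.warn("agent leaf has no runner; degrading to inproc")
        kind = "inproc"
    if kind == "inproc":
        return call_model(node["prompt"])
    if kind == "agent":
        return agent_runner(node["prompt"])
    if kind == "command":
        command = node.get("command")
        if not command:
            raise ValueError("command leaf has nothing to run")
        completed = subprocess.run(command, capture_output=True, text=True)
        if completed.returncode != 0:
            raise RuntimeError(f"command exited with {completed.returncode}")
        return completed.stdout  # exit 0 means done
    raise ValueError(f"unknown leaf kind: {kind}")


# A command leaf that needs no model: it just runs the current interpreter.
echo = {"kind": "command", "command": [sys.executable, "-c", "print('ok')"]}
echo_output = run_leaf(echo, call_model=str.upper)
```

The asymmetry is deliberate in this sketch: a missing agent runner still has a safe fallback, while a command leaf with no command has none, so it fails loudly.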

↩ A failed check loops back to the conductor, never silently past it.

Hooks & seams

Built on pi events — injectable, configurable

Reliability you can extend. Three hooks ride pi’s event stream; five more are seams woven through the harness — each a place your own code, config, or domain pack can plug in.

pi-event hooks

toolGate

A tool-execution gate: a dangerous-command guard plus an allowed-tools whitelist. Nothing runs that you didn’t sanction.

driftDetector

A metacognitive safety net. Detects spinning-in-place → onSpinning; escapes a stuck loop → onRecovered. Tunable threshold and re-injection.

dangerousCmd

Guards irreversible operations — push --force, rm -rf — before they ever reach the shell.
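
As a sketch of how an allow-list and a dangerous-command guard might combine into one gate: a call is refused unless the tool is sanctioned, and a sanctioned shell call is still refused if it matches a known irreversible pattern. The `tool_gate` function, its return shape, and the two regular expressions are assumptions. They cover only the two examples named above and are deliberately conservative.

```python
import re

# Assumed patterns for the two examples in the text: force pushes and recursive force deletes.
DANGEROUS_PATTERNS = [
    re.compile(r"\bgit\s+push\b.*(--force|-f)\b"),
    re.compile(r"\brm\s+(-[a-zA-Z]*r[a-zA-Z]*f|-[a-zA-Z]*f[a-zA-Z]*r)\b"),
]


def tool_gate(tool, command="", allowed_tools=frozenset()):
    """Return (allowed, reason); nothing runs unless it was explicitly sanctioned."""
    if tool not in allowed_tools:
        return (False, f"tool '{tool}' is not on the allow-list")
    for pattern in DANGEROUS_PATTERNS:
        if pattern.search(command):
            return (False, "dangerous command blocked")
    return (True, "ok")


allowed = frozenset({"shell", "read_file"})
verdict_safe = tool_gate("shell", "git push origin my-feature", allowed)
verdict_force = tool_gate("shell", "git push --force origin main", allowed)
verdict_rm = tool_gate("shell", "rm -rf build/", allowed)
verdict_unknown = tool_gate("write_file", "", allowed)
```

An empty allow-list rejects every call, which matches the "nothing runs that you didn't sanction" default.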

wright injectable seams

memory write-gate

validateFactWrite + namespaces, reject-by-default. A domain pack overrides what is allowed to persist.
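
A reject-by-default write gate could look roughly like this: a fact persists only if its namespace has been explicitly allowed and every validator agrees. The Python name `validate_fact_write`, the fact shape, and the `has_text` validator are illustrative. The real `validateFactWrite` signature is not shown in this document.

```python
def validate_fact_write(fact, allowed_namespaces=frozenset(), validators=()):
    """Allow a fact to persist only if its namespace is allowed and all validators pass."""
    if fact.get("namespace") not in allowed_namespaces:
        return False  # reject by default: unknown namespaces never persist
    return all(check(fact) for check in validators)


def has_text(fact):
    """Example validator (hypothetical): a fact must carry non-empty text."""
    return bool(fact.get("text", "").strip())


fact = {"namespace": "project", "text": "Build uses the staging database."}

# With no configuration, nothing is allowed to persist.
default_verdict = validate_fact_write(fact)

# A domain pack overrides the allowed namespaces and adds its own checks.
pack_verdict = validate_fact_write(fact, allowed_namespaces={"project"}, validators=[has_text])
empty_verdict = validate_fact_write(
    {"namespace": "project", "text": "  "}, allowed_namespaces={"project"}, validators=[has_text]
)
```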

grounding gate

Observes message output; an injectable lexicon + action. A domain injects its statutory wordlist so answers carry sources.

signal producers

onFailure / onMiss / onCorrection / onSpinning / onRecovered / onGrounded — wire runtime signals into your own sink.
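
Wiring these signals into a custom sink might look like the following sketch. The six signal names come from the list above; the `SignalBus` class, its `on` and `emit` methods, and the payload shape are assumptions.

```python
from collections import defaultdict

SIGNALS = ("onFailure", "onMiss", "onCorrection", "onSpinning", "onRecovered", "onGrounded")


class SignalBus:
    """Routes named runtime signals to whatever sinks have subscribed."""

    def __init__(self):
        self._sinks = defaultdict(list)

    def on(self, signal, sink):
        if signal not in SIGNALS:
            raise ValueError(f"unknown signal: {signal}")
        self._sinks[signal].append(sink)

    def emit(self, signal, payload):
        if signal not in SIGNALS:
            raise ValueError(f"unknown signal: {signal}")
        for sink in self._sinks[signal]:
            sink(signal, payload)


# A sink can be anything callable: here, a list standing in for a log or metrics store.
log = []
bus = SignalBus()
bus.on("onSpinning", lambda signal, payload: log.append((signal, payload["leaf"])))
bus.on("onRecovered", lambda signal, payload: log.append((signal, payload["leaf"])))

bus.emit("onSpinning", {"leaf": "fix-tests"})
bus.emit("onRecovered", {"leaf": "fix-tests"})
bus.emit("onGrounded", {"leaf": "fix-tests"})  # no sink subscribed, so nothing is recorded
```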

role-model seam

resolveRoleModel + .wright/config.json: conductor, executor, leaf, and dream each bind to a model you can swap.
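
The role-to-model binding can be pictured as a small lookup with a per-task override. The four role names come from the text; the Python name `resolve_role_model`, the `models` key, and the placeholder model names are hypothetical. The actual layout of `.wright/config.json` is not shown in this document.

```python
import json

ROLES = ("conductor", "executor", "leaf", "dream")


def resolve_role_model(role, config, override=None):
    """Pick the model for a role: a per-task override wins, then the config binding."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    if override:
        return override
    models = config.get("models", {})
    if role not in models:
        raise KeyError(f"no model bound for role '{role}'")
    return models[role]


# Assumed config layout with placeholder model names, parsed from JSON text.
config = json.loads("""
{
  "models": {
    "conductor": "planner-model",
    "executor": "worker-model",
    "leaf": "worker-model",
    "dream": "summary-model"
  }
}
""")

planner = resolve_role_model("conductor", config)
swapped = resolve_role_model("conductor", config, override="other-planner")
```

Keeping the binding in data rather than code is what makes the swap a runtime choice: changing one string rebinds a role without touching the harness.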

MCP routing

/mcp add registers a server at runtime; mcp_search routes to it. New tools without touching the gateway.

Concept here · file-level detail lives in the docs.