Memory that lives

Seven layers, from raw log to abstract concept

Most harnesses remember with a flat log that only grows. Xihe’s memory is a graph that distils, reorganises, and forgets — so what surfaces is signal, not history.

One store, four inhabitants

It all lives in a single file

Skills, genes, facts, and evolution events are usually four systems. Here they are four registers of one SQLite database — with bridges between them and a flywheel that mutates them. No embeddings, no GPU, no vector service.

substrate.db

skills what the agent can do

genes reusable how-to

facts what it has learned

evolution every change, audited

one file · auditable · migratable · no embeddings · no GPU

open it yourself

$ cp substrate.db backup.db
$ sqlite3 substrate.db \
   "select layer, key, confidence from facts limit 5"

Most memory is an opaque vector index you take on faith. This is a table you can query in the tool you already have.
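The same query can be run from a script. The sketch below uses Python's built-in sqlite3 module. The facts table and its layer, key, and confidence columns come from the query above; the in-memory database, the column types, and the sample rows are assumptions added so the snippet runs on its own. Point connect() at a copy of substrate.db to try it against a real store.

```python
import sqlite3

def top_facts(conn: sqlite3.Connection, limit: int = 5) -> list[tuple]:
    """Return (layer, key, confidence) rows, highest confidence first."""
    cur = conn.execute(
        "SELECT layer, key, confidence FROM facts "
        "ORDER BY confidence DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()

# Stand-in database so the sketch is self-contained.
# Assumption: the real schema may use different types or extra columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (layer INTEGER, key TEXT, confidence REAL)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        (2, "user.preference", 0.9),
        (2, "user.goal", 0.4),
        (2, "wright.limit", 0.7),
    ],
)

rows = top_facts(conn, limit=2)
print(rows)
```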

L6 → L0

Seven layers, from raw log to abstract concept

L6 · Abstract concepts · Symbolic reasoning primitives

L5 · Strategic patterns · Cross-task reusable strategies

L4 · Behavioural genes · BM25-retrieved, idempotently migrated

L3 · Entity graph · Temporal knowledge graph with provenance

L2 · Structured facts · Validated, restraint-gated writes

L1 · Compressed episodes · Dream-pruned clusters

L0 · Raw logs · Ephemeral, uncurated

▲ abstract · summit of distillation

raw · floor of the running log ▼
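In code, the stack can be modelled as an ordered enumeration. This is a sketch, not the actual schema: the numbering assumes the list above runs from L6 at the top to L0 at the bottom, as the "L6 → L0" label suggests, and the identifier names are illustrative.

```python
from enum import IntEnum

class Layer(IntEnum):
    """Assumed numbering: 0 is the raw floor, 6 the abstract summit."""
    RAW_LOGS = 0
    COMPRESSED_EPISODES = 1
    STRUCTURED_FACTS = 2
    ENTITY_GRAPH = 3
    BEHAVIOURAL_GENES = 4
    STRATEGIC_PATTERNS = 5
    ABSTRACT_CONCEPTS = 6

def is_more_abstract(a: Layer, b: Layer) -> bool:
    """IntEnum members compare as integers, so a higher value means more distilled."""
    return a > b

# Layers listed from summit to floor, matching the order shown above.
summit_to_floor = sorted(Layer, reverse=True)
```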

Lifecycle · three gates

A fact’s life, in three gates

Write

restraint gate + validator

Every fact must survive falsifiability checks before it persists. Hallucinations never become memory.

Curate

dream · purify · cluster

While the agent is idle, memory dedupes, compresses, and reorganises itself. Entropy reduction keeps retrieval sharp.

Retire

prune + decay

Recency, confidence, and relevance drive retirement. Intentional forgetting is a feature, not a leak.
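One way to picture how recency, confidence, and relevance might combine into a retirement decision. This is an illustrative sketch only: the exponential half-life, the multiplicative score, and the threshold are assumptions, not the formula Xihe uses.

```python
from dataclasses import dataclass

@dataclass
class Fact:
    key: str
    confidence: float  # 0..1
    relevance: float   # 0..1
    age_days: float    # days since the fact was last seen

def retention_score(fact: Fact, half_life_days: float = 30.0) -> float:
    """Recency halves every half_life_days; the score multiplies the three signals."""
    recency = 0.5 ** (fact.age_days / half_life_days)
    return recency * fact.confidence * fact.relevance

def prune(facts: list[Fact], threshold: float = 0.1) -> list[Fact]:
    """Keep only the facts whose score is at or above the threshold."""
    return [f for f in facts if retention_score(f) >= threshold]

facts = [
    Fact("user.preference", confidence=0.9, relevance=0.8, age_days=0),
    Fact("user.focus", confidence=0.5, relevance=0.5, age_days=90),
]
kept = prune(facts)
```

With these made-up numbers, the fresh, well-supported fact survives and the stale, weak one is retired. A product rather than a sum means a fact that is poor on any one axis cannot be rescued by the other two.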

You don’t curate the memory. The memory curates itself.

What may be written

A reject-by-default membrane

Not everything an agent says deserves to be remembered. Every write passes validateFactWrite + restraint gates; only declared namespaces get through, and the gate stays shut unless a fact proves it belongs.

user.* (who you are): preference · interest · focus · expertise · trait · goal

wright.* (the agent about itself): capability · pattern · limit

domain.* (your field, injected): a domain pack declares its own

secrets: scrubbed, never stored

A fact that passes: user.preference · prefers Bun over Node

Hallucinations never become memory. The gate is the difference between a chat log and a knowledge base.
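A reject-by-default gate can be sketched in a few lines. The source names validateFactWrite but does not show its internals, so validate_fact_write below is a hypothetical stand-in. The namespace table mirrors the list above, and the secret-detection pattern is a deliberately crude assumption.

```python
import re

# Declared namespaces and kinds, as listed above. A domain pack would add its own.
ALLOWED = {
    "user": {"preference", "interest", "focus", "expertise", "trait", "goal"},
    "wright": {"capability", "pattern", "limit"},
}

# Hypothetical secret detector, for illustration only.
SECRET_RE = re.compile(r"(api[_-]?key|password|token)\s*[:=]", re.IGNORECASE)

def validate_fact_write(key: str, value: str) -> bool:
    """Return True only for a declared namespace.kind key with non-empty, secret-free content."""
    namespace, _, kind = key.partition(".")
    if kind not in ALLOWED.get(namespace, set()):
        return False  # undeclared namespace or kind: the gate stays shut
    if SECRET_RE.search(value):
        return False  # secret-like content is never stored
    return bool(value.strip())

accepted = validate_fact_write("user.preference", "prefers Bun over Node")
rejected = validate_fact_write("misc.note", "likes long walks")
```

The function has no "allow" branch for unknown input: anything that is not explicitly declared falls through to a rejection.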

From tentative to trusted

A fact earns the right to act

A new fact enters tentative — remembered, but inert. It is promoted to confident only by corroboration: evidence seen across sessions. And only confident facts are injected back to steer behaviour.

tentative

remembered, never acts · expires ~30d

→ evidence ≥ 3 · across ≥ 2 sessions

confident

drives behaviour · re-injected each turn

user.preference · prefers Bun over Node · tentative → confident

Layer is one axis — how abstract. Trust is another — how earned. A fact can sit high and still be cold; only the hot, high-trust region touches behaviour.

The agent holds its opinions loosely until the evidence earns them. Circuit-breakers stop a fact from flickering between states or stampeding into trust.
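The promotion rule above (evidence ≥ 3, across ≥ 2 sessions) can be sketched as a small state machine. The thresholds come from the text; the class name, the field names, and the exact 30-day expiry are assumptions, and the circuit-breakers are not modelled here.

```python
from dataclasses import dataclass, field

MIN_EVIDENCE = 3         # "evidence ≥ 3" in the rule above
MIN_SESSIONS = 2         # "across ≥ 2 sessions" in the rule above
TENTATIVE_TTL_DAYS = 30  # "~30d" above; the exact value is an assumption

@dataclass
class TrackedFact:
    key: str
    evidence: list[str] = field(default_factory=list)  # one session id per observation
    status: str = "tentative"

    def observe(self, session_id: str) -> None:
        """Record one piece of evidence and promote once both thresholds are met."""
        self.evidence.append(session_id)
        enough_evidence = len(self.evidence) >= MIN_EVIDENCE
        enough_sessions = len(set(self.evidence)) >= MIN_SESSIONS
        if enough_evidence and enough_sessions:
            self.status = "confident"

    def expired(self, age_days: float) -> bool:
        """Only tentative facts expire; confident ones are handled by retirement."""
        return self.status == "tentative" and age_days > TENTATIVE_TTL_DAYS

def injectable(facts: list[TrackedFact]) -> list[str]:
    """Only confident facts are injected back to steer behaviour."""
    return [f.key for f in facts if f.status == "confident"]

fact = TrackedFact("user.preference")
for session in ("s1", "s1", "s1"):
    fact.observe(session)
status_after_one_session = fact.status  # three observations, but a single session
fact.observe("s2")
status_after_two_sessions = fact.status
```

Three observations inside one session are not enough in this sketch: the second session is what tips the fact over, which is the point of requiring corroboration across sessions.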

Genes

Genes — learn once, apply everywhere

A behavioural gene is a reusable fragment of how-to. BM25 search finds the closest match to the task at hand and migrates it in idempotently — no retraining, no drift, no waste.

Reuse over recall. The best memory is the one you never have to look up twice.
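To make the retrieval step concrete, here is a small, dependency-free BM25 ranker plus an idempotent "migrate" step. It is a sketch under stated assumptions: the whitespace tokenizer, the k1 and b defaults, the sample genes, and the migrate helper are all illustrative, and nothing here is Xihe's actual code.

```python
import math
from collections import Counter

def tokenize(text: str) -> list[str]:
    return text.lower().split()

def bm25_rank(query: str, docs: dict[str, str],
              k1: float = 1.5, b: float = 0.75) -> list[tuple[str, float]]:
    """Score every document against the query with Okapi BM25, best first."""
    tokens = {name: tokenize(body) for name, body in docs.items()}
    n = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for t in tokens.values() for term in set(t))
    scores = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[name] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def migrate(bundle: dict[str, str], name: str, body: str) -> bool:
    """Idempotent: re-adding an identical gene changes nothing and returns False."""
    if bundle.get(name) == body:
        return False
    bundle[name] = body
    return True

# Hypothetical genes, for illustration only.
genes = {
    "retry-flaky-test": "rerun the failing test in isolation before changing code",
    "bisect-regression": "use git bisect to find the commit that introduced a regression",
    "pin-dependency": "pin the dependency version when a release breaks the build",
}
ranked = bm25_rank("find commit that introduced regression", genes)
best_name = ranked[0][0]

bundle: dict[str, str] = {}
first = migrate(bundle, best_name, genes[best_name])
second = migrate(bundle, best_name, genes[best_name])
```

SQLite's FTS5 extension also provides a built-in bm25() ranking function, so the same idea can live inside the database file. The source does not say whether Xihe uses it.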

Skills that learn themselves

The substrate compounds

The same store that remembers also improves itself. Runtime signals are mined into capability and written back — so the harness gets sharper every run instead of starting cold. The agent proposes; you approve. No silent self-modification.

Signals

route_hit + five producers fire from the run

five signal producers: tool_failure · recall_miss · user_correction · hard_problem · drift

Mine

an episodic miner distils patterns that are both confident and proven to work

Candidate

a new skill or gene is drafted

Curate

dedup · prune · shrink — the bundle never bloats

Gate

held-out gates · propose-only · you approve

↻ It loops back: confident patterns become genes, genes become skills, skills feed the next run. A flywheel, not a fixed prompt.
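The propose-only part of the loop is the easiest to show in code. In this sketch, signals from the five named producers are counted, recurring patterns become candidates, and nothing becomes active without an explicit approval. The Candidate type, the min_support threshold, and the sample signals are assumptions; route_hit and the held-out gates are left out for brevity.

```python
from collections import Counter
from dataclasses import dataclass

# The five signal producers named above.
PRODUCERS = {"tool_failure", "recall_miss", "user_correction", "hard_problem", "drift"}

@dataclass
class Candidate:
    pattern: str
    support: int
    status: str = "proposed"  # never "active" without an approval

def mine(signals: list[tuple[str, str]], min_support: int = 2) -> list[Candidate]:
    """signals are (producer, pattern) pairs; recurring patterns become candidates."""
    counts = Counter(
        pattern for producer, pattern in signals if producer in PRODUCERS
    )
    return [Candidate(p, c) for p, c in counts.items() if c >= min_support]

def review(candidate: Candidate, approved_by_user: bool) -> Candidate:
    """The only path to 'active' goes through an explicit user decision."""
    candidate.status = "active" if approved_by_user else "rejected"
    return candidate

# Hypothetical run output.
signals = [
    ("tool_failure", "retry install after lockfile change"),
    ("tool_failure", "retry install after lockfile change"),
    ("drift", "one-off formatting change"),
    ("unknown_source", "retry install after lockfile change"),
]
candidates = mine(signals)
proposed_status = candidates[0].status
approved = review(candidates[0], approved_by_user=True)
```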

The curated core

The curated core: twelve universal skills

start restore context, brief the session

Reads _NEXT and git status in parallel to rebuild the last context
Outputs a briefing: the current task and the suggested next step
The first command of every session — guards against context loss

handoff session wrap: state + memory capture

Updates the active-plan pickup anchor before you leave
Writes a session log and captures durable memory
Skip it and the next session starts blind

commit verified, conventional git commits

Analyses the diff and runs the tsc / test / build gate first
Writes a conventional message; never commits red
--ship mode: merge base, full tests, then opens a PR

verify tsc · test · build gate

Sizes the check to the files you actually touched
Five modes: smart / quick / full / all / health
The single gate everything passes before it counts

investigate eight-phase root-cause debugging

Iron law: no fix without a proven root cause
History search → reproduce → scope-lock → hypothesis → fix
Searches past incidents before investigating from scratch

recall vector recall of past decisions

Searches a vector library of past decisions and lessons
For when you are stuck or sense there is a precedent
Orthogonal to grep: recalls history, not current code

review on-demand security / coverage / debt audit

Dispatches specialist reviewers plus a cross-model lens
Security, coverage, tech-debt, full gate, or PR review
Run at phase milestones and before a merge

caveman ultra-compressed, token-lean comms

Drops filler, articles, and pleasantries — keeps the facts
About 75% fewer tokens at full technical accuracy
Toggle on when you want terse over polished

council rival drafts, judged to one winner

N personas generate rival solutions in parallel
A multi-lens judge panel scores and picks the winner
Grafts the runners-up’s best ideas into the result

retro engineering retrospective from git history

Reads commit patterns: feat / fix / refactor mix, focus score
Test discipline and the productivity trend over time
For sprint and weekly reviews, not session wrap

dream distil raw events into layered memory

Consolidates raw events into L0–L6 layered memory
Every write passes restraint gates and a validator
Runs the full stack within a bounded budget

skill-creator author & evolve skills

Create a skill from scratch or improve an existing one
Run evals and benchmark triggering accuracy with variance
Optimise a skill’s description for better routing

no platform deps · no brand · no private keys

Domain grounding

Teach it your field — answers arrive with sources

The universal harness binds to no domain. But memory is built to take a domain pack: a set of namespaces, validators, and a grounding lexicon for your personal or professional field. Install one, and the model stops guessing where it matters — its answers carry provenance you can check.

Injectable namespaces

A domain pack declares what may be remembered in your field, and what never can. The write-gate enforces it, reject-by-default.

Grounded, sourced answers

The grounding gate watches for claims that need authority and binds them to your domain lexicon — statutory numbers, field terms, the facts that must not be hallucinated.

Yours stays yours

The domain pack carries your logic, your terms, your sources. The open substrate never absorbs them — swap fields without touching the core.
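As a rough picture of what a domain pack might hold, the sketch below bundles the three parts named above: namespaces, validators, and a grounding lexicon. Every name here (DomainPack, may_write, ground, the sample pack and its source paths) is hypothetical; the real pack format is not shown in the source.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class DomainPack:
    name: str
    namespaces: set[str]     # keys that may be remembered in this field
    lexicon: dict[str, str]  # term -> source the term is grounded in
    validators: list[Callable[[str, str], bool]] = field(default_factory=list)

    def may_write(self, key: str, value: str) -> bool:
        """Reject by default: the key must be declared and every validator must pass."""
        if key not in self.namespaces:
            return False
        return all(check(key, value) for check in self.validators)

    def ground(self, answer: str) -> list[tuple[str, str]]:
        """Return (term, source) pairs for lexicon terms that appear in the answer."""
        lowered = answer.lower()
        return [(t, src) for t, src in self.lexicon.items() if t.lower() in lowered]

# Hypothetical pack with placeholder sources, for illustration only.
pack = DomainPack(
    name="example-hr",
    namespaces={"domain.policy"},
    lexicon={"notice period": "handbook.md#notice"},
    validators=[lambda key, value: len(value) <= 200],
)
sources = pack.ground("The notice period applies from the first day.")
```

Because the pack is a plain value passed in from outside, swapping fields means swapping this object; the core gate logic does not change.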

Universal core, domain on top. The model answers from your ground, not its guess.