Memory that lives

Seven layers, from raw log to abstract concept

Most harnesses remember with a flat log that only grows. Xihe’s memory is a graph that distils, reorganises, and forgets — so what surfaces is signal, not history.

One store, four inhabitants

It all lives in a single file

Skills, genes, facts, and evolution events are usually four systems. Here they are four registers of one SQLite database — with bridges between them and a flywheel that mutates them. No embeddings, no GPU, no vector service.

substrate.db
skills what the agent can do
genes reusable how-to
facts what it has learned
evolution every change, audited
one file · auditable · migratable · no embeddings · no GPU
open it yourself
$ cp substrate.db backup.db
$ sqlite3 substrate.db \
   "select layer, key, confidence from facts limit 5"

Most memory is an opaque vector index you take on faith. This is a table you can query in the tool you already have.
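The same query can be run from a script. The sketch below is a minimal illustration, not the real schema: only the `facts` table and its `layer`, `key`, and `confidence` columns come from the query above, while the `value` column, the sample rows, and the helper names `open_store` and `top_facts` are assumptions. It uses an in-memory database so it runs without a `substrate.db` on disk.

```python
import sqlite3

# Hypothetical schema: only the column names layer, key, and confidence
# come from the query above; everything else here is illustrative.
SCHEMA = """
CREATE TABLE IF NOT EXISTS facts (
    key        TEXT PRIMARY KEY,
    value      TEXT NOT NULL,
    layer      INTEGER NOT NULL,
    confidence REAL NOT NULL
)
"""

def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store; ':memory:' keeps the sketch self-contained."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def top_facts(conn: sqlite3.Connection, limit: int = 5) -> list[tuple]:
    """Return (layer, key, confidence) rows, most confident first."""
    return conn.execute(
        "SELECT layer, key, confidence FROM facts "
        "ORDER BY confidence DESC LIMIT ?",
        (limit,),
    ).fetchall()

conn = open_store()
conn.executemany(
    "INSERT INTO facts (key, value, layer, confidence) VALUES (?, ?, ?, ?)",
    [
        ("user.preference", "prefers Bun over Node", 2, 0.9),
        ("user.focus", "memory systems", 2, 0.4),
    ],
)
rows = top_facts(conn)
```

Pointing `open_store` at a copy such as `backup.db` instead of `:memory:` would query real data, provided the actual table matches these assumed columns.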

L6 → L0

Seven layers, from raw log to abstract concept

L6

Abstract concepts

Symbolic reasoning primitives

L5

Strategic patterns

Cross-task reusable strategies

L4

Behavioural genes

BM25-retrieved, idempotently migrated

L3

Entity graph

Temporal knowledge graph with provenance

L2

Structured facts

Validated, restraint-gated writes

L1

Compressed episodes

Dream-pruned clusters

L0

Raw logs

Ephemeral, uncurated

abstract · summit of distillation
raw · floor of the running log
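One way to carry the layer axis in code is an ordered enumeration. This is only a sketch: the member names are paraphrased from the list above, and `Layer` and `more_abstract` are hypothetical names, not identifiers from Xihe.

```python
from enum import IntEnum

class Layer(IntEnum):
    """The seven layers, numbered as in the list above (0 = raw, 6 = abstract)."""
    RAW_LOG = 0
    COMPRESSED_EPISODE = 1
    STRUCTURED_FACT = 2
    ENTITY_GRAPH = 3
    BEHAVIOURAL_GENE = 4
    STRATEGIC_PATTERN = 5
    ABSTRACT_CONCEPT = 6

def more_abstract(a: Layer, b: Layer) -> Layer:
    """Return whichever of the two layers sits higher in the stack."""
    return max(a, b)
```

Because the members are integers, they can be stored directly in a numeric column and compared with ordinary operators.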

Lifecycle · three gates

A fact’s life, in three gates

I

Write

restraint gate + validator

Every fact must survive falsifiability checks before it persists. Hallucinations never become memory.

II

Curate

dream · purify · cluster

While the agent is idle, memory dedupes, compresses, and reorganises itself. Entropy reduction keeps retrieval sharp.

III

Retire

prune + decay

Recency, confidence, and relevance drive retirement. Intentional forgetting is a feature, not a leak.
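The retirement gate can be pictured as a single score that combines the three drivers named above. The formula below is an assumption made for illustration: the source does not say how recency, confidence, and relevance are weighted, and the half-life, the threshold, and the names `Fact`, `retention_score`, and `prune` are all hypothetical.

```python
from dataclasses import dataclass

HALF_LIFE_DAYS = 30.0   # assumed: recency halves every 30 days
RETIRE_BELOW = 0.1      # assumed: facts scoring under this are retired

@dataclass
class Fact:
    key: str
    confidence: float   # 0..1
    relevance: float    # 0..1
    age_days: float     # days since the fact was last seen

def retention_score(fact: Fact) -> float:
    """Exponential recency decay multiplied by confidence and relevance."""
    recency = 0.5 ** (fact.age_days / HALF_LIFE_DAYS)
    return recency * fact.confidence * fact.relevance

def prune(facts: list[Fact]) -> list[Fact]:
    """Keep only the facts whose score is still at or above the threshold."""
    return [f for f in facts if retention_score(f) >= RETIRE_BELOW]
```

A multiplicative score means a fact that is weak on any one axis fades quickly, which is one plausible reading of "intentional forgetting".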

You don’t curate the memory. The memory curates itself.

What may be written

A reject-by-default membrane

Not everything an agent says deserves to be remembered. Every write passes validateFactWrite + restraint gates; only declared namespaces get through, and the gate stays shut unless a fact proves it belongs.

user.*    who you are: preference · interest · focus · expertise · trait · goal
wright.*  the agent about itself: capability · pattern · limit
domain.*  your field, injected: a domain pack declares its own
secrets   scrubbed, never stored

a fact that passes: user.preference · prefers Bun over Node

Hallucinations never become memory. The gate is the difference between a chat log and a knowledge base.
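A reject-by-default gate can be sketched in a few lines. The namespaces are the ones listed above; everything else is an assumption. `validate_fact_write` is a Python stand-in for `validateFactWrite`, whose real signature is not shown in the source, and the secret pattern is a crude placeholder rather than the project's scrubber.

```python
import re

# Namespaces as listed above; domain.* stays empty until a pack declares keys.
DECLARED = {
    "user": {"preference", "interest", "focus", "expertise", "trait", "goal"},
    "wright": {"capability", "pattern", "limit"},
    "domain": set(),
}

# Hypothetical secret detector, for illustration only.
SECRET_PATTERN = re.compile(r"(api[_-]?key|password|token|secret)\s*[:=]", re.I)

def validate_fact_write(key: str, value: str) -> tuple[bool, str]:
    """Reject by default; accept only declared namespaces with no secrets."""
    namespace, _, name = key.partition(".")
    if name not in DECLARED.get(namespace, set()):
        return False, "undeclared namespace"
    if not value.strip():
        return False, "empty value"
    if SECRET_PATTERN.search(value):
        return False, "looks like a secret"
    return True, "ok"
```

The important design choice is the first branch: an unknown key fails without any further inspection, so new kinds of memory have to be declared before they can be written.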

From tentative to trusted

A fact earns the right to act

A new fact enters tentative — remembered, but inert. It is promoted to confident only by corroboration: evidence seen across sessions. And only confident facts are injected back to steer behaviour.

tentative

remembered, never acts · expires ~30d

promotion: evidence ≥ 3 · across ≥ 2 sessions

confident

drives behaviour · re-injected each turn

user.preference · prefers Bun over Node: tentative → confident

Layer is one axis — how abstract. Trust is another — how earned. A fact can sit high and still be cold; only the hot, high-trust region touches behaviour.

The agent holds its opinions loosely until the evidence earns them. Circuit-breakers stop a fact flickering or stampeding into trust.
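The promotion rule is simple enough to state as code. The thresholds (evidence ≥ 3, sessions ≥ 2, roughly 30 days) come from the text above; the class and function names are hypothetical, "~30d" is treated as exactly 30 days, and the circuit-breakers are left out.

```python
from dataclasses import dataclass, field

MIN_EVIDENCE = 3         # from the rule above
MIN_SESSIONS = 2         # from the rule above
TENTATIVE_TTL_DAYS = 30  # "~30d" above, treated as exactly 30 here

@dataclass
class TrackedFact:
    key: str
    value: str
    evidence: int = 0
    sessions: set = field(default_factory=set)
    age_days: int = 0

    def observe(self, session_id: str) -> None:
        """Record one more piece of corroborating evidence."""
        self.evidence += 1
        self.sessions.add(session_id)

    @property
    def trust(self) -> str:
        """Derive the trust state from evidence, session spread, and age."""
        if self.evidence >= MIN_EVIDENCE and len(self.sessions) >= MIN_SESSIONS:
            return "confident"
        if self.age_days > TENTATIVE_TTL_DAYS:
            return "expired"
        return "tentative"

def injectable(facts: list) -> list:
    """Only confident facts are handed back to steer behaviour."""
    return [f for f in facts if f.trust == "confident"]
```

Note that three observations inside one session are not enough: the session count is what stops a single enthusiastic conversation from promoting a fact on its own.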

Genes

Genes — learn once, apply everywhere

A behavioural gene is a reusable fragment of how-to. BM25 search finds the closest match to the task at hand and migrates it in idempotently: applying the same gene twice changes nothing. No retraining, no drift, no waste.

Reuse over recall. The best memory is the one you never have to look up twice.

task → match · BM25 → genes
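To make the retrieval step concrete, here is a small pure-Python BM25 ranker with an idempotent install step. It is a sketch under assumptions: the gene names and descriptions are invented, `K1` and `B` are common textbook defaults rather than project values, and `bm25_rank` and `migrate` are hypothetical names.

```python
import math
from collections import Counter

K1, B = 1.5, 0.75  # common BM25 defaults, not values taken from the project

def tokenize(text: str) -> list[str]:
    return text.lower().split()

def bm25_rank(query: str, docs: dict[str, str]) -> list[tuple[str, float]]:
    """Score every gene description against the query, best first."""
    if not docs:
        return []
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n
    # Document frequency: in how many descriptions each term appears.
    df = Counter(term for t in tokens.values() for term in set(t))
    scores = {}
    for name, terms in tokens.items():
        tf = Counter(terms)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + K1 * (1 - B + B * len(terms) / avg_len)
            score += idf * tf[term] * (K1 + 1) / norm
        scores[name] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def migrate(gene: str, installed: set[str]) -> bool:
    """Idempotent install: a second call with the same gene changes nothing."""
    if gene in installed:
        return False
    installed.add(gene)
    return True

# Invented genes, for illustration only.
GENES = {
    "retry-with-backoff": "retry a failing network call with exponential backoff",
    "bisect-regression": "find the commit that introduced a regression with git bisect",
    "pin-dependencies": "lock dependency versions before a release build",
}
ranked = bm25_rank("network call keeps failing", GENES)
```

BM25 needs only term counts, which is why this kind of retrieval fits a store with no embeddings and no GPU.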

Skills that learn themselves

The substrate compounds

The same store that remembers also improves itself. Runtime signals are mined into capability and written back — so the harness gets sharper every run instead of starting cold. The agent proposes; you approve. No silent self-modification.

01
Signals

route_hit + five producers fire from the run

five signal producers: tool_failure · recall_miss · user_correction · hard_problem · drift

02
Mine

an episodic miner distils confident + worked patterns

03
Candidate

a new skill or gene is drafted

04
Curate

dedup · prune · shrink — the bundle never bloats

05
Gate

held-out gates · propose-only · you approve

It loops back: confident patterns become genes, genes become skills, skills feed the next run. A flywheel, not a fixed prompt.
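The five steps above can be sketched as a small pipeline. The signal names are the ones listed in step 01; the data shapes, the support threshold, and the function names are assumptions, and the held-out gates are not modelled. The point of the sketch is the last step: nothing is installed until a proposal has been approved.

```python
from collections import Counter
from dataclasses import dataclass

# Producer names as listed above; route_hit is the routing signal.
PRODUCERS = {"tool_failure", "recall_miss", "user_correction",
             "hard_problem", "drift"}

@dataclass(frozen=True)
class Signal:
    kind: str      # "route_hit" or one of PRODUCERS
    pattern: str   # hypothetical label for what the run did

@dataclass
class Proposal:
    pattern: str
    support: int
    approved: bool = False  # propose-only: a human flips this, not the agent

def mine(signals: list, min_support: int = 2) -> list:
    """Draft one candidate per pattern seen at least min_support times."""
    counts = Counter(s.pattern for s in signals
                     if s.kind == "route_hit" or s.kind in PRODUCERS)
    return [Proposal(p, n) for p, n in counts.items() if n >= min_support]

def curate(proposals: list, existing: set) -> list:
    """Dedup against what is already installed so the bundle does not bloat."""
    return [p for p in proposals if p.pattern not in existing]

def apply_approved(proposals: list, existing: set) -> list:
    """Install only approved proposals; return the patterns installed."""
    installed = [p.pattern for p in proposals if p.approved]
    existing.update(installed)
    return installed
```

Keeping `approved` as plain data on the proposal makes the approval step auditable: the record of who approved what can live in the same store as everything else.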

The curated core

The curated core: twelve universal skills

start: restore context, brief the session
  • Parallel-reads _NEXT + git status to rebuild last context
  • Outputs a briefing: the current task and the suggested next step
  • The first command of every session; guards against context loss
handoff: session wrap, state + memory capture
  • Updates the active-plan pickup anchor before you leave
  • Writes a session log and captures durable memory
  • Skip it and the next session starts blind
commit: verified, conventional git commits
  • Analyses the diff and runs the tsc / test / build gate first
  • Writes a conventional message; never commits red
  • --ship mode: merge base, full tests, then opens a PR
verify: tsc · test · build gate
  • Sizes the check to the files you actually touched
  • Five modes: smart / quick / full / all / health
  • The single gate everything passes before it counts
investigate: eight-phase root-cause debugging
  • Iron law: no fix without a proven root cause
  • History search → reproduce → scope-lock → hypothesis → fix
  • Searches past incidents before investigating from scratch
recall: vector recall of past decisions
  • Searches a vector library of past decisions and lessons
  • For when you are stuck or sense there is a precedent
  • Orthogonal to grep: recalls history, not current code
review: on-demand security / coverage / debt audit
  • Dispatches specialist reviewers plus a cross-model lens
  • Security, coverage, tech-debt, full gate, or PR review
  • Run at phase milestones and before a merge
caveman: ultra-compressed, token-lean comms
  • Drops filler, articles, and pleasantries; keeps the facts
  • About 75% fewer tokens at full technical accuracy
  • Toggle on when you want terse over polished
council: rival drafts, judged to one winner
  • N personas generate rival solutions in parallel
  • A multi-lens judge panel scores and picks the winner
  • Grafts the runners-up’s best ideas into the result
retro: engineering retrospective from git history
  • Reads commit patterns: feat / fix / refactor mix, focus score
  • Test discipline and the productivity trend over time
  • For sprint and weekly reviews, not session wrap
dream: distil raw events into layered memory
  • Consolidates raw events into L0–L6 layered memory
  • Every write passes restraint gates and a validator
  • Runs the full stack within a bounded budget
skill-creator: author & evolve skills
  • Create a skill from scratch or improve an existing one
  • Run evals and benchmark triggering accuracy with variance
  • Optimise a skill’s description for better routing


no platform deps · no brand · no private keys

Domain grounding

Teach it your field — answers arrive with sources

The universal harness binds to no domain. But memory is built to take a domain pack: a set of namespaces, validators, and a grounding lexicon for your personal or professional field. Install one, and the model stops guessing where it matters — its answers carry provenance you can check.

α

Injectable namespaces

A domain pack declares what may be remembered in your field, and what never can. The write-gate enforces it, reject-by-default.

β

Grounded, sourced answers

The grounding gate watches for claims that need authority and binds them to your domain lexicon — statutory numbers, field terms, the facts that must not be hallucinated.

γ

Yours stays yours

The domain pack carries your logic, your terms, your sources. The open substrate never absorbs them — swap fields without touching the core.
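A domain pack, as described in the three points above, is essentially declared namespaces plus a lexicon with sources. The sketch below shows one possible shape; the field names, the helper functions, and the placeholder entry are all hypothetical, and the real pack format is not given in the source.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DomainPack:
    name: str
    namespaces: set[str]  # keys under domain.* that may be written
    # Maps a lower-cased term to (authoritative value, source label).
    lexicon: dict[str, tuple[str, str]] = field(default_factory=dict)

def may_write(pack: DomainPack, key: str) -> bool:
    """Reject by default: only keys the pack declares are writable."""
    return key in pack.namespaces

def ground(pack: DomainPack, term: str) -> Optional[str]:
    """Return 'value (source: ...)' if the lexicon knows the term, else None."""
    entry = pack.lexicon.get(term.lower())
    if entry is None:
        return None
    value, source = entry
    return f"{value} (source: {source})"

# Placeholder content only; a real pack would carry your own terms and sources.
pack = DomainPack(
    name="example-field",
    namespaces={"domain.statute"},
    lexicon={"notice period": ("<value from your source>", "<your citation>")},
)
```

Returning `None` for an unknown term is deliberate: the caller can then decline to answer or flag the claim, instead of letting the model fill the gap with a guess.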

Universal core, domain on top. The model answers from your ground, not its guess.