The Stateless Problem
By default, every AI conversation starts blank. The model has no memory of your previous sessions, your preferences, your context, or your history. Each time you start a new session, you are starting from zero.
For simple chat use cases, this is acceptable. For an AI workforce that operates over days, weeks, and months, it is a fundamental limitation.
An agent that cannot remember what it has already researched will re-research it. An agent that cannot remember your preferences will need to be told them every session. An agent that cannot carry context between tasks cannot do work that spans multiple sessions.
Smart Memory fixes this.
Two-Layer Architecture
The Smart Memory System v9.1 operates in two layers:
Layer 1: PocketBase Structured Memory
Typed memory records stored in PocketBase with full query capability:
- Factual memories – specific facts, research findings, data points
- Preference memories – user preferences, style guides, workflow preferences
- Contextual memories – project context, ongoing work, current priorities
- Agent memories – what each agent has done, learned, and decided
Each memory record has: content, type, importance score, created_at, last_accessed_at, and the agent that created it.
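A minimal sketch of such a record, assuming illustrative Python field names and types; the actual PocketBase schema is not specified here and may differ:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# The four memory types described above. Names are illustrative.
MEMORY_TYPES = {"factual", "preference", "contextual", "agent"}

@dataclass
class MemoryRecord:
    content: str
    type: str                 # one of MEMORY_TYPES
    importance: float = 0.5   # 0.0 (trivial) to 1.0 (critical); default assumed
    agent: str = ""           # the persona that created the record
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    last_accessed_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        if self.type not in MEMORY_TYPES:
            raise ValueError(f"unknown memory type: {self.type}")

# Example: a preference memory created by a copy-editing persona
m = MemoryRecord(content="Brand voice is direct and technical",
                 type="preference", agent="Linter")
```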
Layer 2: Filesystem Archive
Long-form content that does not fit in structured records – full research reports, extended notes, task histories – is stored as Markdown files on the filesystem. It is indexed in PocketBase for searchability, but the content lives in files that are fast to read and easy to inspect or edit manually.
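A sketch of that write path, under the assumption that only a lightweight index entry (path, title, summary) would go to PocketBase while the full content lives on disk; the directory layout and function name are hypothetical:

```python
from pathlib import Path

# Hypothetical archive location; the real layout is not specified here.
ARCHIVE_DIR = Path("memory_archive")

def archive_report(slug: str, title: str, body: str) -> dict:
    """Write long-form content to a Markdown file and return an index entry."""
    ARCHIVE_DIR.mkdir(exist_ok=True)
    path = ARCHIVE_DIR / f"{slug}.md"
    path.write_text(f"# {title}\n\n{body}\n", encoding="utf-8")
    # The index entry holds searchable metadata, not the full content
    return {"slug": slug, "title": title, "path": str(path), "summary": body[:200]}
```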
How Memory Is Used
When a cascade step invokes an agent, the memory system runs a relevance retrieval before the model call:
- The prompt and current context are vectorised (or keyword-matched in simpler mode)
- The memory store is searched for records with high relevance to the current task
- The top N memories are injected into the agent's context window
- The model call includes: current prompt + retrieved memories + active laws
After the model responds, the memory system evaluates the response for new memories worth storing:
- New facts learned
- New preferences expressed
- New context established
- Decisions made that should be recalled
These are written back to the memory store with appropriate type and importance scoring.
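A minimal sketch of that write-back step, with assumed per-type default importance values (preferences weighted highest, since they apply to every future session):

```python
# Hypothetical defaults: a memory stored without an explicit score
# falls back to a per-type baseline. Values are illustrative.
DEFAULT_IMPORTANCE = {"factual": 0.5, "preference": 0.8,
                      "contextual": 0.4, "agent": 0.5}

def store_candidates(candidates: list[dict], store: list[dict]) -> None:
    """Write candidate memories back with type-appropriate importance."""
    for c in candidates:
        store.append({
            "content": c["content"],
            "type": c["type"],
            "importance": c.get("importance", DEFAULT_IMPORTANCE[c["type"]]),
        })
```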
Memory Scoping
Memory is scoped at multiple levels:
User memories – tied to your account. Your preferences, your project context, your personal research history. Visible to all your agents.
Agent memories – tied to a specific persona. The Librarian's research findings, the Linter's style notes, the Cipher's security observations. Each agent builds a specialised memory that does not pollute other agents' contexts.
Session memories – tied to a specific cascade run or conversation. Temporary context that is relevant only within a session, not persisted long-term.
Global memories – shared across the platform instance. Team-wide facts, shared project context, common knowledge that all agents should have access to.
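The four scopes above reduce to a single visibility check. The scope names and `owner` field here are illustrative, not the actual schema:

```python
def visible(mem: dict, user: str, agent: str, session: str) -> bool:
    """A memory is visible if it is global, or owned by this user,
    this agent persona, or the live session."""
    scope, owner = mem["scope"], mem["owner"]
    return (scope == "global"
            or (scope == "user" and owner == user)
            or (scope == "agent" and owner == agent)
            or (scope == "session" and owner == session))
```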
Importance Scoring and Pruning
Not every piece of information is equally worth remembering. The memory system assigns importance scores to memories based on:
- How often the memory has been accessed
- How recently it was accessed
- Whether it was flagged as high-importance at creation
- Whether it was used in a decision that was subsequently validated
Memories below a threshold score are pruned in weekly maintenance runs. The system does not grow unbounded – it prioritises what is actually useful and clears what is not.
You can also manually manage memories: browse the memory store, edit records, force-archive memories that are no longer relevant, or flag memories as permanent (never pruned).
Memory in Practice
Research continuity – The Librarian remembers what it has already researched. When you ask for a briefing on a topic it has covered before, it retrieves its previous findings, identifies the gap between what it knows and what is current, and only researches what has changed.
Style consistency – If you tell an agent "our brand voice is direct and technical, no filler phrases", that preference is stored. Every subsequent cascade for that agent respects it without you restating it.
Project continuity – A cascade monitoring a competitor can store "as of last week, their pricing was X." Next week, it retrieves this fact, compares it to the current crawl, and accurately reports "pricing changed from X to Y" rather than just "current pricing is Y."
Cross-session task tracking – Agents can store partial work: "I have completed steps 1–3 of this analysis; steps 4–5 remain." If the session ends before the task is complete, the next run picks up from where the previous one left off.
Version History
The memory system has evolved significantly. v9.1 introduces:
- Relevance-scored retrieval – moved from recency-only to a weighted relevance + recency hybrid
- Cross-agent memory sharing – agents can explicitly share memories with other personas
- Memory diff tracking – when a memory is updated, the previous version is archived rather than deleted
- Pruning controls – configurable pruning thresholds per memory type
Earlier versions (v1–v8) iterated through basic persistence, then structured typing, then indexing, then importance scoring. v9.x is the first version that operates reliably at scale with hundreds of active memories without context window bloat.
Your agents do not just work. They learn. And they remember.