The Stateless Problem
By default, every AI conversation starts blank. The model has no memory of your previous sessions, your preferences, your context, or your history. Each time you start a new session, you are starting from zero.
For simple chat use cases, this is acceptable. For an AI workforce that operates over days, weeks, and months, it is a fundamental limitation.
An agent that cannot remember what it has already researched will re-research it. An agent that cannot remember your preferences will need to be told them every session. An agent that cannot carry context between tasks cannot do work that spans multiple sessions.
Smart Memory fixes this.
Two-Layer Architecture
The Smart Memory System v9.1 operates in two layers:
Layer 1: PocketBase Structured Memory
Typed memory records stored in PocketBase with full query capability:
- Factual memories – specific facts, research findings, data points
- Preference memories – user preferences, style guides, workflow preferences
- Contextual memories – project context, ongoing work, current priorities
- Agent memories – what each agent has done, learned, and decided
Each memory record has: content, type, importance score, created_at, last_accessed_at, and the agent that created it.
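A minimal sketch of such a record, assuming illustrative Python field names and types; the actual PocketBase schema is not specified here and may differ:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# The four memory types described above. Names are illustrative.
MEMORY_TYPES = {"factual", "preference", "contextual", "agent"}

@dataclass
class MemoryRecord:
    content: str
    type: str                 # one of MEMORY_TYPES
    importance: float = 0.5   # 0.0 (trivial) to 1.0 (critical); default assumed
    agent: str = ""           # the persona that created the record
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    last_accessed_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def __post_init__(self):
        if self.type not in MEMORY_TYPES:
            raise ValueError(f"unknown memory type: {self.type}")

# Example: a preference memory created by a copy-editing persona
m = MemoryRecord(content="Brand voice is direct and technical",
                 type="preference", agent="Linter")
```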
Layer 2: Filesystem Archive
Long-form content that does not fit in structured records – full research reports, extended notes, task histories – is stored as Markdown files on the filesystem. It is indexed in PocketBase for searchability, but the content lives in files that are fast to read and easy to inspect or edit manually.
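A sketch of that write path, under the assumption that only a lightweight index entry (path, title, summary) would go to PocketBase while the full content lives on disk; the directory layout and function name are hypothetical:

```python
from pathlib import Path

# Hypothetical archive location; the real layout is not specified here.
ARCHIVE_DIR = Path("memory_archive")

def archive_report(slug: str, title: str, body: str) -> dict:
    """Write long-form content to a Markdown file and return an index entry."""
    ARCHIVE_DIR.mkdir(exist_ok=True)
    path = ARCHIVE_DIR / f"{slug}.md"
    path.write_text(f"# {title}\n\n{body}\n", encoding="utf-8")
    # The index entry holds searchable metadata, not the full content
    return {"slug": slug, "title": title, "path": str(path), "summary": body[:200]}
```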
How Memory Is Used
When a cascade step invokes an agent, the memory system runs a relevance retrieval before the model call:
- The prompt and current context are vectorised (or keyword-matched in simpler mode)
- The memory store is searched for records with high relevance to the current task
- The top N memories are injected into the agent's context window
- The model call includes: current prompt + retrieved memories + active laws
After the model responds, the memory system evaluates the response for new memories worth storing:
- New facts learned
- New preferences expressed
- New context established
- Decisions made that should be recalled
These are written back to the memory store with appropriate type and importance scoring.
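A minimal sketch of that write-back step, with assumed per-type default importance values (preferences weighted highest, since they apply to every future session):

```python
# Hypothetical defaults: a memory stored without an explicit score
# falls back to a per-type baseline. Values are illustrative.
DEFAULT_IMPORTANCE = {"factual": 0.5, "preference": 0.8,
                      "contextual": 0.4, "agent": 0.5}

def store_candidates(candidates: list[dict], store: list[dict]) -> None:
    """Write candidate memories back with type-appropriate importance."""
    for c in candidates:
        store.append({
            "content": c["content"],
            "type": c["type"],
            "importance": c.get("importance", DEFAULT_IMPORTANCE[c["type"]]),
        })
```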
Memory Scoping
Memory is scoped at multiple levels:
User memories – tied to your account. Your preferences, your project context, your personal research history. Visible to all your agents.
Agent memories – tied to a specific persona. The Librarian's research findings, the Linter's style notes, the Cipher's security observations. Each agent builds a specialised memory that does not pollute other agents' contexts.
Session memories – tied to a specific cascade run or conversation. Temporary context that is relevant only within a session, not persisted long-term.
Global memories – shared across the platform instance. Team-wide facts, shared project context, common knowledge that all agents should have access to.
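The four scopes above reduce to a single visibility check. The scope names and `owner` field here are illustrative, not the actual schema:

```python
def visible(mem: dict, user: str, agent: str, session: str) -> bool:
    """A memory is visible if it is global, or owned by this user,
    this agent persona, or the live session."""
    scope, owner = mem["scope"], mem["owner"]
    return (scope == "global"
            or (scope == "user" and owner == user)
            or (scope == "agent" and owner == agent)
            or (scope == "session" and owner == session))
```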
Importance Scoring and Pruning
Not every piece of information is equally worth remembering. The memory system assigns importance scores to memories based on:
- How often the memory has been accessed
- How recently it was accessed
- Whether it was flagged as high-importance at creation
- Whether it was used in a decision that was subsequently validated
Memories below a threshold score are pruned in weekly maintenance runs. The system does not grow unbounded – it prioritises what is actually useful and clears what is not.
You can also manually manage memories: browse the memory store, edit records, force-archive memories that are no longer relevant, or flag memories as permanent (never pruned).
Memory in Practice
Research continuity – The Librarian remembers what it has already researched. When you ask for a briefing on a topic it has covered before, it retrieves its previous findings, identifies the gap between what it knows and what is current, and only researches what has changed.
Style consistency – If you tell an agent "our brand voice is direct and technical, no filler phrases", that preference is stored. Every subsequent cascade for that agent respects it without you restating it.
Project continuity – A cascade monitoring a competitor can store "as of last week, their pricing was X." Next week, it retrieves this fact, compares it to the current crawl, and accurately reports "pricing changed from X to Y" rather than just "current pricing is Y."
Cross-session task tracking – Agents can store partial work: "I have completed steps 1–3 of this analysis; steps 4–5 remain." If the session ends before the task is complete, the next run picks up from where the previous one left off.
Version History
The memory system has evolved significantly. v9.1 introduces:
- Relevance-scored retrieval – moved from recency-only to a weighted relevance + recency hybrid
- Cross-agent memory sharing – agents can explicitly share memories with other personas
- Memory diff tracking – when a memory is updated, the previous version is archived rather than deleted
- Pruning controls – configurable pruning thresholds per memory type
Earlier versions (v1–v8) iterated through basic persistence, then structured typing, then indexing, then importance scoring. v9.x is the first version that operates reliably at scale with hundreds of active memories without context window bloat.
Your agents do not just work. They learn. And they remember.