ContextBank vs Vector Databases: Why Memory Is Not Embeddings

What ContextBank is designed to solve

A production AI agent runs on a focused set of tasks that have to be tracked over a long period of time. The work the agent is doing is often bigger than the context window, the file, or the dataset it has access to. The agent needs to keep the exact right information present at the exact right moment, and that information has to survive when the task it serves is bigger than any single prompt can hold.

ContextBank is the system that solves that problem. It is a thread-safe, persistent, page-key-partitioned memory layer designed for production agent state. It approaches semantic meaning the way humans and LLMs actually process it: through weighted lorebook activation by substring match. A lorebook entry fires when a specific substring is present in the agent’s context, and the weight and gating controls (linked keys, alias keys, required keys) keep false hits rare even under heavy memory pressure. The correct slice of data survives across turns, across agents, and across the full length of a task that exceeds the window.

The mechanism is structurally aligned with how the LLM itself works. LLMs output text by passing prompts and context forward over a series of steps in some orchestration pattern: Manifold, Junction, DistributionGrid, or any other container. The token stream that the LLM produces and consumes is the same string that substring activation matches against. Concepts and the connections between them are carried by the words themselves, in human language and in LLM token streams alike, so a substring match against a weighted, well-keyed lorebook entry is likely to land on the correct values in memory when the lorebook and the context bank are written with care. The retrieval mechanism is not layered on top of the agent’s token stream; it reads out of the stream that the agent is already producing and consuming.

Ten Trillion Triangles TPipe was built around that alignment. The memory system is what makes 120-turn agents, persistent lorebook state, and long-horizon task execution tractable. The substrate is the layer that decides which substring triggers fire, which entries get gated, and which page-key partitions get retrieved. That substrate decision is the architectural fact that distinguishes the agent memory layer from the document retrieval layer, and the substrate decision is itself informed by how the LLM token stream is going to flow through the orchestration pattern that surrounds it.

What vector databases are designed to solve

Vector databases long predate LLMs as production deployment targets. They were built for document retrieval, semantic search, and recommendation systems. They take an embedding of a query and return the top-k nearest neighbors in a learned representation space. Whether the metric is cosine similarity, dot product, or Euclidean distance, the retrieval shape is the same regardless of use case.

The mechanism is built for the problem it was designed to solve. A corpus of PDFs, support tickets, or knowledge-base articles is too large to keep in memory. Approximate semantic similarity over a high-dimensional float space finds the relevant documents in milliseconds. The retrieval is “everything that matches the query in embedding space, ranked by similarity.”
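The retrieval shape described above can be sketched in a few lines. This is a minimal illustration, not how any particular vector database is implemented: it scores every document exactly (production systems use approximate indexes), and the `EmbeddedDoc` class, the `cosine` and `topK` helpers, and the three-dimensional toy embeddings are all hypothetical.

```kotlin
import kotlin.math.sqrt

// Hypothetical document record: an id plus a precomputed embedding.
class EmbeddedDoc(val id: String, val embedding: DoubleArray)

// Cosine similarity between two vectors of equal length.
fun cosine(a: DoubleArray, b: DoubleArray): Double {
    require(a.size == b.size) { "dimension mismatch" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    if (normA == 0.0 || normB == 0.0) return 0.0
    return dot / (sqrt(normA) * sqrt(normB))
}

// Exact (brute-force) top-k: score every document, sort descending, keep k.
fun topK(query: DoubleArray, docs: List<EmbeddedDoc>, k: Int): List<Pair<String, Double>> =
    docs.map { it.id to cosine(query, it.embedding) }
        .sortedByDescending { it.second }
        .take(k)

fun main() {
    val docs = listOf(
        EmbeddedDoc("refund-policy", doubleArrayOf(0.9, 0.1, 0.0)),
        EmbeddedDoc("shipping-times", doubleArrayOf(0.1, 0.9, 0.0)),
        EmbeddedDoc("returns-faq", doubleArrayOf(0.8, 0.2, 0.1))
    )
    val hits = topK(doubleArrayOf(1.0, 0.0, 0.0), docs, k = 2)
    println(hits.map { it.first }) // the two documents closest to the query
}
```

The result is always a ranked list of length k. Nothing in the signature lets a caller say that one document should appear only if another is also relevant.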

The mechanism is also exactly the wrong shape for what production agents need, for three reasons.

First, vector search returns the wrong shape of result. It returns the nearest matches, ranked by similarity, but the agent does not need everything that resembles the query. The agent needs the specific slice of state that the current task requires, gated by the relationships and dependencies that the lorebook encoding captures. A vector index has no concept of conditional gating, no concept of “this entry fires only when this other entry is also present.” Top-k similarity retrieval returns a flat ranked list, with no way to express the dependency graph that a real agent task requires.

Second, the substrate that decides what the agent sees cannot be a vector index if the goal is production safety. A vector index that returns the top-k nearest neighbors is a “show everything” surface. A weighted lorebook entry that fires only when its gating conditions are met is a “show exactly this” surface. The latter is a control surface. The former is an index. Production agent systems at Ten Trillion Triangles need control.

Third, vector search does not approach semantic meaning the way humans and LLMs do. Embeddings compress meaning into a fixed-dimensional float vector that loses the structural relationships between concepts. Substring activation over a named lorebook entry preserves the structural relationships explicitly: the entry has a name, a weight, linked keys that fire with it, alias keys that activate the same entry under different triggers, and required keys that must be present for the entry to fire at all. That structural encoding is what allows the memory layer to behave predictably under pressure, to be scripted by developers, and to be introspected by agents that were not designed when the entry was written.

The category error that ships most often in production agent systems is reaching for a vector database to solve the ContextBank problem. The two abstractions are not interchangeable. They solve different problems, and the failure mode of using the wrong one is the same every time: an agent that retrieves embeddings when it should be reading state, or reads state when it should be retrieving embeddings.

What ContextBank actually does

ContextBank is the agent memory layer. It is implemented as a Kotlin singleton at Context/ContextBank.kt:46. The bank’s storage is a ConcurrentHashMap<String, ContextWindow> (line 58). Every entry is keyed by a string, and every entry holds a ContextWindow containing the state.

The retrieval mechanism is weighted lorebook activation. A lorebook entry has a key, a value, a weight, optional linked keys, optional alias keys, and optional required keys (Context/LoreBook.kt:50-66). The key is a substring trigger. When the agent’s context contains the substring, the entry fires and injects the value into the context window. Linked keys fire together. Required keys must be present for the entry to fire. Weight controls the ordering when multiple entries fire on the same context. The full set of gating controls keeps false hits rare and keeps the correct slice of data present at the exact right moment.
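The activation rules in the paragraph above can be modeled in a short, self-contained sketch. This is not the TPipe `LoreBook` class: `LoreEntry`, `triggered`, `gateOpen`, and `activate` are hypothetical names, and three details are assumptions made for illustration (matching is case-insensitive, linked entries are pulled in one hop out, and linked entries still have to pass their own required-key gate).

```kotlin
// Hypothetical model of a lorebook entry; field names mirror the prose above,
// not the actual TPipe class.
data class LoreEntry(
    val key: String,
    val value: String,
    val weight: Int = 0,
    val linkedKeys: List<String> = emptyList(),
    val aliasKeys: List<String> = emptyList(),
    val requiredKeys: List<String> = emptyList()
)

// Substring trigger: the key or any alias appears in the context.
fun triggered(entry: LoreEntry, context: String): Boolean =
    (listOf(entry.key) + entry.aliasKeys).any { context.contains(it, ignoreCase = true) }

// Required-key gate: every required key must appear in the context.
fun gateOpen(entry: LoreEntry, context: String): Boolean =
    entry.requiredKeys.all { context.contains(it, ignoreCase = true) }

fun activate(entries: List<LoreEntry>, context: String): List<LoreEntry> {
    val byKey = entries.associateBy { it.key }
    // Pass 1: entries whose trigger matches and whose gate is open.
    val direct = entries.filter { triggered(it, context) && gateOpen(it, context) }
    // Pass 2 (assumed semantics): pull in linked entries one hop out, still gated.
    val linked = direct.flatMap { it.linkedKeys }
        .mapNotNull { byKey[it] }
        .filter { gateOpen(it, context) }
    // De-duplicate, then order by descending weight.
    return (direct + linked).distinct().sortedByDescending { it.weight }
}

fun main() {
    val book = listOf(
        LoreEntry("Alice", "Alice is the village blacksmith.", weight = 5,
            aliasKeys = listOf("the smith"), linkedKeys = listOf("forge")),
        LoreEntry("forge", "The forge burned down last winter.", weight = 3),
        LoreEntry("sword", "The sword is cursed.", weight = 9, requiredKeys = listOf("crypt"))
    )
    val fired = activate(book, "The player asks the smith about a sword.")
    println(fired.map { it.key })
}
```

In this run the "Alice" entry fires through its alias, "forge" comes along through the link, and "sword" stays silent despite having the highest weight, because its required key "crypt" is absent from the context. The same context string always produces the same activation list.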

The state is partitioned by page keys. The “story” page holds narrative state. The “world” page holds world rules. The “player-N” page holds per-player state. Each page is independently retrievable, independently writable, and independently mutex-locked. ContextBank.getPageKeys() (line 1198) lists all registered pages. ContextBank.emplaceWithMutex(key, window) writes to a page with mutex serialization. ContextBank.getContextFromBank(key, copy, skipRemote) reads from a page.

The bank is also a developer-facing surface. It is the place where the developer can pre-populate state, dump program data into visible fetchable points, and have those points retrieved automatically by any subsequent agent, including agents that were constructed on the fly after the data was written. The page-key abstraction is the contract: the developer writes to "story" and "world" and "player-1", and any agent that asks for those page keys gets the data without the developer wiring anything further. The agents can introspect on the same data by calling getPageKeys() and reading what is present. Both directions, one contract.

The mutex is the difference between production and prototype. The bank exposes plain emplace (line 414) and mutex-guarded emplaceWithMutex (line 474), updateBankedContextWithMutex (line 924), and swapBankWithMutex (line 960). Concurrent writers using plain emplace write directly to the ConcurrentHashMap and race under load. The mutex-guarded variants serialize access at the page-key level and produce consistent state even when multiple agents write the same page in the same turn. Production code uses the mutex variants. Every example in the Ten Trillion Triangles codebase uses the mutex variants.

// Read from the "story" page
val storyContext = ContextBank.getContextFromBank("story")

// Add one lorebook entry per extracted event
// (extractedEvents and serialize are assumed to be defined elsewhere)
for (event in extractedEvents) {
    storyContext.addLoreBookEntry(
        key = event.name,
        value = serialize(event),
        weight = event.significance,
        aliasKeys = event.aliases
    )
}

// Mutex-guarded write: concurrent agents racing on "story" serialize here
ContextBank.emplaceWithMutex("story", storyContext)

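Read the call sequence carefully: read the page, mutate the lorebook entries in memory, write the page back under the mutex. The mutex makes the write atomic from the perspective of other agents reading the same page. Note what it does and does not cover. It serializes the write itself; it does not make the whole read-modify-write sequence atomic. If two agents each read the same page into their own copy, add an entry, and write back, the writes are serialized but the second still replaces the first, and one entry is silently lost. When several agents contribute entries to the same page in the same turn, the merge has to happen inside the critical section.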

The same pattern holds for updateBankedContextWithMutex, which atomically merges new state into an existing page. The merge happens under the mutex. No torn writes, no lost entries, no corrupted state.
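The difference between a plain write and a merge inside a per-page critical section can be shown with a toy model. This is a sketch under stated assumptions, not the ContextBank implementation: `TinyBank`, `read`, `emplace`, and `mergeWithLock` are hypothetical, a page is reduced to a set of entry names, and a `ReentrantLock` per page key stands in for whatever mutex the real bank uses.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.thread
import kotlin.concurrent.withLock

// Hypothetical stand-in for a bank: page key -> immutable set of entry names.
class TinyBank {
    private val pages = ConcurrentHashMap<String, Set<String>>()
    private val locks = ConcurrentHashMap<String, ReentrantLock>()

    fun read(page: String): Set<String> = pages[page] ?: emptySet()

    // Plain write: replaces the page with whatever snapshot the caller holds.
    fun emplace(page: String, entries: Set<String>) {
        pages[page] = entries
    }

    // Merge under a per-page lock: the read, the union, and the write happen
    // inside one critical section, so no concurrent addition is dropped.
    fun mergeWithLock(page: String, additions: Set<String>) {
        val lock = locks.computeIfAbsent(page) { ReentrantLock() }
        lock.withLock {
            pages[page] = read(page) + additions
        }
    }
}

fun main() {
    val bank = TinyBank()
    // Fifty writers each add one distinct entry to the same page.
    val writers = (1..50).map { n ->
        thread { bank.mergeWithLock("story", setOf("event-$n")) }
    }
    writers.forEach { it.join() }
    println(bank.read("story").size) // prints 50: every writer's entry survived
}
```

If each writer instead called `read`, added its entry to the snapshot, and then called `emplace`, the final count could come out below 50, because a later snapshot can overwrite an earlier writer's addition. Locks are kept per page key, so writers to different pages never wait on each other.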

The contract ContextBank exposes

ContextBank is a contract. The contract is uniform for both the agent and the developer: read a page key, get a ContextWindow. Write a page key, persist a ContextWindow. Everything else, where the data physically lives, how it is retrieved, how it is written back, what database backs it, whether it goes through a network hop or stays in process memory, is an implementation detail behind the contract.

The implementation surface behind the contract is pluggable. The bank exposes a RetrievalFunction typealias (Context/ContextBank.kt:21) and a WriteBackFunction typealias (Context/ContextBank.kt:31). Any developer can register a function that retrieves a ContextWindow from any source: the file system, a remote MemoryServer running on another node, an external Postgres database, a Pinecone vector index, or any custom backend. The bank calls the registered function by page key. The agent never knows.

// Register a custom retrieval: TPipe calls this when an agent asks for "ext-system"
ContextBank.registerRetrievalFunction("ext-system") { key ->
    val pgRow = postgresClient.fetchOne("SELECT payload FROM ext_state WHERE key = ?", key)
    deserializeContextWindow(pgRow.payload)
}

// Register a custom write-back: TPipe calls this when an agent writes "ext-system"
ContextBank.registerWriteBackFunction("ext-system") { key, window ->
    // EXCLUDED refers to the row proposed for insertion, so the payload is bound only once
    postgresClient.execute(
        "INSERT INTO ext_state (key, payload) VALUES (?, ?) " +
        "ON CONFLICT (key) DO UPDATE SET payload = EXCLUDED.payload",
        key, serialize(window)
    )
}

The five built-in StorageMode values cover the standard cases: MEMORY_ONLY, MEMORY_AND_DISK, DISK_ONLY, DISK_WITH_CACHE, REMOTE. Any of them is set per page key, not globally. The "story" page can be MEMORY_AND_DISK for fast local access, the "audit-log" page can be DISK_ONLY for memory efficiency, the "shared-world" page can be REMOTE for cross-node coordination. The agent code does not change when the storage mode changes. The developer changes one line of configuration and the contract holds.
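Per-page configuration of this kind amounts to a lookup table from page key to mode. The sketch below is a local illustration only: the enum mirrors the five mode names listed above but is not the TPipe type, and `pageModes`, `modeFor`, and the `MEMORY_ONLY` fallback for unconfigured pages are assumptions.

```kotlin
// Local mirror of the five mode names listed above; not the TPipe enum itself.
enum class StorageMode { MEMORY_ONLY, MEMORY_AND_DISK, DISK_ONLY, DISK_WITH_CACHE, REMOTE }

// Hypothetical per-page configuration table.
val pageModes: Map<String, StorageMode> = mapOf(
    "story" to StorageMode.MEMORY_AND_DISK,
    "audit-log" to StorageMode.DISK_ONLY,
    "shared-world" to StorageMode.REMOTE
)

// Assumed default for pages with no explicit configuration.
fun modeFor(pageKey: String): StorageMode = pageModes[pageKey] ?: StorageMode.MEMORY_ONLY

fun main() {
    for (page in listOf("story", "audit-log", "shared-world", "scratch")) {
        println("$page -> ${modeFor(page)}")
    }
}
```

Changing where a page lives means changing one entry in the table. Code that asks for the page by key is untouched.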

This is the architectural seam that the rest of the substrate depends on. The agent’s call site is ContextBank.getContextFromBank(key). The function returns a ContextWindow. The agent code reads lorebook entries from the window, fires triggers, builds a response. None of that code knows, or needs to know, whether the window came from a .bank file on local disk, an HTTP round-trip to a remote MemoryServer, a function the developer registered that pulls from Postgres, or a function the developer registered that routes to a RAG system. The contract is the same.
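One way to picture the seam is a bank that consults a registered function first and falls back to process memory otherwise. The sketch below is a hypothetical model, not the TPipe source: `Window`, `Retrieve`, `WriteBack`, and `PluggableBank` are invented for illustration, the real typealiases may have different shapes, and the plain mutable maps are not thread-safe.

```kotlin
// Hypothetical stand-ins; the real TPipe types may look different.
data class Window(val entries: Map<String, String>)
typealias Retrieve = (String) -> Window?
typealias WriteBack = (String, Window) -> Unit

class PluggableBank {
    private val memory = mutableMapOf<String, Window>()
    private val retrievers = mutableMapOf<String, Retrieve>()
    private val writers = mutableMapOf<String, WriteBack>()

    fun registerRetrieval(page: String, fn: Retrieve) { retrievers[page] = fn }
    fun registerWriteBack(page: String, fn: WriteBack) { writers[page] = fn }

    // Same call site for every page: a registered function is consulted first,
    // and process memory is the fallback.
    fun get(page: String): Window? = retrievers[page]?.invoke(page) ?: memory[page]

    // Writes go to the registered write-back if one exists, else to memory.
    fun put(page: String, window: Window) {
        val writer = writers[page]
        if (writer != null) writer(page, window) else memory[page] = window
    }
}

fun main() {
    val bank = PluggableBank()
    // Stands in for Postgres, a remote server, or any other backend.
    val fakeDatabase = mutableMapOf<String, Window>()
    bank.registerRetrieval("ext-system") { key -> fakeDatabase[key] }
    bank.registerWriteBack("ext-system") { key, window -> fakeDatabase[key] = window }

    bank.put("story", Window(mapOf("Alice" to "blacksmith"))) // stays in memory
    bank.put("ext-system", Window(mapOf("quota" to "42")))    // routed to the backend

    println(bank.get("story")?.entries)
    println(bank.get("ext-system")?.entries)
    println(fakeDatabase.keys) // only the routed page reached the backend
}
```

The caller uses `get` and `put` for both pages and cannot tell from the call site which one crossed into the backend.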

The contract is also what kills the hyper-band-aid, bolted-on pattern that dominates most agent frameworks. Components from different teams, designed against different assumptions, glued together with custom adapter code: every join is a class of bugs, every glue point is a footgun, every new component requires a new adapter. The TPipe substrate does not ship that pattern because every layer in Ten Trillion Triangles TPipe is built to the same contract, by the same team, against the same assumptions. The agent code calls one set of methods. The developer configures the storage once. The retrieval and write-back behaviors compose without adapters.

What page keys buy you

Page keys are the architectural fact that makes multi-agent state tractable at scale. Without page keys, every retrieval loads the full state graph and every write serializes across the whole bank. With page keys, each retrieval is bounded, each write is bounded, and the mutex contention is bounded.

The canonical pattern at Ten Trillion Triangles TPipe uses four page-key categories:

Per-entity state. "player-1", "npc-alice", "location-blacksmith". Each entity gets its own page. Mutex contention is per-entity.
Per-concern state. "story", "world-rules", "task-list". Each concern is independent of the others.
Per-pipeline state. "pipeline-stage-3", "agent-junction-vote". Pipeline state lives in its own page, separately from agent state.
Per-session state. "session-2026-06-26-turn-47". Session-scoped state does not contaminate cross-session state.

The pattern is recursive. A page key can hold a ContextWindow that contains lorebook entries, mini-bank pages, and a dictionary. The page key is the unit of retrieval. The lorebook entries inside the page are the unit of activation. The two scales compose.
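A small helper can keep the four categories consistent and make the contention boundary visible. This is a hypothetical sketch: `PageKeys`, `PendingWrite`, and `contentionGroups` are not part of TPipe, and the key formats simply follow the example strings above.

```kotlin
import java.time.LocalDate

// Hypothetical key builders, one per category described above.
object PageKeys {
    fun entity(kind: String, id: String) = "$kind-$id"
    fun concern(name: String) = name
    fun pipeline(stage: Int) = "pipeline-stage-$stage"
    fun session(date: LocalDate, turn: Int) = "session-$date-turn-$turn"
}

data class PendingWrite(val agent: String, val pageKey: String)

// Writes contend only when they target the same page key, so grouping by key
// shows exactly which agents would queue behind one another.
fun contentionGroups(writes: List<PendingWrite>): Map<String, List<String>> =
    writes.groupBy({ it.pageKey }, { it.agent }).filterValues { it.size > 1 }

fun main() {
    val writes = listOf(
        PendingWrite("narrator", PageKeys.concern("story")),
        PendingWrite("combat", PageKeys.entity("player", "1")),
        PendingWrite("dialogue", PageKeys.concern("story")),
        PendingWrite("planner", PageKeys.pipeline(3))
    )
    println(contentionGroups(writes))
    println(PageKeys.session(LocalDate.of(2026, 6, 26), 47))
}
```

Of the four writes in this turn, only the two aimed at "story" share a page. The writes to "player-1" and "pipeline-stage-3" proceed without waiting on anyone.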

The architectural fact is that page keys make multi-agent systems tractable. A two-agent system can run without page keys and survive. A ten-agent system starts to race. A hundred-agent system falls apart. Page keys are the difference between a multi-agent demo and a multi-agent production system.

Why this distinction matters now

RAG over documents is a vector database problem. State across 120-turn autonomous agents is a ContextBank problem. Production agent systems need both, and conflating them produces agents that retrieve embeddings when they should be reading state, or read state when they should be retrieving embeddings. The substrate is the architecture that decides which one a given retrieval is.

The Convex context window manages lorebook entries, weighted keys, and lorebook selection across the agent’s lifetime. The ContextBank holds persistent state across turns and across agents. The P2P layer coordinates which agents write to which page keys. Together, these are the Ten Trillion Triangles TPipe substrate’s memory architecture. The vector database sits beside the substrate for corpus retrieval, not inside it for state.

If you are building a production agent system, the first decision is the page key taxonomy. The second decision is which retrieval is a vector database problem and which is a ContextBank problem. Once those decisions are made, the architecture is tractable. The errors come from skipping the decisions and reaching for the default.

Frequently Asked Questions

What is the difference between ContextBank and a vector database?

A vector database stores embeddings and retrieves by cosine similarity, dot product, or Euclidean distance over high-dimensional float vectors. ContextBank stores weighted lorebook entries keyed by strings, retrieves by substring-triggered activation across named page keys, and writes through mutex-guarded concurrent operations. A vector database is a similarity index. ContextBank is agent state architecture. They solve different problems; conflate them at your peril.

When should I use a vector database?

Use a vector database when you are retrieving over documents, semantic similarity matters more than exact match, and your corpus is too large to keep in memory. For retrieval-augmented generation over a corpus of PDFs, semantic search over a knowledge base, or finding similar tickets in a support archive, vector databases are the right tool. ContextBank is not built for that.

When should I use ContextBank instead of a vector database?

Use ContextBank when you need persistent agent state across runs, when multiple agents must read and write the same memory concurrently without corruption, when lorebook entries should activate by substring triggers rather than semantic similarity, and when your retrieval must be deterministic across identical inputs. This is the production agent substrate pattern, not the document retrieval pattern.

Can ContextBank and a vector database coexist in the same system?

Yes. Production agent systems at Ten Trillion Triangles use vector databases for corpus retrieval and ContextBank for state. The substrate is the architecture that decides which one a given retrieval is. Conflating them produces agents that retrieve embeddings when they should be reading state, and agents that read state when they should be retrieving embeddings. The two are not interchangeable.

What is a lorebook entry in ContextBank?

A lorebook entry is a weighted key-value record. The key is a substring trigger. The value is the content to inject into the context window when the trigger fires. Each entry has a weight, optional linked keys that fire with it, optional alias keys that activate the same entry under different triggers, and optional required keys that must be present for the entry to fire. Retrieval is activation by substring, not similarity over embeddings. Identical inputs produce identical activations.

How does ContextBank handle concurrent writes from multiple agents?

ContextBank exposes mutex-guarded variants of every write operation: `emplaceWithMutex`, `updateBankedContextWithMutex`, and `swapBankWithMutex` are the production entry points. Plain `emplace` writes directly to the bank's `ConcurrentHashMap`, so concurrent writers using plain `emplace` can race. Mutex-guarded operations serialize access at the page-key level and produce consistent state even when multiple agents write the same page in the same turn. The mutex is the difference between production and prototype.

Why use page keys instead of a single global memory?

Page keys partition state by concern. The "story" page key holds narrative state. The "world" page key holds world rules. The "player-N" page key holds per-player state. Each page is independently retrievable, independently writable, and independently mutex-locked. A single global memory forces every retrieval to load the full state graph and serializes every write across the whole bank. Page keys are what makes multi-agent state tractable at scale.

Does ContextBank support embedding-based retrieval?

Not natively, and that is by design. ContextBank is built for deterministic state retrieval across named page keys, not for similarity search over unstructured text. A registered retrieval function can route a page key to a vector-backed source, but the similarity search itself lives in that backend, not in the bank. If you need semantic similarity over a document corpus, use a vector database. If you need persistent agent state with weighted activation, use ContextBank. Ten Trillion Triangles TPipe supports both through the substrate, but they are separate abstractions with separate APIs.