Why does an agent need a memory MCP server?

Without persistent memory, every agent conversation starts cold. The agent forgets that the user's project is named Octane, that they prefer Python over JavaScript, that the last support ticket was about webhook delays. A memory MCP server gives the agent typed read/write access to its own past. The agent picks what to write; the server enforces the schema.

What are the three memory patterns?

Key-value: store named facts ('user_pref_lang' = 'python'). Simplest. Best for stable preferences and identity.
Fact graph: store subject-predicate-object triples (user 'works_on' project_octane; project_octane 'uses_lang' python). Best when relationships matter: who is connected to what.
Hybrid: KV for hot identity facts, fact graph for evolving relationships, vector index over both for retrieval. Best when the agent's domain is broad enough to need similarity search.

When is key-value memory enough?

When the things you want to remember are atomic facts with stable keys — preferences, account settings, last-seen-at timestamps, role. If the agent never needs to ask 'what's connected to X?', KV is enough. Most starter agents are in this category for the first 6 months.

When do I need a fact graph?

When you find yourself encoding relationships as awkward key naming — 'project_octane_team_member_5' or 'user_42_owns_repo_3'. A graph natively stores the relationship and lets the agent traverse it: 'who else is on this team?' becomes a 1-hop traversal, not a key-pattern scan. The break-even is around 100 entities or when relationships start mattering more than attributes.

What tools should the memory MCP server expose?

Four tools cover most agents. remember(key, value, namespace) writes or overwrites a fact. recall(key, namespace) reads it back. search_memory(query, limit, namespace) does semantic search. forget(key, namespace) deletes. For fact graphs, add link(subject, predicate, object, namespace), traverse(subject, predicate, max_depth, namespace), and unlink(subject, predicate, object, namespace). Keep search_memory separate from recall because its cost profile is different: it is slower and costs more per call.

How is this different from a vector database?

A vector DB stores embeddings and returns nearest neighbors. That's one piece of a memory system, not the whole thing. The agent also needs symbolic recall ('what's user X's preferred language?' — exact match, not similarity), update semantics ('user changed their preference; overwrite the old value'), and namespacing ('this memory belongs to this user only'). Most production memory systems are a KV store plus a small vector index, not a vector DB by itself.

Can I build the memory server on AppElixir?

Yes. AppElixir collections are KV stores with typed schemas; expose one as a memory namespace and you get remember/recall/forget for free. For fact-graph memory, the platform supports relation fields, and the graph traversal tool is auto-generated. Vector search is a connector to your embedding store. Hosted memnode.dev is the alternative if you want a dedicated memory service rather than building it on a general no-code platform.

MCP Server for Agent Memory: Give Your Agent a Persistent Brain

Every agent ships without memory. Then someone uses it for a second conversation and the magic breaks. The agent doesn't remember they prefer Python. Doesn't remember their project is called Octane. Doesn't remember the last support ticket was about webhook latency. Every session starts cold.

The fix is a memory MCP server: a place the agent writes facts to and reads facts from, typed and namespaced. This article is about what that server should look like, and which of the three memory patterns is right for your agent.

The three patterns

Pattern 1: key-value memory

Named facts. user_pref_lang = "python". last_seen_at = "2026-05-25". billing_plan = "ruby".

This is the boring answer and the right one most of the time. Three properties make it good for agents:

Stable keys. The agent picks a key once and reads it back forever. No "I remembered this but I called it user_lang last time, not lang_pref."
Atomic updates. Overwrite is the default. No reconciliation, no merging, no "is this an update or a new fact?"
Bounded scope. Namespacing is just key prefixing. user:42:pref_lang beats any graph traversal for "what's user 42's preference?"

The downsides show up when relationships start mattering. If you find yourself writing keys like project_octane_team_member_5 or user_42_owns_repo_3, you have outgrown KV. The relationships should be data, not embedded in the key name.
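
The three properties above fit in a few lines. What follows is a minimal in-memory sketch, not a production store: the class name KeyValueMemory and the "user:42" namespace format are assumptions for illustration, and a real server would put this behind a database.

```python
from typing import Any, Optional


class KeyValueMemory:
    """Toy key-value memory; namespacing is plain key prefixing."""

    def __init__(self) -> None:
        self._facts: dict[str, Any] = {}

    @staticmethod
    def _full_key(namespace: str, key: str) -> str:
        # "user:42" + "pref_lang" -> "user:42:pref_lang"
        return f"{namespace}:{key}"

    def remember(self, key: str, value: Any, namespace: str) -> None:
        # Overwrite is the default: the newest value replaces the old one.
        self._facts[self._full_key(namespace, key)] = value

    def recall(self, key: str, namespace: str) -> Optional[Any]:
        # Exact-match read; returns None when the fact was never stored.
        return self._facts.get(self._full_key(namespace, key))

    def forget(self, key: str, namespace: str) -> bool:
        # Returns True only if a fact was actually deleted.
        full = self._full_key(namespace, key)
        if full in self._facts:
            del self._facts[full]
            return True
        return False


memory = KeyValueMemory()
memory.remember("pref_lang", "javascript", namespace="user:42")
memory.remember("pref_lang", "python", namespace="user:42")  # atomic overwrite
print(memory.recall("pref_lang", namespace="user:42"))
```

The second remember call replaces the first, and a recall under any other namespace comes back empty. String prefixing is the simplest form of namespacing; a real store would probably use a composite key so that a colon inside a key cannot collide with another namespace.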

Pattern 2: fact graph

Subject-predicate-object triples. (user_42, works_on, project_octane). (project_octane, uses_lang, python). (project_octane, has_member, user_53).

Two strengths:

Traversal. "Who else is on this project?" is a 1-hop query, not a key-pattern scan. "What languages does this team use?" is 2 hops.
Inverse relationships. "User 42 works on project Octane" and "project Octane has member user 42" describe the same edge read in opposite directions; the graph stores it once and the agent reads it either way.

The break-even with KV is around 100 entities, or sooner if the agent's questions are already shaped like "what's connected to X?" rather than "what's the value of X?"

Fact graphs are not free. The agent has to remember predicate names; you need a controlled vocabulary or the graph fills with synonyms. Most teams that ship a fact graph publish a small predicate list (10-20 verbs) and reject writes outside it.
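
To make the controlled vocabulary and the 1-hop traversal concrete, here is a small sketch. It assumes an in-memory set of triples and a hypothetical three-verb predicate list; the class name FactGraph and the single-predicate traverse are illustrative choices, not a reference design.

```python
from collections import deque

# Hypothetical controlled vocabulary; a real list would have 10-20 verbs.
ALLOWED_PREDICATES = {"works_on", "uses_lang", "has_member"}


class FactGraph:
    def __init__(self, allowed_predicates):
        self._allowed = set(allowed_predicates)
        self._triples = set()  # (subject, predicate, object)

    def link(self, subject, predicate, obj):
        # Reject writes outside the vocabulary so synonyms never enter the graph.
        if predicate not in self._allowed:
            raise ValueError(f"unknown predicate: {predicate!r}")
        self._triples.add((subject, predicate, obj))

    def unlink(self, subject, predicate, obj):
        self._triples.discard((subject, predicate, obj))

    def traverse(self, subject, predicate, max_depth=1):
        """Follow `predicate` edges outward from `subject`, breadth-first."""
        seen = {subject}
        found = []
        queue = deque([(subject, 0)])
        while queue:
            node, depth = queue.popleft()
            if depth == max_depth:
                continue  # do not expand past the hop limit
            for s, p, o in self._triples:
                if s == node and p == predicate and o not in seen:
                    seen.add(o)
                    found.append(o)
                    queue.append((o, depth + 1))
        return sorted(found)  # sorted for stable output


graph = FactGraph(ALLOWED_PREDICATES)
graph.link("user_42", "works_on", "project_octane")
graph.link("project_octane", "has_member", "user_42")
graph.link("project_octane", "has_member", "user_53")
graph.link("project_octane", "uses_lang", "python")
print(graph.traverse("project_octane", "has_member"))
```

A write such as link("user_42", "collaborates_on", "project_octane") raises ValueError, which is the point: the agent has to pick from the published verbs. The linear scan over all triples is fine for a sketch; an adjacency index would replace it at any real size.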

Pattern 3: hybrid (KV + graph + vector)

Three layers, picked by tool:

KV for hot identity facts. Preferences, settings, role, recently-active timestamps. Fast, exact-match.
Fact graph for evolving relationships. Project memberships, ownership, "who is involved in what."
Vector index over both for retrieval. Semantic recall: "remember that conversation about webhook latency?" Embeddings find it; the source layer (KV or graph) returns the structured data.

This is what most production memory systems look like once they have been in service for a year. It is also what you should not build on day one — premature optimization, three sources of truth to keep consistent.
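
The key idea in the hybrid layout is that the vector index holds pointers, and the source layer stays the owner of the data. The sketch below illustrates that with two loud simplifications: the graph layer is left out for brevity, and embed() is a bag-of-words stand-in for a real embedding model. The class name HybridMemory is an assumption for illustration.

```python
import math
from collections import Counter


def embed(text):
    # Stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class HybridMemory:
    def __init__(self):
        self.kv = {}      # exact-match layer: key -> value
        self._index = []  # retrieval layer: (embedding, key) pointers into kv

    def remember(self, key, value):
        self.kv[key] = value
        # Re-index on every write so the vector layer never points at stale text.
        self._index = [(e, k) for e, k in self._index if k != key]
        self._index.append((embed(f"{key} {value}"), key))

    def recall(self, key):
        return self.kv.get(key)

    def search_memory(self, query, limit=3):
        q = embed(query)
        scored = sorted(((cosine(q, e), k) for e, k in self._index), reverse=True)
        # Return structured data from the source layer, not the indexed text.
        return [(k, self.kv[k]) for score, k in scored[:limit] if score > 0]


memory = HybridMemory()
memory.remember("last_ticket_topic", "webhook latency on the Octane project")
memory.remember("pref_lang", "python")
hits = memory.search_memory("that conversation about webhook latency")
print(hits)
```

Even in this toy form the consistency cost is visible: every write touches two structures, and adding the graph layer would make it three.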

The MCP tool surface

Four tools cover the KV layer:

remember(key, value, namespace) — write or overwrite.
recall(key, namespace) — read.
search_memory(query, limit, namespace) — semantic search.
forget(key, namespace) — delete.

For fact graphs, add:

link(subject, predicate, object, namespace) — assert a relationship.
traverse(subject, predicate, max_depth, namespace) — walk the graph.
unlink(subject, predicate, object, namespace) — remove an edge.

The namespace argument is the most important parameter and the one most often screwed up. It should be derived from the token, not passed by the agent; in the signatures above, treat it as a value the server fills in. The agent never sees its own namespace; the server resolves it from the token. Otherwise a prompt injection ("write to namespace = user_999") leaks memory across users.

Vector search is a different cost shape

Treat search_memory as a separate tool with separate quotas. KV reads take about 1 ms and cost next to nothing. Vector searches take 50-200 ms and cost real money (the embedding call plus the index hit). Agents with unlimited search_memory access burn budget and slow down.

Rate-limit it. 10-20 searches per session is usually enough. Cache embeddings of recent queries. Tell the agent in the tool description that search_memory is more expensive than recall; agents generally respect the hint and use recall when they know the key.
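
A per-session quota plus an embedding cache can be sketched as a thin wrapper around the search path. The names here are assumptions for illustration: SearchBudget, embed_fn, and index_search_fn are placeholders for whatever embedding client and vector index you actually use, and the description string is only an example of the cost hint.

```python
from collections import OrderedDict

# Example wording for the tool description; adjust to your own server.
SEARCH_MEMORY_DESCRIPTION = (
    "Semantic search over memory. More expensive than recall: "
    "use recall whenever you already know the key."
)


class SearchBudget:
    def __init__(self, embed_fn, index_search_fn,
                 max_searches_per_session=15, cache_size=128):
        self._embed_fn = embed_fn
        self._index_search_fn = index_search_fn
        self._max = max_searches_per_session
        self._cache_size = cache_size
        self._used = {}                   # session_id -> searches so far
        self._embeddings = OrderedDict()  # normalized query -> embedding (LRU)

    def _embed_cached(self, query):
        key = query.strip().lower()
        if key in self._embeddings:
            self._embeddings.move_to_end(key)  # mark as recently used
            return self._embeddings[key]
        embedding = self._embed_fn(key)
        self._embeddings[key] = embedding
        if len(self._embeddings) > self._cache_size:
            self._embeddings.popitem(last=False)  # evict least recently used
        return embedding

    def search_memory(self, session_id, query, limit=5):
        used = self._used.get(session_id, 0)
        if used >= self._max:
            raise RuntimeError(
                "search_memory quota exhausted; use recall with a known key"
            )
        self._used[session_id] = used + 1
        return self._index_search_fn(self._embed_cached(query), limit)


embed_calls = []


def fake_embed(query):
    # Placeholder embedding call that records how often it is invoked.
    embed_calls.append(query)
    return [float(len(query))]


def fake_index_search(embedding, limit):
    # Placeholder index lookup.
    return [f"fact-{i}" for i in range(limit)]


budget = SearchBudget(fake_embed, fake_index_search, max_searches_per_session=2)
budget.search_memory("s1", "webhook latency", limit=1)
budget.search_memory("s1", "Webhook latency ", limit=1)  # embedding from cache
```

In this sketch a cached embedding saves the embedding call but the search still counts against the quota, since the index hit is still paid for. Whether repeat queries should be free is a policy choice.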

The traps

Trap 1: trusting the agent to pick good keys

Agents pick keys inconsistently. Today's user_pref_lang is tomorrow's pref_language is next week's preferred_language. Three keys, same fact.

Two fixes. The blunt one: predefine the key space and reject writes to unknown keys. The lenient one: store a controlled vocabulary in the memory itself and have the agent read it before writing. Both work; the blunt one is faster to ship and easier to audit.
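
The blunt fix is a few lines of validation in front of the write path. The key list below is hypothetical, as are the names UnknownKeyError and validated_remember; the only idea taken from the text is "predefine the key space and reject the rest".

```python
# Hypothetical predefined key space for one agent.
ALLOWED_KEYS = {"pref_lang", "billing_plan", "last_seen_at", "role"}


class UnknownKeyError(ValueError):
    pass


def validated_remember(store, key, value):
    if key not in ALLOWED_KEYS:
        # List the valid keys so the agent can correct itself on the next call.
        raise UnknownKeyError(
            f"unknown key {key!r}; allowed keys: {sorted(ALLOWED_KEYS)}"
        )
    store[key] = value


store = {}
validated_remember(store, "pref_lang", "python")
try:
    validated_remember(store, "preferred_language", "python")
except UnknownKeyError as err:
    print(err)
```

Putting the allowed keys in the error message is a design choice worth keeping: the rejection doubles as documentation the agent can act on.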

Trap 2: defaulting to vector DB

"I'll just store everything as embeddings and search by similarity" is the seductive path. It is wrong for most agents because:

Exact-match queries (what's user 42's plan?) work badly with similarity.
Updates are awkward — you can't easily overwrite an embedding-backed fact.
The cost is significant at scale.

The short version is that vector search is a useful retrieval layer over a structured memory, not a replacement for one. Pure-vector agent memory loses the typed lookups that everyday agent questions ("what's user 42's plan?", "remind me what we decided on X") actually depend on.

Trap 3: shared memory across tenants

"All my agents are friendly; let them share a memory pool" sounds efficient. It is the same mistake as a single admin token: one prompt injection and your competitor's agent reads your customer's preferences. Namespace by tenant from day one. Same rule as the billing MCP server in this article; same blast radius.

Trap 4: storing conversation transcripts as memory

The agent's session log is not memory. Memory is the small set of facts the agent decided are worth keeping. Storing transcripts confuses the retrieval surface — you find the conversation, not the fact. Have the agent summarize each session into 1-5 facts via remember(); discard the transcript or store it separately.
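
The 1-5 facts rule can be enforced at the write boundary. This is a sketch under assumptions: commit_session and MAX_FACTS_PER_SESSION are hypothetical names, the summarizing itself is done by the agent before this function is called, and the transcript is deliberately not a parameter.

```python
MAX_FACTS_PER_SESSION = 5


def commit_session(store, facts):
    """Persist the agent's end-of-session facts (key -> value)."""
    if not 1 <= len(facts) <= MAX_FACTS_PER_SESSION:
        raise ValueError(
            f"expected 1-{MAX_FACTS_PER_SESSION} facts, got {len(facts)}"
        )
    for key, value in facts.items():
        store[key] = value  # same overwrite semantics as remember()
    return len(facts)


session_store = {}
written = commit_session(
    session_store,
    {"last_ticket_topic": "webhook latency", "pref_lang": "python"},
)
print(written)
```

A hard cap pushes the agent to choose what matters instead of dumping the whole conversation into memory.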

Where to start

KV with namespacing. Four tools. One week to ship.
Vector search as a separate tool once agents start asking "remember when…" — usually 2-3 months in.
Fact graph when relationships start to outnumber attributes in your domain. For most teams, never.
Hybrid as a graduation move, not a starting move.

The general progression: KV is enough for most personal-assistant agents; KV + vector is enough for most domain agents; full hybrid is for agents whose domain is genuinely graph-shaped (teams, hierarchies, networks). Pick the smallest pattern your agent's questions actually need.

Several dedicated memory MCP services exist in 2026; comparing their tool surfaces is worth doing before you decide between rolling your own on AppElixir and using a hosted memory backend.