
Long-Term Memory for LangGraph Agents

LangGraph has built-in long-term memory: a BaseStore that persists JSON documents organized by namespace and key, with opt-in semantic search. It stores exactly what you put and retrieves by vector similarity, but it doesn't consolidate, deduplicate, decay, or promote recurring episodes into durable facts. For that you need an added layer. Here's the accurate picture, and where CognitiveX fits.

How long-term memory works in LangGraph

LangGraph separates two kinds of memory. Short-term memory is thread-scoped: a checkpointer persists the state of a single conversation. Long-term memory is cross-thread and lives in a BaseStore: a key-value store of JSON documents organized by a namespace (a tuple, like nested folders) and a key (a string, like a filename), with a value dict as the payload.

The core surface is small and clean:

from langgraph.store.memory import InMemoryStore

store = InMemoryStore()

# write a memory
store.put(("users", "alice"), "pref-tone", {"text": "prefers terse answers"})

# read it back
item = store.get(("users", "alice"), "pref-tone")

# search across the namespace (the query only affects ranking once an embedding index is configured)
results = store.search(("users", "alice"), query="how does she like replies", limit=5)

You pass the store into your agent, for example via create_react_agent(..., store=store) or by attaching it at compile time with compile(store=store), and call put / search inside your nodes to write and recall memories as the agent runs.

Picking a store: InMemoryStore vs PostgresStore

InMemoryStore keeps everything in RAM. It's perfect for prototyping and tests, but it resets on every restart, so it is not a production memory. For production, use PostgresStore: it takes a connection string and requires a one-time store.setup() to create its tables. Same interface, durable backing.
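A rough sketch of the Postgres path, assuming the langgraph-checkpoint-postgres package is installed and a database is reachable. The function name and the connection string in the usage note are placeholders, not values from this article.

```python
def run_with_postgres_store(conn_string: str) -> dict:
    # Imported lazily so this module loads even without the Postgres extra.
    from langgraph.store.postgres import PostgresStore

    # from_conn_string is a context manager that opens and closes the connection.
    with PostgresStore.from_conn_string(conn_string) as store:
        store.setup()  # creates the store's tables; needed once per database
        store.put(("users", "alice"), "pref-tone", {"text": "prefers terse answers"})
        item = store.get(("users", "alice"), "pref-tone")
        return item.value
```

You would call it with your own connection string, for example run_with_postgres_store("postgresql://user:password@localhost:5432/appdb"). The put / get / search calls are the same ones used with InMemoryStore.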

Semantic search is opt-in

By default, store.search does filtering and exact lookups, not meaning-based retrieval. To get vector similarity you configure an embedding index when you create the store:

from langgraph.store.memory import InMemoryStore

store = InMemoryStore(
    index={
        "embed": "openai:text-embedding-3-small",
        "dims": 1536,
        "fields": ["$"],  # or specific JSON paths like ["text", "summary"]
    }
)

With an IndexConfig in place, search(namespace, query=...) ranks results by embedding similarity. Without it, semantic queries silently fall back to filter/exact behavior. This is a common gotcha: developers assume search is semantic out of the box and wonder why recall feels literal.

The thing BaseStore deliberately doesn't do

Here's the honest pivot. BaseStore is an excellent persistence layer. It stores exactly what you put, and it retrieves by similarity plus filters. What it does not do:

  • Decide what's actually worth keeping versus noise
  • Merge duplicate or near-duplicate memories
  • Promote a recurring episode ("user asked about X again") into a durable semantic fact
  • Let stale memories fade so the store doesn't grow unbounded
  • Resolve contradictions when a new fact overrides an old one

This isn't a missing feature; it's by design. The clearest evidence is that LangChain ships LangMem as a separate layer specifically to add these behaviors, and LangMem's own docs describe it as using LangGraph's BaseStore for persistence. If the store consolidated, LangMem wouldn't need to exist. The store is the disk; consolidation is a layer on top.

The options for adding consolidation

Once you accept that the store doesn't manage memory, you're choosing a manager. Here is the bare store next to the two most common managers:

| Layer | What it is | How it consolidates |
| --- | --- | --- |
| LangGraph BaseStore | Native persistence (Postgres / in-memory) | It doesn't: stores what you put, retrieves by similarity |
| LangMem | LangChain's add-on over BaseStore | Background manager extracts, merges, and resolves conflicts via LLM calls |
| Mem0 | Drop-in memory backend for LangGraph | Runs an LLM fact-extraction + dedup pass on every add() |

Both LangMem and Mem0 genuinely consolidate, and that's the right thing to do. The distinction worth understanding is how. Their consolidation is LLM-per-operation. LangMem's background manager calls a model to extract and merge. Mem0 runs an LLM extraction-and-dedup pass on every write, which adds latency and is typically recommended to run async, off the hot path. That works, but it means a model call gates every memory you store, and behavior can drift when you swap models.
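To make "off the hot path" concrete, here is a generic asyncio sketch of the pattern. It is not LangMem's or Mem0's code: extract_facts is a hypothetical stand-in for the model call, and the set only gives trivial exact-match dedup.

```python
import asyncio


async def extract_facts(message: str) -> list[str]:
    """Hypothetical stand-in for an LLM extraction call (slow in practice)."""
    await asyncio.sleep(0.01)  # simulate model latency
    return [message.strip().lower()]


async def consolidation_worker(queue: asyncio.Queue, memory: set[str]) -> None:
    # Drains raw messages in the background, one extraction call per message.
    while True:
        message = await queue.get()
        memory.update(await extract_facts(message))
        queue.task_done()


async def main() -> set[str]:
    queue: asyncio.Queue = asyncio.Queue()
    memory: set[str] = set()
    worker = asyncio.create_task(consolidation_worker(queue, memory))

    # Hot path: enqueueing returns immediately, with no model call in the way.
    for message in ["Prefers terse answers", "prefers terse answers ", "Owns a corgi"]:
        queue.put_nowait(message)

    await queue.join()  # waited on here only so the sketch can return the result
    worker.cancel()
    return memory


memory = asyncio.run(main())
```

The request path stays fast, but every stored memory still passes through one model call, which is the cost described above.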

For a deeper treatment of why consolidation matters at all, see why memory consolidation is the hard part of agent memory.

Where CognitiveX fits: a consolidating memory layer

CognitiveX integrates as the long-term memory behind your LangGraph agent (as your store or as an MCP memory tool) and adds the consolidation layer inside the memory system itself rather than as a model-call wrapper around it:

  • Episodic → semantic promotion: a recurring episode is promoted into a durable fact, so "asked about X three times" becomes "cares about X."
  • Salience-weighted decay: memories carry a salience signal; low-value ones fade rather than accumulating forever, which keeps recall sharp and the store bounded.
  • Pattern and skill extraction: repeated behavior is distilled into reusable structure (an ACT-R + Generative-Agents lineage).

The honest differentiator versus LangMem and Mem0 isn't "we consolidate and they don't"; they do. It's a deterministic algorithm versus an LLM call per write. CognitiveX's consolidation and salience are deterministic algorithms with defined input and output schemas; the LLM only renders language at the very end. Two practical consequences: there's no model call gating every write, and the behavior of what survives doesn't shift when you change models; only the prose does.
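To illustrate what "deterministic" means here, the sketch below implements a generic promotion-plus-decay pass as a pure function. This is not CognitiveX's actual algorithm: the Memory schema, the exponential half-life formula, and every threshold are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    salience: float        # assumed to lie in [0, 1]
    age_days: float
    occurrences: int = 1
    kind: str = "episodic"


HALF_LIFE_DAYS = 30.0      # assumption: salience halves every 30 days
KEEP_THRESHOLD = 0.2       # assumption: below this, a memory is dropped
PROMOTE_AFTER = 3          # assumption: seen this often, an episode becomes a fact


def decayed_salience(memory: Memory) -> float:
    """Exponential decay: salience * 0.5 ** (age / half-life)."""
    return memory.salience * 0.5 ** (memory.age_days / HALF_LIFE_DAYS)


def consolidate(memories: list[Memory]) -> list[Memory]:
    """Pure function of its input: the same memories in give the same result out."""
    survivors = []
    for memory in memories:
        if memory.kind == "episodic" and memory.occurrences >= PROMOTE_AFTER:
            # Recurring episode: promote to a durable fact, exempt from decay here.
            survivors.append(Memory(memory.text, memory.salience, memory.age_days,
                                    memory.occurrences, kind="semantic"))
        elif memory.kind == "semantic" or decayed_salience(memory) >= KEEP_THRESHOLD:
            survivors.append(memory)
    return survivors


memories = [
    Memory("asked about pricing", salience=0.5, age_days=10, occurrences=3),
    Memory("mentioned the weather", salience=0.3, age_days=60),
    Memory("prefers terse answers", salience=0.8, age_days=30),
]
result = consolidate(memories)
```

With these numbers the pricing episode is promoted to a semantic fact, the weather remark decays to 0.3 × 0.25 = 0.075 and is dropped, and the tone preference survives at 0.8 × 0.5 = 0.4. No model is involved, so rerunning the pass on the same input always yields the same survivors.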

For how this compares against the broader field, see our AI memory layer comparison and the Mem0 alternative breakdown.

A note on benchmarks, because you've seen the fights

You may have seen numbers thrown around. Mem0 self-reports strong LOCOMO results (e.g. 66.9% LLM-as-judge versus a 52.9% baseline, plus large latency and token reductions). Those are vendor-reported, and they're contested: Zep published a public rebuttal questioning the methodology, and LOCOMO itself has documented data-quality issues (missing ground truth, mis-attribution, underspecified questions).

So we'll be straight with you: there is no neutral, agreed-upon benchmark for agent memory yet. CognitiveX doesn't have a published score, and we won't quote one we can't stand behind. We compete on architecture (deterministic consolidation, salience-weighted decay, episodic-to-semantic promotion), not on a leaderboard that's currently a vendor argument.

FAQ

Does LangGraph have built-in long-term memory? Yes. It's the BaseStore: JSON documents organized by namespace and key, with put, get, search, and delete. It's a persistence layer, not a memory manager.

What's the difference between LangGraph short-term and long-term memory? Short-term is thread-scoped, persisted by a checkpointer for a single conversation. Long-term is cross-thread, persisted in a BaseStore and shared across sessions.

Does LangGraph's store support semantic search? Yes, but it's opt-in. You must configure an embedding index (embed + dims, e.g. openai:text-embedding-3-small at 1536 dims). Without it, search is filter/exact only.

Does LangGraph automatically consolidate or summarize memories? No. BaseStore stores exactly what you put. For extraction, merging, decay, or episodic-to-semantic promotion you need an added layer such as LangMem, Mem0, or CognitiveX.

LangMem vs Mem0 vs the LangGraph store: what's the difference? The store is persistence. LangMem and Mem0 are consolidation layers on top of it, both LLM-driven. CognitiveX is a consolidation layer whose consolidation and salience are deterministic, with the LLM only rendering language last.

Try CognitiveX as your LangGraph memory

LangGraph gives your agent a place to put memories. CognitiveX decides which ones survive, promoting recurring episodes into facts, letting stale memories fade, and doing it with deterministic algorithms instead of a model call on every write. Try CognitiveX →