
Mem0 vs Zep vs Letta vs Cognee vs CognitiveX (2026)

mem0, Zep, Letta, and Cognee solve four different problems: mem0 is a drop-in extraction layer, Zep is a temporal knowledge graph, Letta is an agent runtime, and Cognee turns documents into graphs. CognitiveX is the fifth axis: it consolidates and learns rather than just storing. Here's how to pick, by architecture rather than leaderboard.

The short answer

These tools get lumped together as "AI memory," but they sit at different layers of the stack. Choosing well is mostly about matching the tool to the shape of your problem:

  • mem0: lowest integration friction. You want memory bolted onto existing chat code today.
  • Zep: you need to know what was true at time T as user state changes. It is now cloud only.
  • Letta: you're building an autonomous agent that should manage its own context the way an operating system manages RAM.
  • Cognee: you have documents, runbooks, and tickets to turn into a queryable knowledge graph.
  • CognitiveX: you want memory that consolidates over time, promoting episodes into facts and decaying by salience, exposed over MCP for personal use.

Feature matrix

| | mem0 | Zep | Letta | Cognee | CognitiveX |
| --- | --- | --- | --- | --- | --- |
| Architecture | Fact extraction (vector + graph + KV) | Temporal knowledge graph (Graphiti) | Agent runtime (tiered memory) | Doc → knowledge graph (ECL) | Consolidation engine |
| Self-host | ✓ (Apache 2.0) | ✗ (CE deprecated) | ✓ (Apache 2.0) | ✓ (Apache 2.0) | ✓ |
| Temporal / bi-temporal | Partial | ✓ (strongest) | - | - | Salience-weighted |
| Graph | Optional | ✓ | - | ✓ | - |
| Consolidation / learning | Add/update/delete | Fact invalidation | Partial (sleep-time) | - | ✓ (episodic→semantic) |
| MCP-native | ✓ (OpenMemory) | - | ✓ | ✓ (cognee-mcp) | ✓ |
| Best for | Drop-in recall | Changing state | Autonomous agents | Document corpora | Personal continuity |

mem0: the drop-in extraction layer

mem0's genuine differentiator is integration friction: it's the lowest in the field. It mirrors the OpenAI client interface, so client.chat.completions.create(...) works with the same signature and memory plugs into existing chat-completion code with almost no rewrite. Under the hood it combines three stores (a vector DB for semantics, a graph DB for relationships, and a key-value store for fast facts) and supports a wide range of vector backends (Qdrant, Chroma, Weaviate, Milvus, PGVector, Redis, and more).

An LLM extracts salient facts from each turn and decides ADD, UPDATE, or DELETE against what's stored. It's Apache 2.0, fully self-hostable as a three-container stack, and ships OpenMemory, a local-first MCP server (Docker + Postgres + Qdrant) exposing add_memories / search_memory over SSE. It carries the most community traction of the group, with a large GitHub following and venture backing.
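The add/update/delete loop described above can be sketched in a few lines. This is a minimal illustration, not mem0's implementation: the Decision and FactStore names are hypothetical, and the decisions that an LLM would produce in the real system are hard-coded here as plain data.

```python
from dataclasses import dataclass, field
from typing import Literal, Optional

Op = Literal["ADD", "UPDATE", "DELETE"]


@dataclass
class Decision:
    """One decision from the extraction step (hypothetical shape)."""
    op: Op
    key: str                     # hypothetical fact identifier
    value: Optional[str] = None  # new fact text; unused for DELETE


@dataclass
class FactStore:
    """In-memory stand-in for the stored facts."""
    facts: dict[str, str] = field(default_factory=dict)

    def apply(self, decision: Decision) -> None:
        if decision.op == "DELETE":
            # Deleting an unknown fact is treated as a no-op.
            self.facts.pop(decision.key, None)
            return
        if decision.value is None:
            raise ValueError(f"{decision.op} needs a value")
        if decision.op == "UPDATE" and decision.key not in self.facts:
            raise KeyError(f"cannot UPDATE unknown fact {decision.key!r}")
        # ADD and UPDATE both end by writing the latest value.
        self.facts[decision.key] = decision.value


store = FactStore()
for d in [
    Decision("ADD", "diet", "vegetarian"),
    Decision("ADD", "city", "Berlin"),
    Decision("UPDATE", "city", "Lisbon"),  # overwrites, no history kept
    Decision("DELETE", "diet"),
]:
    store.apply(d)

print(store.facts)
```

Note that the UPDATE overwrites the old value in place. Nothing in this loop records that the user once lived in Berlin, which is the limitation the next paragraph describes.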

Where it stops: memory is extracted facts, not a time-aware model of changing state, and the loop is add/update/delete, not promotion or decay. If you've outgrown that, our mem0 alternative breakdown goes deeper.

Zep: temporal knowledge graph (cloud only now)

Zep is the strongest answer to "what was true at time T given a moving conversation." It's built on Graphiti, its open-source engine, and is bi-temporal: it tracks both when a fact was true and when it was learned, invalidating and superseding facts as state changes. That makes it a strong fit for enterprise apps where user and entity state shifts over time.
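To make the two time axes concrete, here is a minimal sketch of a bi-temporal lookup. It is not Graphiti's data model or API: the Fact and BiTemporalStore names are hypothetical, and instead of writing an invalidation marker onto older facts, this sketch keeps the log append-only and derives supersession at query time.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Fact:
    subject: str
    value: str
    valid_from: date  # when it became true in the world
    learned_at: date  # when the system recorded it


class BiTemporalStore:
    """Append-only fact log queried along both time axes."""

    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def record(self, subject: str, value: str,
               valid_from: date, learned_at: date) -> None:
        self.facts.append(Fact(subject, value, valid_from, learned_at))

    def as_of(self, subject: str, t: date,
              known_by: Optional[date] = None) -> Optional[str]:
        """What was true at time t, using only facts learned by known_by."""
        candidates = [
            f for f in self.facts
            if f.subject == subject
            and f.valid_from <= t
            and (known_by is None or f.learned_at <= known_by)
        ]
        if not candidates:
            return None
        # The fact with the latest valid_from supersedes earlier ones.
        return max(candidates, key=lambda f: (f.valid_from, f.learned_at)).value


store = BiTemporalStore()
store.record("alice.city", "Berlin", date(2024, 1, 1), date(2024, 1, 5))
store.record("alice.city", "Lisbon", date(2025, 3, 1), date(2025, 3, 10))

then = store.as_of("alice.city", date(2024, 6, 1))
now = store.as_of("alice.city", date(2025, 3, 5))
belief = store.as_of("alice.city", date(2025, 3, 5), known_by=date(2025, 3, 6))
```

The third query is the one a single-timestamp store cannot answer: Alice had already moved on March 5, but the system did not learn that until March 10, so its belief on March 6 was still Berlin.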

The critical fact most comparisons get wrong: Zep deprecated its Community Edition in April 2025 (announced April 2). The repo stays Apache 2.0 but receives no updates or support, with further feature retirements signaled for early 2026. The only self-hostable path today is building directly on the Graphiti library; the full memory product is Zep Cloud only. If self-hosting is a requirement, that's disqualifying; see our Zep self-hosting alternative.

Letta: agent runtime, not a memory layer

Letta (formerly MemGPT) is a different category: an OS-inspired agent runtime. It models memory in tiers: core memory that's always in-context as editable memory blocks, archival memory that's searchable long-term, and recall memory for conversation history. Agents self-edit their own memory through tools. It also runs sleep-time agents (enable_sleeptime=true) that manage memory asynchronously, as non-blocking consolidation separate from the conversation agent. That is the closest any competitor comes to a learning loop.
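The three tiers can be pictured with a small stand-alone sketch. It does not use the Letta SDK: TieredMemory, its method names, and the character budget are all hypothetical, and the substring search stands in for whatever retrieval a real archival store would use.

```python
from dataclasses import dataclass, field


@dataclass
class TieredMemory:
    core: dict[str, str] = field(default_factory=dict)           # always in context
    archival: list[str] = field(default_factory=list)            # searched on demand
    recall: list[tuple[str, str]] = field(default_factory=list)  # (role, text) history
    core_limit: int = 200  # hypothetical per-block character budget

    # "Tools" an agent could call to edit its own memory.
    def core_replace(self, label: str, text: str) -> None:
        if len(text) > self.core_limit:
            raise ValueError("core block over budget; move detail to archival")
        self.core[label] = text

    def archival_insert(self, text: str) -> None:
        self.archival.append(text)

    def archival_search(self, query: str) -> list[str]:
        q = query.lower()
        return [t for t in self.archival if q in t.lower()]

    def log(self, role: str, text: str) -> None:
        self.recall.append((role, text))

    def build_context(self, recent: int = 2) -> str:
        """Prompt context: every core block plus the last few turns."""
        blocks = [f"[{label}] {text}" for label, text in self.core.items()]
        turns = [f"{role}: {text}" for role, text in self.recall[-recent:]]
        return "\n".join(blocks + turns)


mem = TieredMemory()
mem.core_replace("human", "Name: Sam. Prefers short answers.")
mem.archival_insert("2024 trip notes: Sam visited Kyoto in April.")
mem.log("user", "hi")
mem.log("assistant", "hello")
mem.log("user", "where did I travel?")

context = mem.build_context()
hits = mem.archival_search("kyoto")
```

The RAM-versus-disk analogy shows up in build_context: core blocks are always paid for in tokens, while archival entries cost nothing until the agent decides to search for them.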

State is persisted to a database by default (Postgres, with Alembic migrations), so it's "persistence-by-default" rather than keeping state in Python variables. It's Apache 2.0, self-hostable with a REST API and Python/TS SDKs, and speaks MCP both ways.

Where it fits, and doesn't: it's the right call for agents that run autonomously for days and need to manage context like RAM versus disk. It's also the heaviest and most opinionated option: a whole framework, not a memory you bolt onto an existing stack. It's overkill if you just need recall.

Cognee: documents into a knowledge graph

Cognee is built around the ECL pipeline (Extract, Cognify, Load). It ingests from many source types; the "cognify" phase chunks, embeds, extracts entities, and maps relationships, then loads vectors and graph edges into backends (Postgres + Kuzu). The result is a queryable knowledge graph over raw docs, runbooks, and tickets.
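The shape of an Extract, Cognify, Load pipeline can be shown with a toy version. This is not Cognee's API: the function names, the KNOWN_ENTITIES vocabulary, and the co-occurrence rule for relationships are assumptions made for illustration, and the embedding step is left out entirely.

```python
import re
from itertools import combinations

# Hypothetical entity vocabulary; a real pipeline would use a model here.
KNOWN_ENTITIES = {"redis", "celery", "postgres"}


def extract(sources: dict[str, str]) -> list[tuple[str, str]]:
    """Extract: normalise raw sources into (doc_id, text) pairs."""
    return [(doc_id, text.strip()) for doc_id, text in sources.items()]


def cognify(docs: list[tuple[str, str]]):
    """Cognify: chunk by sentence, tag entities, link entities sharing a chunk."""
    chunks, edges = [], set()
    for doc_id, text in docs:
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            if not sentence:
                continue
            words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
            entities = sorted(words & KNOWN_ENTITIES)
            chunks.append({"doc": doc_id, "text": sentence, "entities": entities})
            edges.update(combinations(entities, 2))
    return chunks, edges


def load(chunks, edges) -> dict:
    """Load: store chunks and an undirected adjacency map (in memory here)."""
    graph: dict[str, set[str]] = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return {"chunks": chunks, "graph": graph}


sources = {
    "runbook-1": "Restart redis before celery. Check postgres replication lag.",
    "ticket-7": "celery workers lost the postgres connection.",
}
kb = load(*cognify(extract(sources)))
neighbours = kb["graph"]["celery"]
```

Even in this toy form, the graph answers a question neither document states directly: celery touches both redis and postgres, a link assembled from a runbook and a ticket.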

It's Apache 2.0 and self-hosted, with deploys for Docker, Modal, Railway, Fly, and Render, and ships cognee-mcp. It is an actively maintained open-source project with a growing contributor base and seed-stage venture backing.

Best fit: deep retrieval over document and institutional-knowledge corpora. It is a weaker match for lightweight per-user conversational memory, where its document-to-graph orientation and heavier setup work against it.

A note on the benchmark war

You'll see leaderboard numbers thrown around: Zep, mem0, and others publish memory benchmark results (DMR, LongMemEval, LOCOMO). Treat them with suspicion. Most are vendor self-reports on narrow benchmarks, run under settings the vendor chose.

LOCOMO in particular is actively disputed. mem0's team filed a public issue on Zep's papers repo alleging methodology problems (an excluded category counted inconsistently, added prompt instructions, changed templates, and single-run versus multi-run averaging), and the two vendors report materially different numbers for the same setup. The honest takeaway: cross-vendor memory benchmarks are not yet trustworthy or independently reproduced. That's precisely why CognitiveX posts no benchmark score: our edge is architectural and can be stated without a leaderboard claim.

Where CognitiveX fits: the consolidation axis

Here's the honest frame. All four competitors are, at their core, retrieval systems: mem0 extracts facts, Zep builds a temporal graph, Cognee builds a knowledge graph from documents, Letta lets the agent self-edit blocks. They retrieve well. None of them runs an offline consolidation process that promotes episodic memories into semantic facts, extracts patterns and skills, and decays by salience, in the lineage of ACT-R and Generative Agents.

That's CognitiveX's column in the matrix: consolidation and learning. Where mem0's loop is add/update/delete and Zep's is fact-invalidation, CognitiveX's is episodic→semantic promotion plus salience-weighted decay plus pattern extraction during an offline "dream" pass. These are deterministic algorithms; the LLM only renders language at the very end (LLM-as-infrastructure). It ships as an MCP server like OpenMemory and cognee-mcp, but is positioned as a personal cognitive system at icog.app, not only an agent-dev SDK. For the broader landscape, see our AI memory layer comparison.
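As a rough picture of what a deterministic consolidation pass could look like, here is a minimal sketch of promotion plus salience decay. It is not CognitiveX's actual algorithm: the Episode shape, the exponential half-life, the forgetting threshold, and the promote-after-two-sightings rule are all assumptions chosen for simplicity, and pattern extraction is not covered.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Episode:
    claim: str       # normalised statement observed in one interaction
    salience: float  # importance when recorded, between 0 and 1
    day: int         # day number when it was recorded


def dream_pass(episodes, today, half_life_days=30.0,
               forget_below=0.1, promote_at=2):
    """One offline pass: decay, forget, then promote repeated claims.

    Returns (surviving episodes, promoted semantic facts).
    All thresholds are hypothetical defaults.
    """
    survivors = []
    for ep in episodes:
        age = today - ep.day
        # Assumed decay curve: salience halves every half_life_days.
        decayed = ep.salience * 0.5 ** (age / half_life_days)
        if decayed >= forget_below:
            survivors.append(Episode(ep.claim, decayed, ep.day))

    # A claim seen in enough surviving episodes becomes a semantic fact.
    counts = Counter(ep.claim for ep in survivors)
    semantic = sorted(c for c, n in counts.items() if n >= promote_at)
    return survivors, semantic


episodes = [
    Episode("prefers dark mode", 0.8, day=90),
    Episode("prefers dark mode", 0.6, day=95),
    Episode("asked about weather", 0.2, day=10),
    Episode("works in Lisbon", 0.9, day=98),
]
survivors, semantic = dream_pass(episodes, today=100)
```

In this run the old, low-salience weather question decays below the threshold and is dropped, the repeated dark-mode preference is promoted to a semantic fact, and the single Lisbon episode survives but stays episodic. No LLM call is involved at any step, which is the property the paragraph above emphasises.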

FAQ

What's the difference between mem0, Zep, Letta, and Cognee? mem0 is a drop-in fact-extraction layer; Zep is a bi-temporal knowledge graph for changing state; Letta is a full agent runtime with self-editing tiered memory; Cognee turns document corpora into a knowledge graph. Different layers of the stack, not interchangeable.

Can I still self-host Zep? Not the full product. Zep deprecated its Community Edition in April 2025; only the Graphiti library remains maintained open source. The complete memory product is Zep Cloud only.

Which AI memory framework is best for self-hosting? mem0, Letta, and Cognee are all Apache 2.0 and self-hostable, as is CognitiveX. Zep is the exception: it has been cloud only since its CE deprecation.

What's the difference between a memory layer and an agent runtime? A memory layer (mem0) stores and recalls facts you bolt onto existing code. An agent runtime (Letta) is the whole execution environment: the agent manages its own memory as part of running.

What's the difference between storing memory and consolidating it? Storing keeps facts and retrieves them. Consolidating changes the memory over time, promoting episodes into durable semantic facts, extracting patterns, and decaying low-salience entries. That learning loop is CognitiveX's differentiator.


The other four are excellent at retrieval. The open question is whether your memory should also learn. If you want a personal memory that consolidates over time and plugs into every AI tool over MCP, try CognitiveX →.