The Zep Alternative With Self-Hosting

Looking for a Zep alternative you can self-host? On April 2, 2025, Zep deprecated its open-source Community Edition; the only remaining self-host path is the raw Graphiti library plus a graph database. CognitiveX gives you temporal recall and learning memory on PostgreSQL + pgvector, the infrastructure your team already runs. No Neo4j. No Graphiti service to babysit.

What happened to self-hosting Zep

Zep deprecated Zep Community Edition on April 2, 2025. The repo stays open under Apache 2.0, but it receives no further updates and no active support, and the CE code was moved into a legacy/ folder. Zep's own stated reason was candid: managing two related-but-different products "presented real challenges… we under-invested in the open-source version."

The practical consequence is what matters for anyone evaluating Zep today: the full managed Zep platform is cloud-only. There is no self-hostable equivalent. The only remaining open-source path is the underlying Graphiti library, and running Graphiti yourself is a graph-database operations commitment, not a drop-in.

There was more churn alongside the CE deprecation. On May 31, 2025, Zep retired its entire memory.* API surface (memory.search, session summaries, fact CRUD, and Document Collections) in favor of graph.*, which required an upgrade to SDK v2.5+. If you'd built on the older surface, you were migrating regardless.

What self-hosting Zep now actually requires

To self-host Zep-style memory today, you stand up Graphiti, and Graphiti needs a compatible graph database underneath it: Neo4j (default, v5.26+), FalkorDB, or Kuzu. Add the embedding provider and the LLM provider Graphiti calls during ingestion, and you're provisioning, monitoring, and maintaining at least three moving systems, plus owning the graph-DB schema migrations and production ops yourself.

There's a cost dimension too, and it's real rather than FUD. Every add_episode ingestion runs an LLM-based entity-and-relationship extraction pass (roughly 500 to 2,000 input tokens and 200 to 800 output tokens per episode), and a single ingestion can fire several LLM and embedding calls (node extraction, edge extraction, dedup). At volume, that extraction step dominates operational cost and adds ingest latency. The Graphiti project even has an open issue requesting an extraction-free add_episode path.
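To make those token ranges concrete, here is a minimal back-of-envelope sketch. The per-episode token ranges come from the paragraph above; the per-million-token prices and the episode count are hypothetical placeholders, so substitute your own provider's rates. It also ignores embedding calls and any retries, so treat the result as a rough estimate.

```python
# Per-episode extraction token ranges quoted in the text above.
INPUT_TOKENS_RANGE = (500, 2000)
OUTPUT_TOKENS_RANGE = (200, 800)


def extraction_cost_range(episodes, usd_per_million_input, usd_per_million_output):
    """Return (low, high) estimated USD spent on LLM extraction.

    The two price arguments are assumptions supplied by the caller;
    they are not taken from any vendor's price list.
    """
    def cost(input_tokens, output_tokens):
        per_episode = (input_tokens * usd_per_million_input
                       + output_tokens * usd_per_million_output) / 1_000_000
        return episodes * per_episode

    low = cost(INPUT_TOKENS_RANGE[0], OUTPUT_TOKENS_RANGE[0])
    high = cost(INPUT_TOKENS_RANGE[1], OUTPUT_TOKENS_RANGE[1])
    return low, high


if __name__ == "__main__":
    # Hypothetical workload and prices, for illustration only.
    low, high = extraction_cost_range(
        episodes=100_000,
        usd_per_million_input=1.00,
        usd_per_million_output=4.00,
    )
    print(f"Estimated extraction spend: ${low:,.2f} to ${high:,.2f}")
```

The fourfold spread between the low and high ends is the point: extraction cost scales with episode size and volume, not with how often you actually query the memory.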

If you'd otherwise consider Zep Cloud to dodge the ops: pricing is credit-based (an episode of up to 350 bytes costs 1 credit, plus 1 credit per additional 350 bytes, so a 1,200-byte episode costs 4 credits). The free tier is 1,000 credits/month, suitable for testing only, and the next paid tier jumps to roughly $125/mo with no intermediate option.
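The credit rule is easy to sanity-check in a few lines. This sketch encodes the rule as quoted above, with two assumptions of ours: a partial 350-byte block rounds up (which matches the 1,200-byte example), and an empty episode still costs 1 credit. The function names are our own, not part of any Zep SDK.

```python
BYTES_PER_CREDIT = 350      # block size from the pricing rule above
FREE_TIER_CREDITS = 1_000   # monthly free-tier allowance from the text


def credits_for_episode(size_bytes):
    """Credits for one episode: 1 for the first 350 bytes, +1 per extra block.

    Assumption: partial blocks round up, and the minimum charge is 1 credit.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    blocks = (size_bytes + BYTES_PER_CREDIT - 1) // BYTES_PER_CREDIT  # integer ceiling
    return max(1, blocks)


def episodes_within_budget(size_bytes, credit_budget=FREE_TIER_CREDITS):
    """How many episodes of a given size fit inside a credit budget."""
    return credit_budget // credits_for_episode(size_bytes)


if __name__ == "__main__":
    print(credits_for_episode(1_200))     # the worked example from the text
    print(episodes_within_budget(1_200))  # episodes of that size on the free tier
```

Under these assumptions, the free tier covers 250 episodes of 1,200 bytes per month, which is why it reads as a testing allowance rather than a production one.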

CognitiveX: temporal recall without the graph-DB ops

CognitiveX's wedge here is simple and checkable: it runs on PostgreSQL + pgvector, the database most teams already operate. No graph database to provision, no Graphiti service to keep alive, no separate schema-migration surface to own. You get temporal recall on infrastructure you already understand.

Beyond the ops story, CognitiveX's design goal is memory that consolidates rather than only stores. Episodic events get promoted into semantic facts, salience-weighted decay lets stale memory fade while important memory persists, and patterns and skills are extracted over time, a design in the lineage of ACT-R and Generative Agents. Crucially, the consolidation and salience steps are deterministic algorithms; the LLM only renders language at the end. (We unpack the mechanics in memory consolidation for AI agents.)
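To show what "deterministic" means in practice, here is an illustrative sketch of salience-weighted decay built on the ACT-R base-level learning equation, B = ln(Σ t_j^(-d)). This is not CognitiveX's actual implementation: the function names, the way salience is folded in as an additive log term, and the default decay of 0.5 (a conventional ACT-R value) are all assumptions for illustration.

```python
import math


def base_level_activation(access_ages_hours, decay=0.5):
    """ACT-R base-level activation: ln(sum of age ** -decay over past accesses).

    Each age is the time since one access, in hours, and must be > 0.
    Recent or frequent accesses raise the activation; old ones fade.
    """
    if not access_ages_hours or any(age <= 0 for age in access_ages_hours):
        raise ValueError("need at least one access, all ages > 0")
    return math.log(sum(age ** -decay for age in access_ages_hours))


def retention_score(access_ages_hours, salience, decay=0.5):
    """Activation plus a salience bonus (hypothetical weighting scheme).

    salience > 0; values above 1 help a memory persist, below 1 let it fade.
    """
    if salience <= 0:
        raise ValueError("salience must be positive")
    return base_level_activation(access_ages_hours, decay) + math.log(salience)


if __name__ == "__main__":
    stale = retention_score([100.0], salience=1.0)
    stale_but_important = retention_score([100.0], salience=10.0)
    fresh = retention_score([1.0], salience=1.0)
    print(stale, stale_but_important, fresh)
```

The same inputs always produce the same score, with no model call involved. That is the property the paragraph above describes: ranking and forgetting are plain arithmetic, and a language model is only needed to phrase the result.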

CognitiveX also ships as an MCP server, so it plugs into Claude, Cursor, ChatGPT, and Codex as one shared memory rather than an SDK you wire into a single agent you build.

Where Zep is genuinely the right call

This isn't a strawman. Zep's bi-temporal knowledge graph, tracking both when a fact was true and when you learned it, is genuinely strong for enterprise temporal reasoning over evolving facts, and Zep's published paper reports gains on the DMR and LongMemEval benchmarks against the baselines it tested, alongside a large latency reduction in their harness.
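For readers new to the term, a bi-temporal record carries two timelines: when the fact held in the world, and when the system believed it. The sketch below is a generic illustration of that idea, not Zep's or Graphiti's schema; every field and function name here is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class FactRecord:
    statement: str
    valid_from: datetime               # when the fact became true in the world
    valid_to: Optional[datetime]       # when it stopped being true (None = ongoing)
    recorded_at: datetime              # when the system learned this version
    superseded_at: Optional[datetime]  # when a later version replaced it (None = current)


def facts_as_of(records: List[FactRecord], world_time: datetime,
                knowledge_time: datetime) -> List[str]:
    """What did the system believe at knowledge_time about world_time?"""
    result = []
    for r in records:
        believed = r.recorded_at <= knowledge_time and (
            r.superseded_at is None or knowledge_time < r.superseded_at)
        true_then = r.valid_from <= world_time and (
            r.valid_to is None or world_time < r.valid_to)
        if believed and true_then:
            result.append(r.statement)
    return result


# Made-up example: Alice changes jobs on Feb 1, but we only learn of it on Mar 5.
LEARNED = datetime(2024, 3, 5)
RECORDS = [
    FactRecord("Alice works at Acme", datetime(2023, 1, 1), None,
               datetime(2023, 1, 10), LEARNED),
    FactRecord("Alice works at Acme", datetime(2023, 1, 1), datetime(2024, 2, 1),
               LEARNED, None),
    FactRecord("Alice works at Globex", datetime(2024, 2, 1), None, LEARNED, None),
]

if __name__ == "__main__":
    mid_feb = datetime(2024, 2, 15)
    print(facts_as_of(RECORDS, mid_feb, datetime(2024, 2, 20)))  # belief before the update
    print(facts_as_of(RECORDS, mid_feb, datetime(2024, 4, 1)))   # belief after the update
```

The two queries ask about the same moment in the world but return different answers, because the system's knowledge changed in between. Being able to separate "what was true" from "what we knew" is what makes this model useful for auditing and for reasoning over evolving facts.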

If your workload is heavy multi-hop relational and temporal reasoning over a changing fact base, and being cloud-only is acceptable, Zep is a legitimate, capable choice. CognitiveX's pitch is narrower and deliberate: temporal recall plus learning, without standing up and operating a graph database. (For the broader landscape, see Mem0 vs Zep vs Letta vs Cognee.)

A note on benchmarks in this space

It's worth being honest about benchmark numbers, because this category has a credibility problem. Zep originally claimed 84% on the LoCoMo benchmark. Mem0's co-founder published a correction putting the protocol-correct figure at 58.44% ± 0.20. The flaw: Zep counted the adversarial Category-5 answers in the numerator while excluding Category 5 from the denominator (the benchmark designates that category for exclusion). The correction also cited a modified prompt and single-run rather than 10-run-averaged reporting.
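The numerator/denominator mismatch is easier to see with numbers. The counts below are entirely hypothetical, chosen only to make the arithmetic clean; they are not LoCoMo's real question counts or anyone's real results.

```python
EXCLUDED_CATEGORIES = frozenset({5})  # the category designated for exclusion


def protocol_score(correct, total, excluded=EXCLUDED_CATEGORIES):
    """Accuracy with excluded categories dropped from BOTH sides of the fraction."""
    numerator = sum(n for cat, n in correct.items() if cat not in excluded)
    denominator = sum(n for cat, n in total.items() if cat not in excluded)
    return numerator / denominator


def mismatched_score(correct, total, excluded=EXCLUDED_CATEGORIES):
    """The flawed variant: excluded categories still count in the numerator."""
    numerator = sum(correct.values())
    denominator = sum(n for cat, n in total.items() if cat not in excluded)
    return numerator / denominator


# Hypothetical per-category question counts and correct answers.
TOTAL = {1: 300, 2: 300, 3: 400, 4: 500, 5: 400}
CORRECT = {1: 180, 2: 180, 3: 240, 4: 300, 5: 300}

if __name__ == "__main__":
    print(f"protocol-correct: {protocol_score(CORRECT, TOTAL):.1%}")
    print(f"mismatched:       {mismatched_score(CORRECT, TOTAL):.1%}")
```

With these made-up counts, the same system scores 60% under the consistent protocol and 80% under the mismatched one. Nothing about the system changed, only the bookkeeping, which is why protocol details matter when comparing vendor numbers.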

We're not citing that to dunk on Zep; its paper's DMR and LongMemEval gains look real. We're citing it because CognitiveX has no published benchmark yet, and we won't hand you a number we can't independently stand behind. In a space where vendor benchmarks are actively contested, we'd rather earn the evaluation than quote a contested score.

Zep vs CognitiveX at a glance

| | Zep (self-host) | CognitiveX |
| Self-hostable full platform | No (cloud-only; CE deprecated Apr 2, 2025) | Yes |
| Storage / infra | Graphiti + Neo4j / FalkorDB / Kuzu | PostgreSQL + pgvector |
| Systems to operate | 3+ (graph DB, Graphiti, embeddings, LLM) | Your existing Postgres |
| Memory model | Bi-temporal knowledge graph | Consolidating memory (episodic→semantic, salience decay) |
| Ingest cost driver | LLM entity/edge extraction per add_episode | Deterministic pipeline; LLM renders language last |
| Integration | SDK (graph.*) | MCP server (Claude, Cursor, ChatGPT, Codex) |
| Best for | Multi-hop temporal reasoning, enterprise, cloud-OK | Temporal recall + learning without graph-DB ops |

Frequently asked questions

Is Zep Community Edition still maintained, and can I still self-host Zep? The Community Edition was deprecated on April 2, 2025. The code remains open under Apache 2.0 in a legacy/ folder but gets no further updates or active support. The full managed platform is cloud-only.

What replaced Zep Community Edition? There's no managed self-host replacement. The only remaining open-source path is the underlying Graphiti library, which you operate yourself.

Do I need Neo4j to self-host Zep or Graphiti? You need a compatible graph database (Neo4j 5.26+, which is the default, or FalkorDB or Kuzu), plus the Graphiti service and embedding/LLM providers. That's at least three systems to run.

Can I get temporal agent memory using just Postgres and pgvector? Yes. CognitiveX runs temporal recall and consolidating memory on PostgreSQL + pgvector, with no graph database required, and exposes it over MCP.

Is a temporal knowledge graph necessary for agent memory, or is it overkill? It depends on the workload. For heavy multi-hop relational reasoning over evolving facts, a temporal KG like Zep's is genuinely valuable. For temporal recall plus learning without graph-DB operations, a Postgres + pgvector consolidating memory like CognitiveX is the lighter, lower-ops fit.


Zep took the self-hostable option off the table. CognitiveX is temporal recall and learning memory you can actually run on infrastructure you already have. Try CognitiveX →