COGNITIVEX · RESEARCH
Research at CognitiveX
The science behind the LCM. We treat memory as a cognitive substrate, not a cache, and we study how a system can remember, reflect, and change over time rather than answer and forget.
WHAT WE STUDY
Living memory as a cognitive substrate.
A language model is a fixed function. Text goes in, text comes out, and the interaction is gone the moment the response renders. Our research starts from a different premise: that the part of an intelligent system worth keeping is the memory, and that memory should be a living substrate that accumulates, relates, decays, and reorganizes itself with use.
We organize the work around a single guiding principle: structure comes first, language comes last. Every cognitive module has a defined input and output schema, and a model is used only to render the final result into prose. If removing the model breaks the language but not the logic, the algorithm is positioned correctly. If it breaks the logic, the algorithm was never deep enough. This is how we keep the model at the edge of the system and the cognition at its center.
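To make the "remove the model" test concrete, here is a minimal sketch in Python. Every name in it (`RecallQuery`, `RecallResult`, `recall`, `plain_renderer`, `answer`) is hypothetical and chosen for illustration; none of it is the cogx API. It assumes a module can be written as a pure function between two typed schemas, with the renderer passed in as an optional last step.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class RecallQuery:
    """Hypothetical input schema for a recall module."""
    text: str
    limit: int


@dataclass(frozen=True)
class RecallResult:
    """Hypothetical output schema: ranked ids and their scores, no prose."""
    memory_ids: tuple[str, ...]
    scores: tuple[float, ...]


def recall(query: RecallQuery, index: dict[str, float]) -> RecallResult:
    """Structural step: rank stored ids by a precomputed score, highest first."""
    ranked = sorted(index.items(), key=lambda kv: kv[1], reverse=True)[: query.limit]
    return RecallResult(
        memory_ids=tuple(key for key, _ in ranked),
        scores=tuple(score for _, score in ranked),
    )


# A renderer is anything that turns the structured result into text.
Renderer = Callable[[RecallResult], str]


def plain_renderer(result: RecallResult) -> str:
    """Stand-in for a language model: formats the structure as a sentence."""
    return "Relevant memories: " + ", ".join(result.memory_ids)


def answer(
    query: RecallQuery,
    index: dict[str, float],
    render: Optional[Renderer] = None,
) -> tuple[RecallResult, Optional[str]]:
    """Run the logic, then render only if a renderer is supplied."""
    result = recall(query, index)
    prose = render(result) if render is not None else None
    return result, prose
```

Calling `answer` with no renderer still returns the full `RecallResult`; only the prose is missing. In this sketch, that is what "breaks the language but not the logic" looks like.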
THE RECURSIVE LOOP
Cognition is a loop, and the loop modifies itself.
The central object of study is a recursive cycle in which the output of each stage reshapes the input of the next, and the final stage rewrites the first. The memory that answers tomorrow is not the memory that answered today.
1. Memory. Each interaction is written into a four-tier substrate, not a flat log. Records carry salience and decay, so what is used survives and noise fades.
2. Reflection. The system reasons about its own memory: surfacing contradictions, recurring themes, and gaps in what it knows, rather than treating every record as equally true.
3. Cognition. Specialized algorithms (pattern detection, salience scoring, relationship synthesis) decide what is relevant to the present moment and how the pieces connect.
4. Decision. A model renders the structured result into language as the final step. Swap the model and quality shifts; the behavior of the system does not.
5. Learning and evolution. The outcome returns as a learning signal. Salience shifts, relationships reweight, and overnight consolidation compresses scattered events into durable patterns.
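The five stages can be sketched as one function that mutates the state it reads from. This is a toy illustration under loud assumptions: the names (`Record`, `Substrate`, `cycle`), the duplicate-counting "reflection", and the `boost` and `decay` constants are all invented here and do not describe the production algorithms.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Record:
    text: str
    salience: float = 1.0


@dataclass
class Substrate:
    records: list[Record] = field(default_factory=list)


def write(substrate: Substrate, text: str) -> None:
    """1. Memory: store the interaction as a record."""
    substrate.records.append(Record(text))


def reflect(substrate: Substrate) -> set[str]:
    """2. Reflection (toy): texts stored more than once count as recurring themes."""
    counts = Counter(r.text for r in substrate.records)
    return {text for text, n in counts.items() if n > 1}


def select(substrate: Substrate, recurring: set[str], k: int = 2) -> list[Record]:
    """3. Cognition (toy): prefer recurring themes, then higher salience."""
    ranked = sorted(
        substrate.records,
        key=lambda r: (r.text in recurring, r.salience),
        reverse=True,
    )
    return ranked[:k]


def render(selected: list[Record]) -> str:
    """4. Decision: stand-in for the model that turns structure into text."""
    return "; ".join(r.text for r in selected)


def learn(substrate: Substrate, selected: list[Record],
          boost: float = 0.5, decay: float = 0.9) -> None:
    """5. Learning: used records gain salience, unused records decay."""
    used = {id(r) for r in selected}
    for r in substrate.records:
        r.salience = r.salience + boost if id(r) in used else r.salience * decay


def cycle(substrate: Substrate, text: str) -> str:
    """One pass of the loop; the substrate it leaves behind differs from the one it read."""
    write(substrate, text)
    selected = select(substrate, reflect(substrate))
    output = render(selected)
    learn(substrate, selected)
    return output
```

Calling `cycle` repeatedly on the same `Substrate` shows the recursion: the saliences written by `learn` in one pass change what `select` returns in the next.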
THE SUBSTRATE
Four tiers, scored by salience, consolidated in sleep.
The substrate is concrete. You can describe what each part should structurally produce without naming a model, which is precisely the test we hold every module to.
Four-tier memory
- Semantic: facts, architecture, durable knowledge
- Episodic: events and sessions, anchored in time
- Procedural: how-tos, patterns, ways of working
- Foundational: identity, values, core beliefs
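As a data structure, the four tiers can be pictured as a tagged record plus a store keyed by tier. The sketch below is illustrative only: `Tier`, `MemoryRecord`, and `TieredStore` are hypothetical names, and the in-memory dictionary is an assumption, not a description of the real storage layer.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Tier(Enum):
    SEMANTIC = "semantic"          # facts, architecture, durable knowledge
    EPISODIC = "episodic"          # events and sessions, anchored in time
    PROCEDURAL = "procedural"      # how-tos, patterns, ways of working
    FOUNDATIONAL = "foundational"  # identity, values, core beliefs


@dataclass
class MemoryRecord:
    tier: Tier
    content: str
    # Timestamp every record so episodic entries can be anchored in time.
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class TieredStore:
    """Hypothetical in-memory store with one list of records per tier."""

    def __init__(self) -> None:
        self._tiers: dict[Tier, list[MemoryRecord]] = {tier: [] for tier in Tier}

    def write(self, tier: Tier, content: str) -> MemoryRecord:
        record = MemoryRecord(tier, content)
        self._tiers[tier].append(record)
        return record

    def read(self, tier: Tier) -> list[MemoryRecord]:
        # Return a copy so callers cannot mutate the tier's list in place.
        return list(self._tiers[tier])
```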
Salience and decay
- Recall depth is scored, so the right memories surface at the right time.
- Use strengthens a memory; neglect lets it fade, the way attention should work.
- Contradiction and recency are weighed, not just vector similarity.
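One way to picture a score that weighs more than vector similarity is sketched below. The specific form is an assumption made for illustration: a multiplicative combination of similarity, salience, and an exponential half-life decay, with a fixed penalty for contradicted records. The names `ScoredMemory`, `recall_score`, and `reinforce`, and the default constants, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScoredMemory:
    similarity: float          # 0..1 vector similarity to the current query
    salience: float            # grows when the memory is used
    age_days: float            # days since the memory was last accessed
    contradicted: bool = False # True if a newer record disputes this one


def recall_score(
    m: ScoredMemory,
    half_life_days: float = 30.0,       # assumed: relevance halves every 30 idle days
    contradiction_penalty: float = 0.5, # assumed: disputed records count half
) -> float:
    """Combine similarity, salience, recency, and contradiction into one score."""
    decay = 0.5 ** (m.age_days / half_life_days)
    score = m.similarity * m.salience * decay
    return score * contradiction_penalty if m.contradicted else score


def reinforce(m: ScoredMemory, boost: float = 0.1) -> None:
    """Using a memory strengthens it and resets its idle clock."""
    m.salience += boost
    m.age_days = 0.0
```

Under these assumed defaults, a memory with similarity 0.8 that has sat idle for one half-life scores below a fresh memory with similarity 0.6, which is the behavior a pure similarity ranking cannot express.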
Dream consolidation
- Offline passes compress related events into reusable patterns.
- Relationships between memories are synthesized, not merely stored.
- Cross-agent recall over MCP lets tools share one substrate.
Memory consolidation is the part we find most interesting. Biological memory is not written once and read forever; it is replayed and reorganized during sleep. Our consolidation engine borrows that shape. It runs offline, reads back recent episodic memory, and writes denser semantic and procedural memory that captures the pattern rather than every instance. The result is a substrate that gets more useful as it gets older, instead of accumulating noise. The architecture behind this is detailed at research.cognitivx.io.
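The shape of an offline consolidation pass can be sketched in a few lines. This is a deliberately simple stand-in, not the consolidation engine itself: it assumes each episode already carries a `topic` label, groups by that label, and emits one denser pattern per topic that recurs at least `min_support` times. `Episode`, `Pattern`, and `consolidate` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    topic: str   # assumed to be assigned upstream
    detail: str


@dataclass(frozen=True)
class Pattern:
    topic: str
    support: int  # number of episodes the pattern was built from
    summary: str  # deduplicated details, joined into one line


def consolidate(episodes: list[Episode], min_support: int = 2) -> list[Pattern]:
    """Compress repeated episodes into one pattern per recurring topic."""
    by_topic: dict[str, list[str]] = defaultdict(list)
    for episode in episodes:
        by_topic[episode.topic].append(episode.detail)

    patterns = []
    for topic, details in by_topic.items():
        if len(details) >= min_support:
            # Keep each distinct detail once: the pattern, not every instance.
            unique = sorted(set(details))
            patterns.append(Pattern(topic, len(details), "; ".join(unique)))
    return patterns
```

In this sketch, three "deploy" episodes with two distinct details collapse into a single pattern with `support=3`, while a one-off episode produces no pattern at all.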
INFLUENCES AND LINEAGE
Standing on cognitive science, not just deep learning.
The work draws on a long tradition of cognitive architecture. Production-rule architectures inform how we separate declarative memory from procedural skill. Work on generative agents informs how reflection and memory streams produce coherent behavior over time. Predictive-processing accounts of cognition inform our interest in systems that model their own state and update on surprise.
We are deliberate about which of these ideas are engineered and which are aspirational. Some are shipped mechanisms you can call today through the cogx platform. Others, such as collective intelligence across agents and anything resembling emergent self-awareness, are stated as research direction and vision, not as delivered capability. The ledger below makes that line explicit.
WHERE THE WORK STANDS
Honest about shipped versus in progress.
A research lab is only credible if it says which claims are load-bearing today and which are direction. Here is the ledger.
| Capability | Status | Notes |
|---|---|---|
| Four-tier memory (semantic, episodic, procedural, foundational) | Shipped | Core write and recall path. |
| Salience-scored recall with decay | Shipped | Recall depth is metered and weighted. |
| Reflection and introspection | Shipped | The system reasons about its own state. |
| Pattern detection | Shipped | Structure found across scattered memories. |
| Dream consolidation (offline) | Shipped | Compresses events into durable patterns. |
| Cross-agent recall over MCP | Shipped | Tools share one memory substrate. |
| Long-horizon memory evaluation | Shipped | See the LongMemEval benchmark page. |
| Collective intelligence across agents | In progress | Research direction, stated as vision. |
| Emergent self-modeling | In progress | Long-horizon aspiration, not a claim. |
BENCHMARKS
We measure memory where it is hard: over long horizons.
Short-context question answering is solved enough to be uninteresting. The hard problem is recall across many sessions and long stretches of time, where a system must retrieve the right memory from a large, noisy history rather than from the last few thousand tokens.
We evaluate the LCM on long-horizon memory tasks, including LongMemEval, a benchmark for recall over extended conversational histories. Our methodology, the exact task setup, and current results live on the LongMemEval benchmark page, so the numbers stay in one authoritative place instead of being restated here and drifting out of date.
Our principle for evaluation mirrors our principle for architecture. A benchmark should measure the cognition, not the prose. We hold the model fixed and vary the memory substrate, so that an improvement in the score reflects better recall and consolidation rather than a better-sounding renderer.
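The "fix the model, vary the substrate" design can be illustrated with a toy harness. Nothing here reflects the actual LongMemEval setup: the history, the questions, the two retrieval functions, and the deterministic `fixed_renderer` are all invented for the example, and exact-match accuracy is an assumed metric.

```python
from typing import Callable

# A substrate maps a question to a ranked list of memories.
Retriever = Callable[[str], list[str]]
# The renderer is the fixed component shared by every run.
Renderer = Callable[[str, list[str]], str]

HISTORY = [
    "the API key rotates monthly",
    "the staging DB is postgres",
    "lunch was pizza",
]

CASES = [
    ("which db is staging", "the staging DB is postgres"),
    ("when does the api key rotate", "the API key rotates monthly"),
]


def fixed_renderer(question: str, memories: list[str]) -> str:
    """Deterministic stand-in for the fixed model: echo the top memory."""
    return memories[0] if memories else ""


def recency_substrate(question: str) -> list[str]:
    """Baseline: most recent memory first, ignoring the question."""
    return HISTORY[::-1]


def keyword_substrate(question: str) -> list[str]:
    """Variant: rank memories by word overlap with the question."""
    words = set(question.lower().split())
    return sorted(
        HISTORY,
        key=lambda memory: len(words & set(memory.lower().split())),
        reverse=True,
    )


def evaluate(retriever: Retriever, renderer: Renderer,
             cases: list[tuple[str, str]]) -> float:
    """Exact-match accuracy with the renderer held constant."""
    correct = sum(renderer(q, retriever(q)) == expected for q, expected in cases)
    return correct / len(cases)
```

Because `fixed_renderer` is identical in both runs, any difference between `evaluate(recency_substrate, fixed_renderer, CASES)` and `evaluate(keyword_substrate, fixed_renderer, CASES)` can only come from retrieval.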
Read deeper, or build on the substrate.
The full architecture writeups live at research.cognitivx.io. When you want to put living memory into your own app, the cogx platform exposes the same cognition through an SDK, an HTTP API, and MCP.