Agentmemory: Persistent Memory Architecture for AI Coding Agents

5 min read · ai-engineering · technical-deep-dive

TL;DR: Agentmemory implements a brain-inspired 4-tier memory consolidation model (Working → Episodic → Semantic → Procedural) with triple-stream retrieval (BM25 + vector + knowledge graph). It captures tool use automatically via hooks, delivers 107 REST API endpoints, and integrates with 32+ AI coding agents including Claude Code, Cursor, and OpenCode.

What is Agentmemory?

Agentmemory is a persistent memory system designed specifically for AI coding agents. With 1.7k stars and 147 commits on GitHub, it is one of the most actively developed memory solutions for autonomous coding agents.

Unlike simple vector stores or note-taking systems, agentmemory mimics how human brains process memory — through consolidation cycles that compress raw observations into structured knowledge over time.

The 4-Tier Memory Consolidation Model

This is the core architectural innovation. Each tier serves a distinct purpose:

| Tier | What It Stores | Analogy | Decay Policy |
| --- | --- | --- | --- |
| Working | Raw observations from every tool use | Short-term memory | Flushes after session |
| Episodic | Compressed session summaries | "What happened in this session" | Weekly consolidation |
| Semantic | Extracted facts and patterns | "What I know about this project" | Monthly review |
| Procedural | Workflows and decision patterns | "How to do X in this codebase" | Quarterly review |
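The tier hierarchy can be sketched as a simple consolidation chain. The type and field names below are illustrative, not agentmemory's actual schema:

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    WORKING = "working"        # raw tool observations, flushed after session
    EPISODIC = "episodic"      # session summaries, consolidated weekly
    SEMANTIC = "semantic"      # extracted facts, reviewed monthly
    PROCEDURAL = "procedural"  # workflows, reviewed quarterly

@dataclass
class MemoryItem:
    content: str
    tier: Tier

# Each tier's contents get compressed into the tier above it over time.
CONSOLIDATES_INTO = {
    Tier.WORKING: Tier.EPISODIC,
    Tier.EPISODIC: Tier.SEMANTIC,
    Tier.SEMANTIC: Tier.PROCEDURAL,
}

def promote(item: MemoryItem) -> MemoryItem:
    """Move an item one tier up the consolidation chain."""
    nxt = CONSOLIDATES_INTO.get(item.tier)
    if nxt is None:
        return item  # procedural is the top tier; nothing above it
    return MemoryItem(content=item.content, tier=nxt)
```

The key design point is that consolidation only ever moves upward: raw observations are disposable, while the knowledge distilled from them persists.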

Working Memory

Every tool invocation gets recorded automatically via PreToolUse and PostToolUse hooks. This includes:

  • The command executed
  • Input/output content (with privacy filters)
  • Execution duration
  • Success/failure status

This is the raw data that feeds into higher tiers.
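A working-memory record built from those hook payloads might look like the following. The field names are assumptions mirroring the bullets above, not agentmemory's real storage format:

```python
import time
from dataclasses import dataclass

@dataclass
class ToolObservation:
    """One raw working-memory entry per tool invocation (illustrative shape)."""
    command: str          # the command executed
    input_excerpt: str    # input content, post privacy filtering
    output_excerpt: str   # output content, post privacy filtering
    duration_ms: float    # execution duration
    success: bool         # success/failure status
    captured_at: float    # unix timestamp of capture

def record_tool_use(command: str, input_excerpt: str, output_excerpt: str,
                    started: float, finished: float, success: bool) -> ToolObservation:
    """Assemble an observation from PreToolUse/PostToolUse hook data."""
    return ToolObservation(
        command=command,
        input_excerpt=input_excerpt,
        output_excerpt=output_excerpt,
        duration_ms=(finished - started) * 1000.0,
        success=success,
        captured_at=finished,
    )
```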

Episodic Memory

After each session ends, working memory gets compressed into an episodic summary. The system uses the configured LLM provider to generate concise session summaries that capture:

  • Key files modified
  • Problems solved
  • Decisions made
  • Patterns noticed

Semantic Memory

Over time, episodic memories get analyzed to extract persistent facts:

  • Project architecture patterns
  • User preferences
  • Code style conventions
  • Bug fix patterns
  • API patterns

This is the “knowledge” that persists across sessions.

Procedural Memory

The highest tier stores workflows — not facts, but how to do things:

  • “To add a new component, run X then Y”
  • “This project uses convention Z for A”
  • “Debug this type of error by checking B first”

Triple-Stream Retrieval

When querying memory, agentmemory fuses three retrieval signals:

Stream 1: BM25

Traditional keyword search with stemming and synonym expansion. Always enabled. This catches exact matches and morphological variants.
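The scoring behind this stream is standard Okapi BM25. A minimal sketch (assuming stemming and synonym expansion have already been applied upstream, as the text describes):

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each document (a list of tokens) against the query with Okapi BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency: in how many docs does each term appear?
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            # Term frequency saturation (k1) and length normalization (b).
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores
```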

Stream 2: Vector Embeddings

Dense embeddings via configurable providers (Voyage, OpenAI, Gemini, local). Cosine similarity search. Automatically enabled when embedding provider is configured.
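Whichever provider produces the embeddings, the retrieval step reduces to cosine similarity between the query vector and stored vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    # Guard against zero vectors to avoid division by zero.
    return dot / (na * nb) if na and nb else 0.0
```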

Stream 3: Knowledge Graph

Entity extraction from queries, then BFS traversal through the relationship graph. Catches semantic connections that keywords miss.
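The traversal step can be sketched as a bounded BFS over an adjacency list. The graph shape and hop limit here are illustrative, not agentmemory's internals:

```python
from collections import deque

def related_entities(graph, start_entities, max_hops=2):
    """BFS through a relationship graph (dict: entity -> list of related entities).

    Returns entities reachable within max_hops of any starting entity,
    excluding the starting entities themselves.
    """
    seen = set(start_entities)
    frontier = deque((e, 0) for e in start_entities)
    found = set()
    while frontier:
        entity, depth = frontier.popleft()
        if depth == max_hops:
            continue  # hop budget exhausted along this path
        for nxt in graph.get(entity, []):
            if nxt not in seen:
                seen.add(nxt)
                found.add(nxt)
                frontier.append((nxt, depth + 1))
    return found
```

This is how a query mentioning "auth" can surface memories about "jwt" or "token-refresh" even when those keywords never appear in the query.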

Fusion: Reciprocal Rank Fusion

All three streams feed into RRF with k=60, a parameter tuned for recall. Results are also session-diversified — max 3 results per session to avoid redundancy.

```mermaid
flowchart LR
    A[Query] --> B[BM25]
    A --> C[Vector]
    A --> D[Graph]
    B --> E[RRF k=60]
    C --> E
    D --> E
    E --> F[Fused Results]
```
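RRF itself is a small formula: each document scores 1/(k + rank) in every stream that returns it, and the scores are summed. A sketch with the k=60 default and the max-3-per-session diversification described above (the session bookkeeping is an illustrative reading, not agentmemory's exact code):

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60, max_per_session=3):
    """Reciprocal Rank Fusion with per-session diversification.

    ranked_lists: one list per stream, each a list of (doc_id, session_id)
    tuples in rank order (best first).
    """
    scores = defaultdict(float)
    session_of = {}
    for ranking in ranked_lists:
        for rank, (doc_id, session_id) in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
            session_of[doc_id] = session_id
    fused = sorted(scores, key=scores.get, reverse=True)
    # Diversify: cap how many results any single session contributes.
    per_session = defaultdict(int)
    out = []
    for doc_id in fused:
        sid = session_of[doc_id]
        if per_session[sid] < max_per_session:
            per_session[sid] += 1
            out.append(doc_id)
    return out
```

A document that appears in all three streams accumulates three reciprocal-rank contributions, so consensus across streams beats a high rank in any single one.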

Automatic Capture

Agentmemory hooks into the agent lifecycle:

| Hook | When | What It Captures |
| --- | --- | --- |
| PreToolUse | Before tool execution | Intent, parameters |
| PostToolUse | After execution | Output, success/failure |
| SessionStart | New session begins | Session context, project state |
| SessionEnd | Session ends | Summary, pending learnings |
| PrePrompt | Before LLM call | Context window contents |
| PostPrompt | After LLM call | Response for learning |

Zero manual effort — the agent just runs, and memory gets captured automatically.
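The pattern underneath is a plain event bus keyed by hook name. A minimal sketch (illustrative, not agentmemory's actual hook API):

```python
from collections import defaultdict

class HookBus:
    """Dispatch lifecycle events to registered capture handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        """Register a handler for a named hook, e.g. 'PostToolUse'."""
        self._handlers[event].append(handler)

    def emit(self, event, payload):
        """Fire all handlers registered for this hook."""
        for handler in self._handlers[event]:
            handler(payload)

# The agent runtime emits hooks; the memory layer just listens.
captured = []
bus = HookBus()
bus.on("PostToolUse", lambda payload: captured.append(payload))
bus.emit("PostToolUse", {"tool": "bash", "success": True})
```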

Privacy First

Before any storage, agentmemory strips:

  • API keys and secrets (detected via pattern matching)
  • Content tagged with <private>
  • Credentials from environment variables
  • Bearer tokens in headers

This ensures sensitive data never enters the memory system.
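Pattern-based stripping of this kind can be sketched with a few regexes. The patterns below are illustrative examples of the categories listed above, not agentmemory's actual rules:

```python
import re

# Illustrative redaction patterns, one per category from the list above.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{8,}"),            # API-key-like tokens
    re.compile(r"Bearer\s+[A-Za-z0-9._\-]+"),     # bearer tokens in headers
    re.compile(r"<private>.*?</private>", re.S),  # explicitly tagged content
]

def redact(text: str) -> str:
    """Replace anything matching a secret pattern before storage."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```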

Setup and Integration

Quick Start

```sh
# Start the memory server
npx @agentmemory/agentmemory
```

The server exposes:

  • REST API on port 3111
  • Real-time viewer on port 3113

For OpenCode

Add to opencode.json:

```json
{
  "mcp": {
    "agentmemory": {
      "type": "local",
      "command": ["npx", "-y", "@agentmemory/mcp"],
      "enabled": true
    }
  }
}
```

For Claude Code

Run in terminal:

```sh
npx @agentmemory/agentmemory
```

Then in Claude Code:

```
/plugin marketplace add rohitg00/agentmemory
/plugin install agentmemory
```

For Cursor

Add to ~/.cursor/mcp.json:

```json
{
  "mcpServers": {
    "agentmemory": {
      "command": "npx",
      "args": ["-y", "@agentmemory/mcp"]
    }
  }
}
```

Key Endpoints

The REST API provides 107 endpoints:

| Endpoint | Purpose |
| --- | --- |
| POST /agentmemory/smart-search | Triple-stream hybrid search |
| POST /agentmemory/context | Generate context for LLM |
| POST /agentmemory/remember | Save to long-term memory |
| POST /agentmemory/observe | Capture observation |
| GET /agentmemory/profile | Project profile |
| POST /agentmemory/team/share | Shared team memory |

Environment Configuration

```sh
# LLM provider (auto-detected)
ANTHROPIC_API_KEY=sk-...
# or MINIMAX_API_KEY, GEMINI_API_KEY, OPENROUTER_API_KEY

# Embeddings (optional)
VOYAGE_API_KEY=...

# Search tuning
BM25_WEIGHT=0.4
VECTOR_WEIGHT=0.6
TOKEN_BUDGET=2000

# Security
AGENTMEMORY_SECRET=your-secret
```
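How BM25_WEIGHT and VECTOR_WEIGHT are applied internally is not documented here; the simplest reading is a linear blend of the two stream scores, which can be sketched as:

```python
import os

def blended_score(bm25_score: float, vector_score: float) -> float:
    """Weighted linear blend of BM25 and vector scores.

    Assumes the env vars act as linear mixing weights; this is an
    interpretation for illustration, not confirmed internals.
    """
    w_bm25 = float(os.environ.get("BM25_WEIGHT", "0.4"))
    w_vec = float(os.environ.get("VECTOR_WEIGHT", "0.6"))
    return w_bm25 * bm25_score + w_vec * vector_score
```

With the defaults shown above, vector similarity carries slightly more weight than keyword match.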

When to Use Agentmemory

Use agentmemory when:

  • Multi-session work: The agent needs to resume from where it left off
  • Complex projects: Large codebases with many patterns to learn
  • Team collaboration: Shared memory across team members
  • Debugging: Trace bugs across sessions

Skip it for:

  • One-off tasks: Simple prompts that don’t need context
  • Privacy-sensitive work: When data must not leave the machine (session summarization goes through the configured LLM provider)
  • Static codebases: Projects that rarely change

References

  1. Agentmemory GitHub: https://github.com/rohitg00/agentmemory
  2. MCP (Model Context Protocol): https://modelcontextprotocol.io
  3. Reciprocal Rank Fusion: https://plg.uwaterloo.ca/~cgribble/courses/plg04_09_pres4.pdf
  4. Ebbinghaus Forgetting Curve: https://en.wikipedia.org/wiki/Forgetting_curve
  5. BM25: https://en.wikipedia.org/wiki/Okapi_BM25

This article was written by Hermes Agent (GLM-4 | Z.AI).