
kb-mcp

Local, Rust-native MCP server for markdown knowledge bases.

AI agents work better when they have structured access to project knowledge. Markdown vaults — Obsidian, plain directories, skill libraries — are where this knowledge already lives. But agents can’t browse a vault. They need search, filtering, and token-efficient retrieval over content you’ve already written.

kb-mcp bridges that gap. Point it at your markdown directories, and it exposes them as MCP tools that any agent can query — no cloud services, no databases, no infrastructure.

Two Search Modes

kb-mcp ships with two search backends, chosen at build time:

BM25 keyword search (default) — Fast, lightweight, zero dependencies beyond the binary. Powered by Tantivy. Ideal for exact keyword queries like “PostgreSQL connection pool” or “rate limit error”.

Hybrid BM25 + vector search (--features hybrid) — Adds semantic similarity via a local ONNX embedding model (BGE-small-en-v1.5). Conceptual queries like “how do agents share state?” now match documents titled “Shared Memory” that keyword search alone would miss. Results are fused with Reciprocal Rank Fusion — keyword precision isn’t lost, it’s augmented. Still fully local, no API keys.

Where kb-mcp Fits

kb-mcp occupies a specific niche: local, zero-infrastructure knowledge base server for curated markdown. It’s not agent session memory, not a cloud service, not a code search tool.

| Need | kb-mcp | Alternatives |
|---|---|---|
| Serve markdown docs to agents via MCP | Yes — BM25 or hybrid search, token-efficient briefings, write-back | obsidian-web-mcp (remote HTTP, ripgrep) |
| Agent working memory (session state, preferences) | No — use a dedicated memory system | hipocampus, hmem |
| Cloud-hosted memory service | No — fully local, single binary | mengram |
| Semantic code search | No — markdown only | mnemex |

The memory-focused projects are complementary — an agent could use hmem for working memory and kb-mcp for its knowledge base. See the full Landscape survey for detailed comparisons.

Features

  • 10 MCP tools — list, search, get, context briefing, write, digest, query, export, health, reindex
  • CLI parity — every MCP tool works as a CLI subcommand
  • RON configuration — typed, Rust-native config with comments
  • Collection model — multiple collections with sections, descriptions, and writable flags
  • Token-efficient — kb_context returns frontmatter + summary without the full body
  • Write-back — kb_write creates notes with proper frontmatter in writable collections
  • ~1,700 lines of Rust — small, auditable, single-binary

Built With Itself

kb-mcp ships with everything you need to try it immediately. The included collections.example.ron indexes the project’s own documentation and an AI agent memory vault — so cloning the repo and running kb-mcp list-sections gives you a working knowledge base out of the box.

This isn’t a demo afterthought. The project is both the tool and a practical example of using it. The vault documents AI agent memory patterns. The Researcher Agent uses kb-mcp to research topics and curate findings back into the vault. Every feature gets consumed by the project itself — if it doesn’t work well for us, it won’t work well for anyone.

Point an AI agent at this repo with kb-mcp as an MCP server and it can search the docs, read architecture decisions, and understand how the project works — using the very tool the project builds.

Quick Start

# Install both binaries (CLI `kb` + MCP server `kb-mcp`)
just install

# Or install individually
cargo install --path crates/kb-cli          # installs `kb`
cargo install --path crates/kb-mcp-server   # installs `kb-mcp`

# Create config
cp collections.example.ron collections.ron
# Edit paths to point at your markdown directories

# Use as CLI
kb list-sections
kb search --query "your query"

# Use as MCP server (register in .mcp.json — binary name is still `kb-mcp`)

Goals

Why kb-mcp Exists

AI agents work better when they have structured access to project knowledge. Markdown vaults — Obsidian, plain directories, skill libraries — are where this knowledge lives. But agents can’t browse a vault. They need search, filtering, and token-efficient retrieval.

kb-mcp bridges this gap: index markdown collections, expose them as MCP tools, let agents query and contribute to the knowledge base.

Design Principles

Project-agnostic. The binary knows nothing about any specific vault, project, or directory structure. Everything comes from collections.ron. One binary serves any project with markdown files.

Configuration as data. Section descriptions, collection paths, writable flags — all RON config. No recompilation to change what gets indexed or how it’s described.

Token-efficient by default. kb_context exists because agents shouldn’t read 50 documents to find the 3 that matter. Frontmatter + summary first, full content on demand.

CLI parity. Every MCP tool works as a CLI subcommand. Testing, scripting, and debugging don’t require an MCP client.

Fresh reads over cached content. get_document reads from disk, not the index. Edits are visible immediately. The index is for search and lookup — not content serving.

Simple until proven insufficient. Start with the simplest approach that works. Persistent storage, vector search, and incremental reindex come when the simple approach hits real limits.

Dogfood everything. This project is both the tool and a practical example of using it. The vault documents AI agent memory. The collections.example.ron indexes the project’s own docs. The container agent uses kb-mcp to research and curate the vault. Every feature we build, we also consume — if it doesn’t work well for us, it won’t work well for anyone.

Non-Goals

These reflect the project’s current focus, not permanent boundaries. As kb-mcp matures and usage patterns emerge, any of these could become goals in a future phase.

Not a general-purpose search engine. kb-mcp indexes markdown files with YAML frontmatter. It does not index code, logs, databases, or arbitrary file formats.

Not a document editor. kb_write creates new files. It does not edit, rename, move, or delete existing documents. Vault management stays in the editor (Obsidian, VS Code, etc.).

Not a vector database. Even with hybrid search planned, the primary interface is BM25 keyword search. Vector similarity augments it — it doesn’t replace it.

Not a multi-user system. kb-mcp serves one agent session at a time (stdio is 1:1). HTTP mode may support multiple clients, but there is no authentication, permission model, or multi-tenancy.

Not a replacement for agent memory. kb-mcp serves project knowledge (docs, guides, skills). Agent memory (preferences, session state, learned behaviors) belongs in the agent’s own memory system.

Not a web application. No UI, no dashboard, no browser interface. CLI + MCP is sufficient. Agents are the primary consumers.

See Also

Roadmap

What’s been built, what’s next, and when each phase makes sense.

Completed

Standalone MCP Server (v2)

RON-configured binary with 10 MCP tools, full CLI parity, and zero hardcoded project-specific values. Open sourced at github.com/ttdonovan/kb-mcp.

Persistent Storage (memvid-core)

Replaced in-memory Tantivy with memvid-core .mv2 persistent files. Incremental reindex via blake3 content hashing. Smart markdown chunking. Crash-safe WAL.

Containerized Researcher Agent

ZeroClaw container with kb-mcp + DuckDuckGo web search. Writes research findings to vault/drafts/ for human review. IDENTITY.md/SOUL.md define agent personality and quality standards.

Hybrid Search (kb-mcp Phase 2)

BM25 + vector search via memvid-core vec feature. Local ONNX embeddings (BGE-small-en-v1.5), HNSW vector index in .mv2 files, RRF fusion. Opt-in via cargo build --features hybrid. Container supports just agent-build-hybrid.

Vault Intelligence Bundle

Three new MCP tools (kb_digest, kb_query, kb_export) plus transparent auto-reindex via directory mtime checks. Brings the tool count from 6 to 9. kb_write also gained optional directory and filename parameters for hierarchical collection structures.

Cargo Workspace Split

Reorganized from a single binary crate into a three-crate workspace: kb-core (shared library), kb-cli (binary kb), kb-mcp-server (binary kb-mcp). Eliminates CLI/MCP code duplication, guarantees behavioral parity through shared kb_core::format::* functions, and enables independent compilation with clean dependency isolation.

Up Next

Draft Reviewer Agent

A second container agent (or Claude Code sub-agent) that reviews drafts for quality, formatting consistency, source verification, and proper frontmatter — then promotes approved entries into the vault.

Why now: The researcher agent is producing drafts. Manual review is the bottleneck. Automating the quality gate completes the capture pipeline.

Scope:

  • Read drafts collection, check against vault conventions (SOUL.md standards)
  • Verify sources are reachable URLs
  • Ensure frontmatter has required fields (tags, created, updated, sources, target)
  • Promote approved drafts to the correct vault section
  • Flag issues for human attention rather than silently fixing

Heartbeat Scheduling

Add HEARTBEAT.md to the researcher agent for automated periodic research.

Why now: The researcher works well manually. Scheduling is a small addition that makes it run on autopilot for configured topics.

Scope:

  • HEARTBEAT.md defines research topics and frequency
  • ZeroClaw cron runs the research workflow on schedule
  • Digest report of what was added (file or notification)

Future

Knowledge Capture Tools (kb-mcp Phase 3)

Specialized write tools beyond free-form kb_write:

  • kb_capture_session — record debugging/coding sessions
  • kb_capture_fix — record bug fixes with symptom/cause/resolution
  • kb_classify — auto-tag unprocessed notes (type, tags, summary)

When: When agents are actively writing and would benefit from structured capture templates.

Gap Analyzer Agent

Reads the existing vault, identifies thin or missing topics, and feeds the researcher agent with specific research requests. The inward-facing complement to the outward-facing researcher.

When: When the vault is large enough that gaps aren’t obvious from browsing.

Knowledge Keeper Agent

Combines researcher + reviewer + gap analyzer into the full Knowledge Keeper pattern. Sweeps sessions, scores knowledge by usefulness, prunes stale entries. The most autonomous form of vault curation.

When: When all three component agents are proven individually.

HTTP Daemon Mode (kb-mcp Phase 4)

Add HTTP transport alongside stdio. Long-lived server eliminates MCP cold starts and enables network-based access (no volume mount needed for container agents).

When: When cold start latency becomes a problem (more likely now that hybrid search loads an ONNX model at startup).

Cross-Agent Knowledge Sharing (kb-mcp Phase 5)

Multiple projects share knowledge through federated .mv2 files. Agents in different repos contribute to and query from shared collections.

When: When multiple projects are actively using kb-mcp and would benefit from shared context.

Configuration

kb-mcp is configured via a collections.ron file that defines what markdown directories to index.

Config File

(
    // Optional: override cache directory (default: ~/.cache/kb-mcp)
    // cache_dir: "~/.cache/kb-mcp",
    collections: [
        (
            name: "docs",
            path: "docs",
            description: "Project documentation",
            writable: false,
            sections: [
                (prefix: "guides", description: "How-to guides"),
                (prefix: "reference", description: "API reference"),
            ],
        ),
        (
            name: "notes",
            path: "notes",
            description: "Working notes",
            writable: true,
            sections: [],
        ),
    ],
)

Fields

| Field | Type | Required | Description |
|---|---|---|---|
| cache_dir | String | No | Cache directory for index files (default: ~/.cache/kb-mcp) |
| collections | List | Yes | One or more collection definitions |

Collection Fields

| Field | Type | Required | Description |
|---|---|---|---|
| name | String | Yes | Unique identifier for the collection |
| path | String | Yes | Directory path (relative to config file) |
| description | String | Yes | Human-readable description |
| writable | Bool | No | Allow kb_write to create files (default: false) |
| sections | List | No | Section definitions for this collection |

Section Fields

| Field | Type | Required | Description |
|---|---|---|---|
| prefix | String | Yes | Directory prefix that identifies this section |
| description | String | Yes | Human-readable description |
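A section prefix maps each document's leading path component to a description. As an illustrative sketch (Python here for brevity; kb-mcp itself is Rust, and matching on only the first path component is an assumption based on the examples in this chapter):

```python
# Hypothetical prefix-to-description mapping, mirroring the example config above.
SECTIONS = {
    "guides": "How-to guides",
    "reference": "API reference",
}

def section_for(path):
    """Return the section description for a document path, or None.

    A document like "guides/setup.md" falls under the "guides" section;
    root-level documents match no section prefix.
    """
    prefix = path.split("/", 1)[0] if "/" in path else ""
    return SECTIONS.get(prefix)
```

Documents in directories without a defined prefix still index and search normally; they simply have no section description.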

Resolution Order

kb-mcp searches for configuration in this order:

  1. --config <path> CLI flag (explicit)
  2. KB_MCP_CONFIG environment variable
  3. ./collections.ron (current working directory)
  4. ~/.config/kb-mcp/collections.ron (user default)

Collection paths resolve relative to the config file’s parent directory.
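The four-step lookup can be sketched as a simple fall-through (illustrative Python, not kb-mcp's Rust implementation):

```python
import os

def resolve_config(cli_path=None, env=None, cwd="."):
    """Sketch of the documented resolution order for collections.ron."""
    env = env if env is not None else os.environ
    if cli_path:                                  # 1. --config <path> flag
        return cli_path
    if env.get("KB_MCP_CONFIG"):                  # 2. environment variable
        return env["KB_MCP_CONFIG"]
    local = os.path.join(cwd, "collections.ron")  # 3. current working directory
    if os.path.exists(local):
        return local
    # 4. user default
    return os.path.expanduser("~/.config/kb-mcp/collections.ron")
```

The CLI flag always wins, which makes per-invocation overrides easy to script.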

Cross-Project Use

Install kb-mcp globally, then point other projects at a specific config:

{
  "mcpServers": {
    "kb": {
      "command": "kb-mcp",
      "env": {
        "KB_MCP_CONFIG": "/path/to/project/collections.ron"
      },
      "args": []
    }
  }
}

MCP Server

kb-mcp runs as an MCP server when invoked with no arguments. It communicates over stdio using the JSON-RPC protocol.

Registration

Add to your project’s .mcp.json:

{
  "mcpServers": {
    "kb": {
      "command": "kb-mcp",
      "args": []
    }
  }
}

The binary must be in your PATH (install via just install-server or cargo install --path crates/kb-mcp-server).

Startup

On startup, kb-mcp:

  1. Loads collections.ron (see Configuration)
  2. Scans all collection directories for .md files
  3. Builds the BM25 search index in memory
  4. Starts the MCP stdio transport

Logs go to stderr. Startup typically takes <1s for ~200 documents.

Agent Workflow

A typical agent session:

  1. Call list_sections to see what’s available
  2. Call search to find relevant documents
  3. Call kb_context on promising results to scan frontmatter + summary
  4. Call get_document only on documents worth reading in full
  5. Call kb_write to capture new knowledge (writable collections only)
  6. Call reindex after creating new files mid-session

CLI

The kb binary provides CLI access to all MCP tools. Every MCP tool has a CLI equivalent.

Commands

# List all sections
kb list-sections

# Search
kb search --query "rate limits"
kb search --query "bevy" --collection skills
kb search --query "agents" --scope runtimes
kb search --query "MCP" --max-results 5

# Get full document
kb get-document --path "concepts/mcp-server-pattern.md"

# Token-efficient briefing
kb context --path "concepts/mcp-server-pattern.md"

# Write a note (writable collection only)
kb write --collection notes --title "My Note" --body "Content" --tags "tag1,tag2"

# Rebuild index
kb reindex

Global Options

| Flag | Description |
|---|---|
| --config <path> | Path to collections.ron config file |

Output

All commands output JSON to stdout. Errors go to stderr.

Hybrid Search

By default, kb-mcp uses BM25 keyword search via Tantivy. Enable the hybrid feature to add vector similarity search alongside BM25.

Why Hybrid?

BM25 is excellent for exact keyword queries (“PostgreSQL connection pool”, “rate limit error”). But it misses conceptual matches:

| Query | BM25 alone | Hybrid (BM25 + vector) |
|---|---|---|
| “how do agents share state?” | Misses “Shared Memory” doc | Matches via semantic similarity |
| “memory architecture” | Finds docs with those exact words | Also finds “Cognitive Memory Model” |
| “BM25 ranking” | Works perfectly | Works perfectly (BM25 still contributes) |

Hybrid search combines both signals using Reciprocal Rank Fusion (RRF), so keyword precision isn’t lost — it’s augmented.

How It Works

  1. At ingest time: Each document is embedded into a 384-dimensional vector using BGE-small-en-v1.5 (local ONNX, no cloud API)
  2. At query time: The query is embedded, then both BM25 and HNSW vector search run in parallel
  3. Fusion: Results are merged via RRF (score = Σ 1/(k + rank)) with k=60, combining keyword and semantic signals

The vector index is stored inside the .mv2 file alongside the Tantivy BM25 index — no separate database or service.
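The fusion step is compact enough to sketch. This is an illustrative Python version of RRF with the stated formula and k=60, not memvid-core's actual Rust code:

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked result lists with Reciprocal Rank Fusion.

    rankings: list of ranked doc-id lists, best match first.
    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so agreement between BM25 and vector search pushes a document up.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from the two backends:
bm25   = ["shared-memory.md", "agents.md"]
vector = ["agents.md", "state.md", "shared-memory.md"]
fused  = rrf_fuse([bm25, vector])
```

Because only ranks matter, RRF needs no score normalization between the BM25 and vector backends, which is why it is a common default for hybrid fusion.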

Setup

1. Install with hybrid feature

cargo install --path . --features hybrid

This pulls in ONNX Runtime and the HNSW library. Build time is longer than the default BM25-only build.

2. Download the embedding model

The BGE-small-en-v1.5 model files (~34MB total) must be present locally:

mkdir -p ~/.cache/memvid/text-models

curl -L -o ~/.cache/memvid/text-models/bge-small-en-v1.5.onnx \
  https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/onnx/model.onnx

curl -L -o ~/.cache/memvid/text-models/bge-small-en-v1.5_tokenizer.json \
  https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/tokenizer.json

3. Re-index to build vector index

Existing .mv2 files only have BM25 data. Delete the cache to force a full re-ingest with embeddings:

rm -rf ~/.cache/kb-mcp/   # or ~/Library/Caches/kb-mcp/ on macOS
kb-mcp list-sections      # triggers re-ingest with vectors

Usage

No changes to your queries. The search tool automatically uses hybrid mode when compiled with the feature. Agents get better results without knowing about search modes.

# These work the same — but with hybrid, conceptual matches are found
kb-mcp search --query "how do agents share state?"
kb-mcp search --query "memory architecture" --collection vault

Default Build (No Hybrid)

If you don’t need vector search, the default build stays lightweight:

cargo install --path .   # BM25 only, no ONNX dependency

The same search queries work — they just use BM25 keyword matching without the vector similarity signal.

Technical Details

  • Model: BGE-small-en-v1.5 (BAAI), 384 dimensions
  • Index: HNSW graph (brute-force below 1000 vectors)
  • Distance: L2 (Euclidean)
  • Fusion: Reciprocal Rank Fusion, k=60
  • Storage: Vectors stored inside .mv2 files alongside Tantivy
  • Embedding cache: LRU, 1000 entries, auto-unloads after 5min idle

Researcher

Tools

list_sections

List all collections and their sections with document counts and descriptions.

Parameters: None

Returns: JSON array of sections with name, description, doc_count, and collection.

search

Full-text search across the knowledge base using BM25 ranking.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| query | String | Yes | Search query (supports phrases and boolean operators) |
| collection | String | No | Filter by collection name |
| scope | String | No | Filter by section prefix |
| max_results | Number | No | Maximum results (default: 10) |

Returns: JSON with query, total, and results array. Each result has path, title, section, collection, score, and excerpt.

get_document

Retrieve a document by path or title. Content is read fresh from disk.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| path | String | Yes | Document path or title |

Returns: JSON with path, title, tags, section, collection, and content.

kb_context

Token-efficient document briefing. Returns frontmatter metadata and first paragraph summary without the full body.

Call this to survey relevance before using get_document for full content. Saves 90%+ tokens on retrieval-heavy workflows.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| path | String | Yes | Document path or title |

Returns: JSON with path, title, tags, section, collection, frontmatter (all fields), and summary (first paragraph).
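The shape of a briefing can be sketched as follows. This is an illustrative Python fragment, not kb-mcp's parser — in particular, the naive key: value frontmatter handling (no real YAML parsing) is a simplification:

```python
def kb_context_sketch(text):
    """Extract frontmatter fields and a first-paragraph summary from markdown."""
    frontmatter, body = {}, text
    if text.startswith("---\n"):
        header, _, body = text[4:].partition("\n---\n")
        for line in header.splitlines():
            key, _, value = line.partition(":")
            frontmatter[key.strip()] = value.strip()  # naive: no nested YAML
    # First paragraph only — the full body is deliberately left out.
    summary = body.strip().split("\n\n", 1)[0]
    return {"frontmatter": frontmatter, "summary": summary}
```

The token savings come from the last line: only the first paragraph travels back to the agent, with get_document reserved for documents that prove relevant.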

kb_write

Create a new document in a writable collection. Generates frontmatter with a date-prefixed filename by default. Use directory to write into subdirectories and filename to specify an exact name without date prefix.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| collection | String | Yes | Target collection (must be writable) |
| title | String | Yes | Document title |
| body | String | Yes | Document body (markdown) |
| tags | List | No | Tags for frontmatter |
| status | String | No | Status field for frontmatter |
| source | String | No | Source field for frontmatter |
| directory | String | No | Subdirectory within collection (e.g. “concepts/memory”). Created automatically. |
| filename | String | No | Exact filename (e.g. “cognitive-memory-model.md”). Skips date prefix when provided. |

Returns: JSON with path, collection, title, and tags.

Errors: Returns actionable error if collection is read-only, not found, or directory escapes the collection root.
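The naming behavior can be sketched like this — illustrative Python only; the exact slug rule (lowercase, hyphen-separated) is an assumption, while the date-prefix-by-default and exact-filename override match the documented behavior:

```python
import datetime
import re

def note_filename(title, directory=None, filename=None, today=None):
    """Build a relative path for a new note, date-prefixed unless overridden."""
    if filename is None:
        today = today or datetime.date.today().isoformat()
        # Assumed slug rule: lowercase, non-alphanumerics collapsed to hyphens.
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
        filename = f"{today}-{slug}.md"
    return f"{directory}/{filename}" if directory else filename
```

Passing filename sidesteps the date prefix entirely, which is what hierarchical, permanently-named vault entries want.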

kb_digest

Vault summary — shows collections, sections with topics, recent additions (last 7 days), and thin sections (fewer than 2 documents). Use this to understand what the knowledge base covers before searching.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| collection | String | No | Filter to a specific collection |

Returns: JSON with total_documents, total_sections, and collections array. Each collection has name, doc_count, sections (with topics and gap hints), and recent additions.

kb_query

Filter documents by frontmatter fields. Multiple filters combine with AND logic. Returns document metadata without body content.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| tag | String | No | Filter by tag |
| status | String | No | Filter by frontmatter status field |
| created_after | String | No | YYYY-MM-DD, returns docs created on or after |
| collection | String | No | Filter by collection name |
| has_sources | Boolean | No | Only docs with a sources field |

Returns: JSON with total and documents array. Each document has path, title, tags, section, and collection.
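The AND combination is straightforward to sketch (illustrative Python over dict-shaped documents, not kb-mcp's code). Note that ISO YYYY-MM-DD dates compare correctly as plain strings, which is what makes the created_after check a one-liner:

```python
def kb_query_sketch(docs, tag=None, status=None, created_after=None):
    """Keep only documents passing every supplied frontmatter filter."""
    out = []
    for doc in docs:
        if tag is not None and tag not in doc.get("tags", []):
            continue
        if status is not None and doc.get("status") != status:
            continue
        # ISO dates sort lexicographically, so string comparison suffices.
        if created_after is not None and doc.get("created", "") < created_after:
            continue
        out.append(doc)
    return out
```

An unset filter simply never rejects anything, so calling with no filters returns every document's metadata.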

kb_export

Export vault as a single markdown document. Concatenates documents with frontmatter headers, limited to max_documents to prevent unbounded output.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| collection | String | No | Collection to export (default: all) |
| max_documents | Number | No | Maximum documents to include (default: 200) |

Returns: Concatenated markdown with document separators and frontmatter metadata. Appends a truncation notice if the limit is hit.
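The concatenate-and-truncate shape can be sketched as follows — illustrative Python; the separator style and truncation-notice wording here are assumptions, the max_documents cap is the documented behavior:

```python
def kb_export_sketch(docs, max_documents=200):
    """Join documents into one markdown string, capped at max_documents."""
    parts = [f"## {d['title']}\n\n{d['body']}" for d in docs[:max_documents]]
    text = "\n\n---\n\n".join(parts)
    if len(docs) > max_documents:
        omitted = len(docs) - max_documents
        text += f"\n\n---\n\n(truncated: {omitted} documents omitted)"
    return text
```

The cap exists because an unbounded export of a large vault would blow past any agent's context window.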

kb_health

Vault health diagnostics — checks document quality across collections. Flags missing frontmatter dates, untagged docs, stale content, stub documents, orphaned notes (no inbound wiki-links), and broken wiki-links. Use kb_digest for coverage overview, kb_health for quality issues.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| collection | String | No | Filter to a specific collection |
| stale_days | Number | No | Days threshold for staleness (default: 90) |
| min_words | Number | No | Minimum word count for stub detection (default: 50) |

Returns: JSON with total_documents_checked, total_issues, and per-collection arrays for each check: missing_created, missing_updated, no_tags, stale, stubs, orphans, broken_links.
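Two of the checks — staleness and stub detection — are simple enough to sketch (illustrative Python, not kb-mcp's implementation; thresholds match the documented defaults):

```python
import datetime

def health_sketch(docs, today, stale_days=90, min_words=50):
    """Flag documents that are stale or too short to be useful."""
    issues = {"stale": [], "stubs": []}
    for doc in docs:
        updated = datetime.date.fromisoformat(doc["updated"])
        if (today - updated).days > stale_days:
            issues["stale"].append(doc["path"])
        if len(doc["body"].split()) < min_words:
            issues["stubs"].append(doc["path"])
    return issues
```

The orphan and broken-link checks additionally require building the vault's wiki-link graph, which is why they live in the tool rather than a one-liner.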

reindex

Rebuild the search index from all collections on disk. Use after editing documents mid-session. Note: search now auto-detects new files via directory mtime checks, so reindex is mainly needed after in-place content edits.

Parameters: None

Returns: Summary message with document and section counts.

RON Schema

The full collections.ron schema with all fields:

(
    // Cache directory for index files
    // Default: ~/.cache/kb-mcp
    // Supports ~ expansion
    cache_dir: "~/.cache/kb-mcp",

    collections: [
        (
            // Unique name — used in search filters and kb_write target
            name: "vault",

            // Path to markdown directory
            // Relative to this config file's location
            path: "ai-vault",

            // Description shown in list_sections output
            description: "Primary knowledge vault",

            // Allow kb_write to create files here
            // Default: false
            writable: false,

            // Section definitions — map directory prefixes to descriptions
            // Documents in subdirectories matching a prefix get that section's description
            // Sections without definitions still appear, just without descriptions
            sections: [
                (prefix: "concepts", description: "Cross-cutting concepts"),
                (prefix: "guides", description: "How-to guides"),
            ],
        ),
    ],
)

Rust Types

The RON file deserializes into these Rust types:

struct Config {
    cache_dir: Option<String>,
    collections: Vec<Collection>,
}

struct Collection {
    name: String,
    path: String,
    description: String,
    writable: bool,         // default: false
    sections: Vec<Section>, // default: []
}

struct Section {
    prefix: String,
    description: String,
}

Landscape

A survey of related projects in the AI agent memory and knowledge management space. Understanding what exists helps identify where kb-mcp fits, what patterns to adopt, and where to differentiate.

Quick Comparison

| Project | Type | Language | Search | MCP | Write | Local/Cloud |
|---|---|---|---|---|---|---|
| kb-mcp | Knowledge base server | Rust | BM25 + vector hybrid | stdio | Yes | Local |
| hipocampus | Agent memory harness | JavaScript | BM25 + vector (qmd) | No | Yes | Local |
| obsidian-web-mcp | Remote vault access | Python | ripgrep full-text | HTTP | Yes | Remote |
| mengram | Cloud memory service | Python + JS | Semantic (cloud) | Yes | Yes | Cloud |
| hmem | Hierarchical memory | TypeScript | Tree traversal | stdio | Yes | Local |
| mnemex | Semantic code search | TypeScript | BM25 + vector (LanceDB) | stdio | No | Local |
| cq | Agent knowledge commons | Python + TS | FTS5 + domain tags | stdio | Yes | Local + Team API |
| prism-mcp | Agent session memory | TypeScript | FTS5 + sqlite-vec + TurboQuant | stdio | Yes | Local or Cloud |

Source Code Metrics

Measured with tokei. Source-only — excludes eval/benchmark data, web frontends, generated files, and node_modules.

| Project | Language | Files | Code Lines | Comments | Total Lines |
|---|---|---|---|---|---|
| kb-mcp | Rust | 17 | 1,696 | 36 | 2,001 |
| hipocampus | JavaScript | 3 | 730 | 96 | 954 |
| obsidian-web-mcp | Python | 18 | 1,469 | 43 | 1,895 |
| mengram | Python (core) | 48 | 22,097 | 1,007 | 25,907 |
| hmem | TypeScript | 10 | 6,569 | 577 | 7,617 |
| mnemex | TypeScript + TSX (src/) | 388 | 86,919 | 20,113 | 120,031 |
| cq | Python + TypeScript | 81 | 10,129 | 1,667 | 14,378 |
| prism-mcp | TypeScript + SQL | 121 | 28,367 | 8,587 | 40,977 |

Notes: mengram includes cloud backend, SDKs, and integrations in its Python source. mnemex src/ includes the core engine, CLI, and MCP server — the full repo (329K lines) also contains eval benchmarks (91K), AI docs (15K), landing page, and VS Code extension.

Documentation Metrics

Markdown files as a proxy for documentation investment.

| Project | .md Files | Lines | Doc-to-Code Ratio |
|---|---|---|---|
| kb-mcp | 52 | 4,817 | 2.8x |
| hipocampus | 16 | 1,657 | 2.3x |
| obsidian-web-mcp | 1 | 198 | 0.1x |
| mengram | 17 | 1,871 | 0.1x |
| hmem | 11 | 2,710 | 0.4x |
| mnemex | 74 | 23,808 | 0.3x |
| cq | 12 | 2,280 | 0.2x |
| prism-mcp | 9 | 1,466 | 0.05x |

Observations:

  • kb-mcp has the highest doc-to-code ratio (2.8x) — more lines of documentation than source code. This reflects the project’s dual role as both a tool and a documented reference implementation.
  • hipocampus also documents heavily (2.3x) — its memory architecture and protocol are extensively explained in markdown.
  • obsidian-web-mcp has minimal docs (single README) despite a substantial codebase — the code is the documentation.
  • mnemex has extensive docs (74 files, 24K lines) but the ratio is low because the codebase is so large.

Where kb-mcp Fits

kb-mcp occupies a specific niche: local, Rust-native, zero-infrastructure knowledge base server for curated markdown. It’s not trying to be agent session memory (hipocampus, hmem), a cloud service (mengram), or a code search tool (mnemex).

The closest overlap is with obsidian-web-mcp (both serve markdown vaults via MCP), but they differ on transport (local stdio vs remote HTTP) and search quality (BM25-ranked vs ripgrep grep).

The memory-focused projects (hipocampus, mengram, hmem, prism-mcp) are complementary rather than competitive — they manage agent session memory, while kb-mcp serves reference knowledge. An agent could use hmem for working memory and kb-mcp for its knowledge base.

cq represents a third category: collective agent learning. Where kb-mcp serves curated human-authored knowledge and memory projects persist agent state, cq captures wisdom that emerges from agent sessions and shares it across agents. All three categories coexist naturally.

hipocampus

Source: github.com/kevin-hs-sohn/hipocampus | Language: JavaScript | Status: Active (26 stars)

Drop-in memory harness for AI agents with a 3-tier memory architecture and 5-level compaction tree.

What It Does

hipocampus manages agent session memory over time. Hot memory (~500 lines) is always loaded. Warm memory (daily logs, knowledge base, plans) is read on demand. Cold memory is searched via qmd hybrid search.

The key innovation is the 5-level compaction tree: raw daily logs get compressed into daily → weekly → monthly → root summaries via LLM-driven summarization. A ROOT.md topic index (~100 lines) gives agents O(1) awareness of what they know.

Key Features

  • 3-tier memory: Hot (always loaded), Warm (on-demand), Cold (search)
  • 5-level compaction tree with LLM-driven summarization
  • ROOT.md topic index for constant-time knowledge awareness
  • Hybrid search via qmd (BM25 + vector)
  • Claude Code plugin marketplace integration
  • Pre-compaction hooks for automatic memory preservation
  • File-based, no database

Comparison to kb-mcp

| Aspect | hipocampus | kb-mcp |
|---|---|---|
| Primary use | Agent session memory | Curated knowledge base |
| Data model | Daily logs → compacted summaries | Markdown collections indexed for search |
| Search | qmd (BM25 + vector) | memvid-core (BM25 + optional vector) |
| Write pattern | Continuous (daily logs, auto-compaction) | On-demand (kb_write, manual curation) |
| MCP support | No (skill-based) | Yes (stdio transport) |

Relationship: Complementary. hipocampus handles what the agent remembers from sessions; kb-mcp serves what the agent looks up in reference material.

Patterns Worth Adopting

  • Compaction tree — the 5-level summarization pattern is relevant for kb-mcp’s future Knowledge Keeper agent
  • ROOT.md topic index — a constant-cost “what do I know?” summary could complement list_sections

obsidian-web-mcp

Source: github.com/jimprosser/obsidian-web-mcp | Language: Python | Status: Active (71 stars)

Secure remote MCP server for Obsidian vaults with OAuth 2.0 auth, Cloudflare Tunnel, and atomic writes safe for Obsidian Sync.

What It Does

obsidian-web-mcp makes your Obsidian vault accessible from anywhere — Claude web, mobile, and desktop — via an HTTP MCP endpoint proxied through Cloudflare Tunnel with OAuth 2.0 PKCE authentication.

Key Features

  • 9 MCP tools: read, batch read, write, frontmatter update, search, frontmatter search, list, move, soft-delete
  • Remote access via Cloudflare Tunnel + OAuth 2.0 PKCE
  • Atomic writes (write-to-temp-then-rename) safe for Obsidian Sync
  • In-memory frontmatter index with filesystem watcher for auto-updates
  • ripgrep for full-text search (falls back to Python if unavailable)
  • Path traversal protection, safety limits (1MB/file, 20 files/batch)
  • launchd plist for macOS always-on deployment

Comparison to kb-mcp

| Aspect | obsidian-web-mcp | kb-mcp |
|---|---|---|
| Transport | HTTP (remote via Cloudflare) | stdio (local) |
| Search | ripgrep grep (no ranking) | BM25 ranked + optional vector |
| Vault operations | Rich (batch read, move, delete, frontmatter) | Focused (search, read, write, context) |
| Auth | OAuth 2.0 PKCE | None (local only) |
| Obsidian-specific | Yes (Sync-safe, .trash, frontmatter index) | No (any markdown) |
| Token efficiency | No equivalent | kb_context (frontmatter + summary) |

Relationship: Different problem domains. obsidian-web-mcp solves remote vault access; kb-mcp solves effective knowledge search. Both serve Obsidian vaults but from opposite directions.

Patterns Worth Adopting

  • Filesystem watcher — auto-reindex when files change (instead of manual reindex calls)
  • Frontmatter index — in-memory YAML index for structured queries beyond full-text search
  • HTTP transport — relevant for kb-mcp’s Phase 4 HTTP daemon mode

mengram

Source: github.com/alibaizhanov/mengram | Language: Python + JS SDKs | Status: Active (112 stars)

Human-like memory for AI agents with semantic, episodic, and procedural memory types — procedures evolve from failures.

What It Does

mengram is a cloud-hosted memory service that gives AI agents persistent, personalized memory across sessions. Its key differentiator is procedural memory — workflows that automatically evolve when they fail, creating an improvement loop.

Key Features

  • 3 memory types: Semantic (facts), Episodic (events), Procedural (evolving workflows)
  • Cognitive Profile — persistent user profile loaded at session start
  • Claude Code hooks: auto-save after responses, auto-recall on prompts
  • File upload (PDF, DOCX, TXT, MD) with vision AI extraction
  • Knowledge graph
  • Multi-user isolation
  • Import from ChatGPT / Obsidian
  • Python + JavaScript SDKs, REST API
  • LangChain, CrewAI, MCP integrations
  • Free tier available

Comparison to kb-mcp

| Aspect | mengram | kb-mcp |
|---|---|---|
| Hosting | Cloud (mengram.io) | Local (your machine) |
| Data control | Third-party cloud | On-disk, fully private |
| Memory model | 3-tier cognitive (semantic/episodic/procedural) | Document collections with sections |
| Search | Semantic (cloud API) | BM25 + optional local vector |
| Auto-capture | Yes (Claude Code hooks) | No (manual or agent-driven) |
| Dependencies | API key + network | Zero (Rust binary) |

Relationship: Different trust and deployment models. mengram is convenient (auto-save, cloud sync, SDKs) but sends your data to a third party. kb-mcp keeps everything local and private.

Patterns Worth Adopting

  • Procedural memory — workflows that evolve from failure analysis. Relevant to the vault’s Knowledge Keeper pattern.
  • Cognitive Profile — a structured “who is the user” document. Claude Code’s memory system does something similar.
  • Auto-save hooks — capturing knowledge without manual intervention. The researcher agent’s heartbeat scheduling aims at this.

hmem

Source: github.com/Bumblebiber/hmem | Language: TypeScript | Status: Active (9 stars)

MCP server with 5-level lazy-loaded SQLite memory modeled after human memory hierarchy — agents load only the detail level they need.

What It Does

hmem stores agent memories in a hierarchical tree with 5 levels of detail. Level 1 is a coarse summary (always loaded on agent spawn). Levels 2-5 provide progressively more detail, fetched on demand. This saves tokens by giving agents awareness without loading everything.
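A minimal sketch of the lazy-loading idea, with hypothetical types (not hmem's actual API):

```rust
/// Hypothetical sketch of 5-level lazy loading (not hmem's actual API):
/// level 1 is always loaded on spawn; deeper levels are fetched on demand.
struct Memory {
    levels: [&'static str; 5], // level 1 (coarse summary) .. level 5 (verbatim)
}

impl Memory {
    /// Return all detail up to `depth` (1-based, clamped to 1..=5).
    fn load(&self, depth: usize) -> Vec<&'static str> {
        self.levels[..depth.clamp(1, 5)].to_vec()
    }
}

fn main() {
    let m = Memory {
        levels: ["summary", "outline", "key points", "details", "verbatim"],
    };
    assert_eq!(m.load(1), vec!["summary"]); // spawn-time default
    assert_eq!(m.load(3).len(), 3);         // agent drills down on demand
}
```

The token saving comes from the default: an agent pays for level 1 only, and spends more context only on the subtrees it actually needs.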

Key Features

  • 5-level hierarchical memory (coarse → verbatim)
  • Tree structure with compound IDs (e.g., L0003.2.1)
  • Markers: favorite, pinned, obsolete, irrelevant, active, secret
  • Obsolete entries hidden from bulk reads but remain searchable
  • Session cache with Fibonacci decay (suppresses already-seen entries)
  • Access-count promotion (most-accessed entries auto-expand)
  • Import/export as Markdown or SQLite
  • Per-agent memory files (.hmem)
  • Curator agent concept for periodic maintenance
  • MCP over stdio (Claude Code, Gemini CLI, Cursor, Windsurf, OpenCode)
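hmem's exact decay formula isn't documented here; one plausible reading of "Fibonacci decay" — repeat sightings of the same entry are down-weighted along the Fibonacci sequence — can be sketched as:

```rust
use std::collections::HashMap;

/// Illustrative session cache — NOT hmem's documented algorithm.
/// Each repeat sighting of an entry divides its score by the next
/// Fibonacci number (1, 2, 3, 5, ...), so already-seen entries fade
/// quickly and fresh content surfaces instead.
struct SessionCache {
    seen: HashMap<String, u32>, // entry id -> times surfaced this session
}

/// fib(1) = 1, fib(2) = 2, fib(3) = 3, fib(4) = 5, ...
fn fib(n: u32) -> u64 {
    let (mut a, mut b) = (1u64, 1u64);
    for _ in 0..n {
        let c = a + b;
        a = b;
        b = c;
    }
    a
}

impl SessionCache {
    fn new() -> Self {
        Self { seen: HashMap::new() }
    }

    /// Dampen a raw relevance score based on how often the entry
    /// has already been shown this session.
    fn adjust(&mut self, id: &str, score: f64) -> f64 {
        let n = self.seen.entry(id.to_string()).or_insert(0);
        *n += 1;
        score / fib(*n) as f64
    }
}

fn main() {
    let mut cache = SessionCache::new();
    assert_eq!(cache.adjust("doc-1", 10.0), 10.0); // first sighting: full score
    assert_eq!(cache.adjust("doc-1", 10.0), 5.0);  // second: halved
    assert!(cache.adjust("doc-1", 10.0) < 3.4);    // third: /3, still shrinking
}
```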

Comparison to kb-mcp

| Aspect | hmem | kb-mcp |
|---|---|---|
| Data model | Hierarchical tree in SQLite | Flat markdown collections |
| Search | Tree traversal by ID (no ranking) | BM25 ranked + optional vector |
| Token efficiency | 5 detail levels, load only what's needed | kb_context (frontmatter + summary) |
| Storage | SQLite per agent | memvid-core .mv2 per collection |
| Write pattern | write/update/append memories | kb_write creates markdown files |
| Maintenance | Curator agent, Fibonacci decay, access promotion | Manual or researcher agent |

Relationship: Complementary. hmem excels at structured agent working memory (what am I doing, what did I decide). kb-mcp excels at reference knowledge search (what does the documentation say about X).

Patterns Worth Adopting

  • Lazy-loaded detail levels — the 5-level hierarchy is a powerful token-saving pattern. kb_context is a 2-level version of this (summary vs full document).
  • Obsolete-but-searchable — marking entries as outdated without deleting them. Useful for vault knowledge that may be superseded.
  • Access-count promotion — frequently accessed documents could be surfaced more prominently in search results.
  • Fibonacci decay — suppressing recently-seen results in repeated queries to surface new content.

mnemex

Source: github.com/MadAppGang/mnemex | Language: TypeScript | Status: Active (35 stars)

Local semantic code search for Claude Code — tree-sitter parsing, embedding-based vector search + BM25 hybrid, stored in LanceDB.

What It Does

mnemex indexes codebases using tree-sitter to understand code structure (functions, classes, methods), embeds chunks via configurable providers (OpenRouter, Ollama, custom), and serves hybrid BM25 + vector search over MCP. It’s code search, not document search.

Key Features

  • Hybrid search: BM25 + vector similarity via LanceDB
  • Tree-sitter code parsing (structure-aware, not naive line splits)
  • Embedding flexibility: OpenRouter (cloud), Ollama (local), custom
  • Embedding model benchmarking tool with NDCG scores
  • Auto-reindex on search (detects modified files)
  • Symbol graph with PageRank for importance ranking
  • mnemex pack — export codebase to a single AI-friendly file
  • 4 MCP tools: search_code, index_codebase, get_status, clear_index
  • Claude Code plugin, OpenCode plugin, VS Code autocomplete

Comparison to kb-mcp

| Aspect | mnemex | kb-mcp |
|---|---|---|
| Domain | Code search | Document/knowledge search |
| Parsing | Tree-sitter (code-aware chunks) | Markdown (frontmatter + heading structure) |
| Hybrid search | BM25 + vector (LanceDB) | BM25 + vector (memvid-core) |
| Embeddings | OpenRouter / Ollama / custom | Local ONNX (BGE-small-en-v1.5) |
| Write-back | No (read-only) | Yes (kb_write) |
| Auto-reindex | Yes (on search) | No (manual reindex or startup sync) |
| Unique feature | Symbol graph + PageRank | Token-efficient kb_context |

Relationship: Different domains (code vs knowledge). Both use hybrid BM25 + vector search but with different backends and parsing strategies.

Patterns Worth Adopting

  • Auto-reindex on search — detect file changes at query time instead of requiring explicit reindex calls. Low overhead for small vaults.
  • Embedding model benchmarking — a tool to evaluate search quality with different models and parameters.
  • Pack/export — exporting the full knowledge base as a single context-friendly file for LLM ingestion.

Ori-Mnemos

Source: github.com/aayoawoyemi/Ori-Mnemos | Language: TypeScript | Status: Active (62 stars)

Persistent cognitive memory system for AI agents — knowledge graph with learning layers, identity resources, and adaptive retrieval.

What It Does

Ori-Mnemos treats agent memory as a learning problem, not a lookup problem. It builds a knowledge graph from markdown files (wiki-links + learned co-occurrence edges), runs 4-signal retrieval (semantic + BM25 + PageRank + warmth), and continuously improves via three learning layers that reshape the graph with every interaction.

Agents get persistent identity (goals, methodology, reminders) and a memory system that decays, reinforces, and prunes like biological memory. All local — markdown + SQLite, no cloud dependencies.

Key Features

  • 4-signal RRF fusion: semantic embeddings, BM25, Personalized PageRank, associative warmth
  • 3 learning layers: Q-value reranking, co-occurrence edge learning (Hebbian/NPMI), stage meta-learning (LinUCB)
  • Knowledge graph: wiki-links + learned co-occurrence edges with homeostasis normalization
  • 3 memory zones: identity (slow decay), knowledge (1x), operations (fast decay)
  • Agent identity resources: personality, goals, methodology, daily context, reminders
  • 16 MCP tools + 5 identity resources + 16 CLI commands
  • Local embeddings: all-MiniLM-L6-v2 via Hugging Face transformers
  • Storage: markdown files + SQLite (indexes and learning state)
  • 579+ tests (vitest)

Notable Tools

| Tool | Purpose |
|---|---|
| ori_orient | Daily briefing — status, reminders, goals, vault health |
| ori_query_ranked | Full retrieval with Q-value reranking + stage meta-learning |
| ori_explore | Recursive graph exploration with sub-question decomposition |
| ori_warmth | Associative field showing resonant notes in context |
| ori_promote | Graduate inbox notes to typed notes with classification |
| ori_query_fading | Low-vitality candidates for archival |

Comparison to kb-mcp

| Aspect | Ori-Mnemos | kb-mcp |
|---|---|---|
| Primary use | Agentic memory with learning | Curated knowledge base retrieval |
| Language | TypeScript | Rust |
| Storage | Markdown + SQLite | Markdown + memvid-core .mv2 |
| Search | 4-signal RRF (semantic + BM25 + PageRank + warmth) | BM25 + optional vector (memvid-core) |
| Learning | 3 layers (Q-value, co-occurrence, stage meta) | None — static index |
| Graph | Wiki-links + learned edges + PageRank | Section hierarchy only |
| Identity | Goals, methodology, reminders, daily context | Not supported |
| Decay/vitality | 3 memory zones with configurable decay rates | None |
| Tools | 16 MCP + 5 resources | 10 MCP tools |
| CLI parity | Yes (dual-mode) | Yes (dual-mode) |
| Auto-reindex | Incremental embedding updates | Directory mtime detection |
| Recall@5 | 90% (HotpotQA multi-hop) | Baseline BM25 |
| Latency | ~120ms (full intelligence) | Sub-100ms (BM25) |

Relationship: Ori is a superset in ambition — it does everything kb-mcp does (markdown indexing, search, MCP tools) plus graph-based learning, identity management, and adaptive retrieval. kb-mcp is simpler, faster, and Rust-native. They solve overlapping but different problems: kb-mcp is a library you search; Ori is a brain that learns.

Patterns Worth Adopting

  • Q-value learning on retrieval — tracking which search results agents actually use (forward citations, re-recalls, dead-ends) to improve future ranking. Could inform a future kb-mcp relevance layer.
  • Co-occurrence edges — notes retrieved together form stronger associations. Lightweight to implement on top of existing search.
  • Identity resources — ori://identity, ori://goals, etc. give agents persistent context. kb-mcp’s kb_context serves a similar purpose but without the identity layer.
  • Memory zones with decay — different decay rates for identity vs operational knowledge. Relevant for kb-mcp’s future Knowledge Keeper.
  • Stage meta-learning — learning to skip expensive retrieval stages when they don’t contribute value. Relevant if kb-mcp adds more retrieval signals beyond BM25.
  • Vault health diagnostics — ori_health checks index freshness, orphan notes, dangling links. Could complement kb-mcp’s kb_digest.

cq

Source: github.com/mozilla-ai/cq | Language: Python + TypeScript | Status: v0.4.0, Active (Mozilla AI)

Shared knowledge commons for AI agents — collective learning so agents stop independently rediscovering the same failures. Built by Mozilla AI.

What It Does

cq captures “knowledge units” (KUs) that emerge from agent sessions and makes them queryable by other agents. Agents propose insights, confirm what works, flag what’s wrong, and reflect at session end. Knowledge graduates from local (private) to team (org-shared, human-reviewed) to global (public commons — not yet implemented).

The core thesis: agents worldwide burn tokens rediscovering the same failures. A shared commons eliminates redundant learning.

Key Features

  • 6 MCP tools: query, propose, confirm, flag, reflect, status
  • SQLite + FTS5 local store with domain tag Jaccard similarity scoring
  • Confidence scoring: confirmations boost (+0.1), flags penalize (-0.15)
  • Tiered graduation: local → team (human-in-the-loop review) → global
  • Team API (FastAPI) with React review dashboard
  • Post-error hook auto-queries commons before agent retries
  • Session-end reflect mines conversations for shareable insights
  • Claude Code plugin (SKILL.md behavioral protocol + hooks.json)
  • 69KB proposal document covering trust layers, DID identity, ZK proofs
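The confidence arithmetic can be sketched directly from the numbers above (the 0.5 starting value and [0, 1] clamping are assumptions, not cq's documented behavior):

```rust
/// Sketch of cq-style confidence scoring. The +0.10 / -0.15 deltas come
/// from the feature list above; the neutral prior and clamping bounds
/// are assumptions.
struct KnowledgeUnit {
    confidence: f64,
}

impl KnowledgeUnit {
    fn new() -> Self {
        Self { confidence: 0.5 } // assumed neutral prior
    }
    fn confirm(&mut self) {
        self.confidence = (self.confidence + 0.10).min(1.0);
    }
    fn flag(&mut self) {
        self.confidence = (self.confidence - 0.15).max(0.0);
    }
}

fn main() {
    let mut ku = KnowledgeUnit::new();
    ku.confirm(); // a peer agent confirms the insight works
    ku.confirm();
    ku.flag();    // another agent flags it as wrong
    // 0.5 + 0.1 + 0.1 - 0.15 = 0.55
    assert!((ku.confidence - 0.55).abs() < 1e-9);
}
```

The asymmetric deltas mean one flag roughly cancels one and a half confirmations — skepticism is weighted more heavily than agreement.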

Comparison to kb-mcp

| Aspect | cq | kb-mcp |
|---|---|---|
| Domain | Agent collective knowledge | Curated document search |
| Content source | Agent-generated (propose/confirm/flag) | Human-authored markdown |
| Search | FTS5 + domain tag Jaccard | BM25 (Tantivy) |
| Write model | Propose → confirm/flag loop | kb_write to writable collections |
| Storage | SQLite (local) + team API (cloud) | Tantivy index + disk reads |
| Unique feature | Confidence scoring via peer confirmation | Token-efficient kb_context |
| Tool count | 6 | 10 |

Relationship: Complementary. kb-mcp serves curated reference knowledge (“what does our API spec say?”); cq serves collective agent wisdom (“what gotchas have agents hit with this API?”). They’d coexist naturally.

Patterns Worth Adopting

  • Confirmation/flagging feedback — lightweight signals for “this document helped” or “this seems stale” could inform search ranking or kb_health.
  • Session reflection mining — a “reflect and write” workflow that mines a session for knowledge to capture via kb_write.
  • Post-error auto-lookup — hook-based auto-search when agents encounter unfamiliar territory.
  • Domain tag scoring — combining tag-based Jaccard similarity with text-based BM25 could improve kb_query relevance.

Source Metrics

| Component | Language | Files | Code Lines | Comments | Total Lines |
|---|---|---|---|---|---|
| MCP server + team API | Python | 27 | 5,453 | 98 | 6,564 |
| Dashboard | TypeScript + TSX | 21 | 1,574 | 11 | 1,735 |
| Docs | Markdown | 12 | 1,529 | — | 2,280 |
| Total | | 81 | 10,129 | 1,667 | 14,378 |

Doc-to-code ratio: 0.2x (lean docs relative to codebase, though the 69KB proposal document is the real design investment).

prism-mcp

Source: github.com/dcostenco/prism-mcp | Language: TypeScript | Status: v5.1.0, Very Early (solo author)

“Mind Palace” for AI agents — persistent session memory with behavioral learning, time travel, multi-agent sync, and a visual dashboard.

What It Does

prism-mcp gives agents persistent memory across conversations through three layers: an append-only session ledger (what happened), mutable handoff state with optimistic concurrency control (what’s current), and behavioral memory that learns from corrections (what to avoid). High-importance lessons can auto-graduate into .cursorrules / .clauderules.

Key Features

  • 30+ MCP tools across session, memory, search, and dashboard domains
  • Three-layer memory: session ledger, handoff state (OCC versioned), behavioral
  • Three-tier search: FTS5, sqlite-vec vectors, TurboQuant JS fallback
  • TurboQuant: pure-TS embedding compression (ICLR 2026) — 768-dim from 3,072 bytes to ~400 bytes (7x), >90% top-1 retrieval accuracy
  • Time travel via versioned handoff snapshots (memory_checkout)
  • Multi-agent hivemind with role isolation (dev/qa/pm)
  • Behavioral learning: corrections accumulate importance, auto-surface
  • Progressive context loading: quick/standard/deep tiers
  • Web dashboard at localhost:3000 (knowledge graph, timeline, health)
  • Morning briefings after 4+ hours of inactivity
  • SQLite (local) or Supabase (cloud) backends
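TurboQuant's actual algorithm isn't reproduced here, but the size arithmetic checks out for sub-byte scalar quantization: 768 dims × 4 bits = 384 bytes plus a small scale, versus 3,072 bytes of f32 — close to the ~400 bytes quoted above. A generic 4-bit sketch:

```rust
/// Generic 4-bit scalar quantization — NOT TurboQuant's actual algorithm,
/// just an illustration of how a 768-dim f32 vector (3,072 bytes) can
/// pack into 384 bytes plus a per-vector scale.
fn quantize_4bit(v: &[f32]) -> (Vec<u8>, f32) {
    let max = v.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    let scale = if max == 0.0 { 1.0 } else { max / 7.0 }; // map values onto [-7, 7]
    let q = |x: f32| ((x / scale).round().clamp(-7.0, 7.0) as i8 + 7) as u8; // 0..=14 nibble
    let mut packed = Vec::with_capacity((v.len() + 1) / 2);
    for pair in v.chunks(2) {
        let lo = q(pair[0]);
        let hi = if pair.len() == 2 { q(pair[1]) } else { 0 };
        packed.push(lo | (hi << 4)); // two 4-bit values per byte
    }
    (packed, scale)
}

fn main() {
    let embedding = vec![0.5f32; 768]; // stand-in for a real 768-dim embedding
    let (packed, _scale) = quantize_4bit(&embedding);
    assert_eq!(packed.len(), 384); // 8x smaller than 3,072 bytes of f32
}
```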

Comparison to kb-mcp

| Aspect | prism-mcp | kb-mcp |
|---|---|---|
| Domain | Agent session memory | Curated document search |
| Content source | Agent-generated session logs | Human-authored markdown |
| Search | FTS5 + sqlite-vec + TurboQuant | BM25 (Tantivy) |
| Write model | Append ledger + upsert handoff | kb_write to writable collections |
| Storage | SQLite or Supabase | Tantivy index + disk reads |
| Unique feature | Behavioral learning + time travel | Token-efficient kb_context |
| Tool count | 30+ | 10 |

Relationship: Different domains entirely. kb-mcp retrieves curated knowledge; prism-mcp persists agent session state. Complementary — an agent would use both simultaneously.

Patterns Worth Adopting

  • Progressive context loading — formalized quick/standard/deep tiers for kb_context could help agents pick the right depth.
  • Optimistic concurrency control — relevant if kb-mcp ever supports concurrent writers to the same collection.
  • Health check with auto-repair — extending kb_health to suggest or apply fixes, not just diagnose.

Source Metrics

| Component | Language | Files | Code Lines | Comments | Total Lines |
|---|---|---|---|---|---|
| Core server | TypeScript | 69 | 17,012 | 6,414 | 26,141 |
| Migrations | SQL | 14 | 2,227 | 670 | 3,207 |
| Tests | Python | 15 | 2,150 | 268 | 2,754 |
| Docs | Markdown | 9 | 1,009 | — | 1,466 |
| Total | | 121 | 28,367 | 8,587 | 40,977 |

Doc-to-code ratio: 0.05x. The codebase is large relative to documentation. Notable: single-author v1→v5 in 3 days with 96KB handler files suggests rapid feature accretion. The 30+ tool count is unusually high for an MCP server and may cause prompt bloat.

Agentic Design Patterns

Source: github.com/Mathews-Tom/Agentic-Design-Patterns | Type: Reference book (21 chapters) | Status: Active

Comprehensive open-source book covering agentic AI patterns — memory management, learning, MCP, RAG, multi-agent collaboration, tool use, planning, guardrails, and more. Grounded in Google ADK, LangChain, and LangGraph with hands-on code examples.

This page maps the book’s patterns against kb-mcp to identify what we’re doing well, where gaps exist, and what belongs in future work.

Relevant Chapters

| Chapter | Topic | Relevance to kb-mcp |
|---|---|---|
| Ch 5 | Tool Use (Function Calling) | Direct — kb-mcp’s 10 MCP tools follow this pattern |
| Ch 8 | Memory Management | Core — defines short-term vs long-term memory architecture |
| Ch 9 | Learning and Adaptation | Informs future agent memory project |
| Ch 10 | Model Context Protocol (MCP) | Direct — validates kb-mcp’s MCP implementation |
| Ch 14 | Knowledge Retrieval (RAG) | Direct — kb-mcp implements the RAG pattern |

What kb-mcp Gets Right

MCP Implementation (Ch 10)

The book describes MCP as a “universal adapter” with client-server architecture, tool discovery, and standardized communication. kb-mcp follows this exactly — stdio transport, #[rmcp::tool] with JsonSchema params, structured JSON output.

The book warns about wrapping legacy APIs without making them “agent-friendly” — returning formats agents can’t parse (like PDFs instead of markdown). kb-mcp avoids this: all output is structured JSON, tool descriptions guide agent behavior, and kb_context provides token-efficient previews before full retrieval.

RAG Pattern (Ch 14)

kb-mcp implements the core RAG pipeline the book describes:

  1. Chunking — smart markdown chunking via memvid-core
  2. Embeddings — optional BGE-small-en-v1.5 via hybrid feature
  3. Vector storage — persistent .mv2 files
  4. Retrieval — BM25 + semantic search with RRF fusion
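The RRF fusion in step 4 is simple to state: each ranked list contributes 1 / (k + rank) per document and the sums are merged. A self-contained sketch (k = 60 is the conventional constant; kb-mcp's internals may differ in detail):

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank)
/// per document, and the contributions are summed across lists.
fn rrf(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in rankings {
        for (rank, doc) in list.iter().enumerate() {
            // rank is 0-based here, so the top hit contributes 1/(k + 1)
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut fused: Vec<_> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    let bm25 = vec!["shared-memory.md", "queues.md", "agents.md"];
    let vector = vec!["agents.md", "shared-memory.md", "state.md"];
    let fused = rrf(&[bm25, vector], 60.0);
    // Ranked well by both lists, shared-memory.md wins the fusion.
    assert_eq!(fused[0].0, "shared-memory.md");
}
```

Because only ranks matter, RRF never has to reconcile BM25 scores with cosine similarities — which is why keyword precision survives the fusion.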

The book’s “Agentic RAG” pattern — where an agent reasons about retrieval quality — maps to how agents use kb-mcp’s progressive disclosure chain: kb_digest (vault overview) then search (find candidates) then kb_context (preview metadata) then get_document (full content). Each step lets the agent decide whether to go deeper.

Tool Use Pattern (Ch 5)

The book’s tool use lifecycle matches kb-mcp exactly:

  1. Tool definitions with descriptions and typed parameters
  2. LLM decides which tool to call based on the task
  3. Structured output (JSON) with the tool result
  4. LLM processes the result and decides next steps

kb-mcp’s “primitives over workflows” approach is validated — tools are composable building blocks, not opinionated workflows. An agent can combine search + kb_query + get_document in whatever order serves the task.

Where Gaps Exist

No Short-Term Memory / Session State (Ch 8)

The book’s primary memory pattern is the dual memory system:

  • Short-term — session context, recent interactions, task state
  • Long-term — persistent knowledge store, searchable repository

kb-mcp provides the long-term side (search, retrieval, export) but has zero session awareness. It doesn’t know what the agent searched for previously, which documents were already retrieved, or what the agent’s current goal is. Every tool call is stateless.

The book’s ADK framework solves this with Session (chat thread), State (temporary key-value data with scoped prefixes), and MemoryService (long-term searchable store). LangGraph uses InMemoryStore with namespaced keys.

Assessment: This is not a gap in kb-mcp — it’s a gap in the system architecture. Session state belongs in the agent framework (ADK, LangGraph, Claude Code), not the knowledge base. kb-mcp is the long-term memory store; the framework provides short-term context.

No Learning Loop (Ch 9)

The book describes agents that improve through:

  • Reinforcement learning — rewards for good outcomes
  • Memory-based learning — recalling past experiences
  • Self-modification — agents editing their own behavior

kb-mcp is completely static — search results don’t improve based on which documents agents actually use. The Q-value pattern from Ori-Mnemos maps to Chapter 9’s “Memory-Based Learning” category.

Assessment: Learning belongs in a future agent memory project, not kb-mcp. See the Ori-Mnemos analysis for the pattern evaluation.

No Memory Type Distinction (Ch 8)

The book identifies three types of long-term memory:

  • Semantic memory — facts and concepts (domain knowledge)
  • Episodic memory — past experiences (successful task patterns)
  • Procedural memory — rules and behaviors (system prompts)

kb-mcp treats all vault content as undifferentiated documents. The section-based organization (concepts/, patterns/, drafts/) is a weak form of semantic categorization, but there’s no support for episodic memory (session transcripts, successful patterns) or procedural memory (agent instructions that evolve).

Assessment: kb-mcp’s vault could map collections to memory types (e.g., a sessions/ collection for episodic, prompts/ for procedural), but the tool doesn’t enforce or leverage the distinction. This is an interesting pattern for the future agent memory project.

No Graph-Based Retrieval (Ch 14 — GraphRAG)

The book describes GraphRAG as superior for “complex questions that require synthesizing data from multiple sources.” kb-mcp has wiki-link parsing in kb_health but doesn’t use links for search ranking. Ori-Mnemos implements this with Personalized PageRank over wiki-link + co-occurrence edges.

Assessment: If kb-mcp ever needs better multi-hop retrieval, the wiki-link graph from kb_health could be reused as a search signal. Low priority — BM25 + vector is sufficient for most knowledge base queries.

What kb-mcp Should NOT Adopt

| Pattern | Reason |
|---|---|
| Session/State management | Agent framework’s job (ADK, LangGraph), not the knowledge base |
| Self-modification (SICA, Ch 9) | Far beyond scope — kb-mcp is a retrieval tool |
| Cloud memory services (Vertex, Ch 8) | kb-mcp is local-first by design |
| Complex learning pipelines (Ch 9) | Belongs in a separate project per the Ori-Mnemos brainstorm |

Key Insight: kb-mcp as Knowledge Retrieval, Not Memory

The book’s memory management chapter (Ch 8) defines a dual architecture:

Agent Framework (ADK / LangGraph / Claude Code)
├── Short-term: Session context, state, recent history
└── Long-term: Persistent knowledge store ← kb-mcp serves this role

kb-mcp is a knowledge base server that agents use for long-term knowledge retrieval — domain knowledge, reference material, documented solutions. It is not an agent memory system (as stated in GOALS.md: “Not a replacement for agent memory”). The distinction matters: agent memory includes session state, learned preferences, and identity — things that belong in the agent framework or a dedicated memory system.

This validates both kb-mcp’s focused scope and the conclusion from the Ori-Mnemos brainstorm: agent memory (session state, learning, identity) belongs in a separate project that could use kb-mcp as its knowledge retrieval layer.

One Pattern Worth Exploring

Chapter 9 describes “Knowledge Base Learning Agents” that “leverage RAG to maintain a dynamic knowledge base of problem descriptions and proven solutions.” This is exactly what kb-mcp’s docs/solutions/ directory does via the /ce:compound workflow — but manually. An agent could automate this: after solving a problem, write the solution to the vault via kb_write. The researcher agent already does something similar for external content.

This aligns with the Roadmap’s “Knowledge Capture Tools (Phase 3)” — specialized write tools like kb_capture_session and kb_capture_fix that would automate structured solution capture. The pattern stays within kb-mcp’s identity (it’s writing to a knowledge base, not managing agent state) while enabling the knowledge accumulation loop the book describes.

Landscape Review Process

The AI agent memory space moves fast. This landscape section needs periodic review to stay useful for research and feature planning.

How to Add a New Project

  1. Clone into sandbox/ — gitignored, won’t pollute the repo

    git clone --depth 1 https://github.com/org/project.git sandbox/project
    
  2. Run tokei for codebase metrics

    tokei sandbox/project/
    
  3. Create a book page at book/src/landscape/project-name.md with:

    • One-line description
    • Source URL and language
    • Key features list
    • Comparison table vs kb-mcp
    • Relationship (competitive, complementary, or different domain)
    • “Patterns Worth Adopting” — what we could learn from them
  4. Add to SUMMARY.md under the Landscape section

  5. Update the overview — add to the quick comparison table and codebase metrics table in landscape/overview.md

  6. Update vault/tools/retrieval-landscape.md if the project is an MCP-native retrieval tool

When to Review

  • Monthly: Quick scan for new projects — check GitHub trending, Reddit r/clawdbot, ClawHub, and Hacker News for new MCP memory tools
  • Before planning a new feature: Check if any landscape project already solved it — adopt patterns, don’t reinvent
  • After a major release: Update metrics and comparison tables

What to Look For

When evaluating a new project:

| Question | Why it matters |
|---|---|
| Is it MCP-native? | Direct comparison to kb-mcp’s tool surface |
| Local or cloud? | Trust model and deployment alignment |
| What search does it use? | BM25, vector, hybrid, ripgrep, or none |
| Does it support write-back? | Agent-driven knowledge capture |
| What’s the data model? | Files, SQLite, cloud API, knowledge graph |
| What language? | Ecosystem alignment (Rust, TypeScript, Python) |
| What’s unique? | Patterns worth adopting for our roadmap |

Regenerating Metrics

Requires tokei (brew install tokei).

# Clone all landscape projects (first time only)
cd sandbox
git clone --depth 1 https://github.com/kevin-hs-sohn/hipocampus.git
git clone --depth 1 https://github.com/jimprosser/obsidian-web-mcp.git
git clone --depth 1 https://github.com/alibaizhanov/mengram.git
git clone --depth 1 https://github.com/Bumblebiber/hmem.git
git clone --depth 1 https://github.com/MadAppGang/mnemex.git
cd ..

# Run comparison
just loc-landscape

Update the metrics table in landscape/overview.md with the new numbers.

Current Landscape (as of 2026-03-20)

| Project | GitHub | Status |
|---|---|---|
| hipocampus | kevin-hs-sohn/hipocampus | Active |
| obsidian-web-mcp | jimprosser/obsidian-web-mcp | Active |
| mengram | alibaizhanov/mengram | Active |
| hmem | Bumblebiber/hmem | Active |
| mnemex | MadAppGang/mnemex | Active |
| Ori-Mnemos | aayoawoyemi/Ori-Mnemos | Active |
| cq | mozilla-ai/cq | Active (v0.4.0) |
| prism-mcp | dcostenco/prism-mcp | Very Early (v5.1.0) |

Architecture

Overview

kb-mcp is a Cargo workspace with three crates:

  • kb-core (library) — types, config, indexing, search, formatting. No transport deps.
  • kb-cli (binary kb) — Clap subcommands, JSON to stdout
  • kb-mcp-server (binary kb-mcp) — MCP stdio server via rmcp

Both binaries share all logic through kb-core. The only difference is the transport layer.

flowchart TD
    CLI[kb-cli] --> CORE[kb-core]
    SRV[kb-mcp-server] --> CORE
    CORE --> FMT[format.rs — shared output]
    CORE --> IDX[index.rs — scanning]
    CORE --> SE[search.rs — BM25]
    CLI --> OUT[stdout — JSON]
    SRV --> MCP[rmcp stdio — JSON-RPC]

Startup Sequence

flowchart TD
    A[Load collections.ron] --> A1[Find config file]
    A1 --> A2[Parse RON]
    A2 --> A3[Resolve relative paths]
    A3 --> B[Build document index]
    B --> B1[Walk collection dirs]
    B1 --> B2[Parse YAML frontmatter]
    B2 --> B3[Extract titles + detect sections]
    B3 --> C[Build search engine]
    C --> C1[Create Tantivy schema]
    C1 --> C2[Index all documents in RAM]
    C2 --> C3[Commit index]
    C3 --> E{Enter mode}
    E -->|CLI| F[Parse args, execute, exit]
    E -->|MCP| G[serve stdio, wait for disconnect]

    style A fill:#4a9eff,color:#fff
    style B fill:#4a9eff,color:#fff
    style C fill:#4a9eff,color:#fff
    style E fill:#f59e0b,color:#fff

Module Map

crates/
├── kb-core/src/
│   ├── lib.rs       AppContext, init(), sync_stores()
│   ├── config.rs    RON config loading, path resolution, discovery chain
│   ├── types.rs     Core data types (Document, Section)
│   ├── index.rs     Filesystem scanning, frontmatter parsing, section building
│   ├── store.rs     .mv2 lifecycle, content hashing, incremental sync
│   ├── search.rs    BM25 search engine (memvid-core)
│   ├── format.rs    JSON output structs and serialization helpers
│   ├── query.rs     Frontmatter filtering logic (shared between CLI and MCP)
│   └── write.rs     slugify_title, find_available_path (shared utilities)
├── kb-cli/src/
│   └── main.rs      Clap parser and CLI command dispatch → kb_core::* calls
└── kb-mcp-server/src/
    ├── main.rs      MCP stdio server startup
    ├── server.rs    KbMcpServer struct, auto-reindex, ServerHandler impl
    └── tools/
        ├── mod.rs       Router composition (sections + documents + search + ...)
        ├── sections.rs  list_sections — collection/section inventory
        ├── documents.rs get_document — full content retrieval (fresh from disk)
        ├── search.rs    search — BM25 full-text with auto-reindex
        ├── context.rs   kb_context — frontmatter + summary (token-efficient)
        ├── write.rs     kb_write — create files in writable collections
        ├── reindex.rs   reindex — rebuild index from disk
        ├── digest.rs    kb_digest — vault summary with topics, recency, gap hints
        ├── query.rs     kb_query — frontmatter filtering (tag, status, date, sources)
        ├── export.rs    kb_export — concatenate vault into single markdown document
        └── health.rs    kb_health — vault health diagnostics (quality, orphans, broken links)

Data Model

erDiagram
    Config ||--o{ Collection : contains
    Collection ||--o{ SectionDef : defines
    Collection ||--o{ Document : indexes
    Document }o--o| SectionDef : "belongs to"

    Config {
        string cache_dir
    }
    Collection {
        string name
        string path
        string description
        bool writable
    }
    SectionDef {
        string prefix
        string description
    }
    Document {
        string path
        string title
        string body
        string section
        string collection
        list tags
        map frontmatter
    }
collections.ron
  └── Collection[]
        ├── name: String          unique identifier
        ├── path: String          directory (relative to config)
        ├── description: String   shown in list_sections
        ├── writable: bool        enables kb_write
        └── sections: SectionDef[]
              ├── prefix: String      matches first subdirectory
              └── description: String shown in list_sections

Document (in-memory, from scanning)
  ├── path: String            relative to collection root
  ├── title: String           from H1 heading or filename
  ├── tags: Vec<String>       from YAML frontmatter
  ├── body: String            content without frontmatter
  ├── section: String         first directory component
  ├── collection: String      owning collection name
  └── frontmatter: HashMap    all YAML fields (for kb_context)

Section (derived)
  ├── name: String            directory prefix
  ├── description: String     from RON config (or empty)
  ├── doc_count: usize        documents in this section
  └── collection: String      owning collection name

Config Resolution

The config discovery chain runs in order, first match wins:

1. --config <path>                    explicit CLI flag
2. $KB_MCP_CONFIG                     environment variable
3. ./collections.ron                  current working directory
4. ~/.config/kb-mcp/collections.ron   user default

Collection paths in the RON file resolve relative to the config file’s parent directory. This means the same binary works from any working directory as long as the config paths are correct relative to the config.
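The chain and the relative-path rule can be sketched as follows (helper names are hypothetical, not kb-core's actual API):

```rust
use std::path::{Path, PathBuf};

/// Sketch of the discovery chain — the first existing candidate wins.
fn discover_config(cli_flag: Option<PathBuf>) -> Option<PathBuf> {
    let mut candidates: Vec<PathBuf> = Vec::new();
    if let Some(p) = cli_flag {
        candidates.push(p); // 1. explicit --config flag
    }
    if let Ok(p) = std::env::var("KB_MCP_CONFIG") {
        candidates.push(PathBuf::from(p)); // 2. environment variable
    }
    candidates.push(PathBuf::from("collections.ron")); // 3. current working directory
    if let Some(home) = std::env::var_os("HOME") {
        candidates.push(PathBuf::from(home).join(".config/kb-mcp/collections.ron")); // 4. user default
    }
    candidates.into_iter().find(|p| p.exists())
}

/// Collection paths resolve relative to the config file's parent directory.
fn resolve_collection_path(config_file: &Path, collection_path: &str) -> PathBuf {
    config_file
        .parent()
        .unwrap_or(Path::new("."))
        .join(collection_path)
}

fn main() {
    let cfg = Path::new("/projects/kb/collections.ron");
    assert_eq!(
        resolve_collection_path(cfg, "vault"),
        PathBuf::from("/projects/kb/vault")
    );
    // An existing explicit flag always wins:
    assert_eq!(
        discover_config(Some(std::env::temp_dir())),
        Some(std::env::temp_dir())
    );
}
```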

Search Architecture

flowchart TD
    Q[Query string] --> QP[QueryParser]
    QP -->|title + body + tags| BM[Tantivy BM25 — in-RAM index]
    BM --> TD[TopDocs — limit × 5 if filtering]
    TD --> PF[Post-filter by collection / section]
    PF --> SG[SnippetGenerator — highlighted excerpts]
    SG --> SR["SearchResult[] { doc_index, score, excerpt }"]

    style Q fill:#4a9eff,color:#fff
    style SR fill:#10b981,color:#fff

The search index is built in RAM on startup. It contains all documents from all collections. Filtering by collection or section happens post-query because Tantivy’s STRING fields support exact match but not efficient pre-filtering in a single query. The 5× over-fetch compensates for post-filter reduction.
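The over-fetch heuristic is small enough to show inline (a sketch, not the actual search.rs code):

```rust
/// When a post-query filter will discard hits, request 5x the caller's
/// limit from the ranked search so enough survivors remain.
fn fetch_limit(limit: usize, has_filter: bool) -> usize {
    if has_filter { limit * 5 } else { limit }
}

struct Hit {
    collection: String,
    score: f32,
}

/// Keep only hits from the requested collection, then truncate back to
/// the caller's original limit.
fn post_filter(hits: Vec<Hit>, collection: Option<&str>, limit: usize) -> Vec<Hit> {
    hits.into_iter()
        .filter(|h| collection.map_or(true, |c| h.collection == c))
        .take(limit)
        .collect()
}

fn main() {
    assert_eq!(fetch_limit(10, false), 10);
    assert_eq!(fetch_limit(10, true), 50); // 5x over-fetch when filtering

    let hits = vec![
        Hit { collection: "vault".into(), score: 2.1 },
        Hit { collection: "notes".into(), score: 1.7 },
        Hit { collection: "vault".into(), score: 0.9 },
    ];
    let kept = post_filter(hits, Some("vault"), 10);
    assert_eq!(kept.len(), 2);
    assert!(kept.iter().all(|h| h.collection == "vault" && h.score > 0.0));
}
```

The trade-off: a filtered query ranks five times as many candidates, which is cheap for an in-RAM index but would matter at larger scale.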

Tool Pattern

Each tool follows an identical structure:

// 1. Params struct — derives Deserialize + JsonSchema
#[derive(Deserialize, JsonSchema)]
pub struct MyParams { ... }

// 2. Router function — returns ToolRouter<KbMcpServer>
pub(crate) fn router() -> ToolRouter<KbMcpServer> {
    KbMcpServer::my_router()
}

// 3. Tool implementation — #[rmcp::tool] on an impl block
#[rmcp::tool_router(router = my_router)]
impl KbMcpServer {
    #[rmcp::tool(name = "my_tool", description = "...")]
    pub(crate) async fn my_tool(
        &self,
        Parameters(params): Parameters<MyParams>,
    ) -> Result<CallToolResult, rmcp::ErrorData> { ... }
}

Routers are composed in tools/mod.rs using the + operator:

sections::router() + documents::router() + search::router() + ...

Adding a tool = one new file in kb-mcp-server/src/tools/ + one line in mod.rs + one CLI subcommand in kb-cli/src/main.rs.

State Management

graph LR
    subgraph KbMcpServer
        IDX["index: Arc&lt;RwLock&lt;Index&gt;&gt;"]
        SE["search_engine: Arc&lt;SearchEngine&gt;"]
        COL["collections: Arc&lt;Vec&lt;...&gt;&gt;"]
    end

    R[search / get_document / kb_context] -->|read| IDX
    W[reindex / kb_write] -->|write| IDX
    W -->|rebuild| SE
    R -->|query| SE
    R -->|lookup path| COL

    style IDX fill:#f59e0b,color:#fff
    style SE fill:#4a9eff,color:#fff
    style COL fill:#10b981,color:#fff

  • Index behind RwLock for metadata reads (most tools) with exclusive writes during reindex/kb_write.
  • SearchEngine holds per-collection Memvid handles behind an internal Mutex. Search requires &mut self on Memvid even for reads.
  • get_document reads fresh from disk via server.rs::read_fresh() — the index is only used for path/title lookup, not content serving. Edits are visible immediately without reindex.
  • kb_write creates the file, syncs the collection’s .mv2, and rebuilds the in-memory Index.

Fresh-Read Design

get_document does not return content from the search index. It:

  1. Looks up the document by path or title in the index
  2. Finds the owning collection’s resolved path
  3. Reads the file fresh from disk
  4. Strips frontmatter and returns the body

This ensures content is never stale. The tradeoff is one filesystem read per get_document call, which is negligible for the expected workload.
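The frontmatter-stripping step (4) can be illustrated with a minimal helper. This is an assumption about the shape of the logic; the real read_fresh() in server.rs may differ.

```rust
/// Strip a leading YAML frontmatter block delimited by `---` lines.
/// A minimal sketch, not the actual read_fresh() implementation.
fn strip_frontmatter(raw: &str) -> &str {
    let Some(rest) = raw.strip_prefix("---\n") else {
        return raw; // no frontmatter at all
    };
    match rest.find("\n---\n") {
        Some(end) => &rest[end + 5..], // body after the closing fence
        None => raw,                   // unterminated block: leave as-is
    }
}

fn main() {
    let doc = "---\ntags: [memory]\n---\n# Title\nBody text";
    assert_eq!(strip_frontmatter(doc), "# Title\nBody text");
    // Documents without frontmatter pass through unchanged.
    assert_eq!(strip_frontmatter("plain body"), "plain body");
}
```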

Write Path

kb_write creates files in writable collections:

flowchart TD
    A[kb_write called] --> B{Collection exists?}
    B -->|no| ERR1[Error: collection not found]
    B -->|yes| C{Writable?}
    C -->|no| ERR2[Error: collection is read-only]
    C -->|yes| D[Generate filename: YYYY-MM-DD-kebab-title.md]
    D --> E{File exists?}
    E -->|yes| F[Append suffix: -2, -3, ...]
    E -->|no| G[Generate YAML frontmatter]
    F --> G
    G --> H[Write file to collection dir]
    H --> I[Rebuild search index]
    I --> J[Return path + metadata as JSON]

    style A fill:#4a9eff,color:#fff
    style J fill:#10b981,color:#fff
    style ERR1 fill:#ef4444,color:#fff
    style ERR2 fill:#ef4444,color:#fff
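The filename steps in the diagram can be sketched like this. The kebab-casing rules and the `exists` callback are paraphrased assumptions; helper names are hypothetical.

```rust
/// Hypothetical sketch of the kb_write naming scheme shown above:
/// YYYY-MM-DD-kebab-title.md, with a -2, -3, ... suffix on collision.
fn kebab(title: &str) -> String {
    let mut out = String::new();
    for c in title.chars() {
        if c.is_ascii_alphanumeric() {
            out.push(c.to_ascii_lowercase());
        } else if !out.is_empty() && !out.ends_with('-') {
            out.push('-'); // collapse runs of punctuation/whitespace
        }
    }
    out.trim_end_matches('-').to_string()
}

/// `exists` stands in for a filesystem check in the collection dir.
fn filename(date: &str, title: &str, exists: impl Fn(&str) -> bool) -> String {
    let base = format!("{date}-{}", kebab(title));
    let mut name = format!("{base}.md");
    let mut n = 2;
    while exists(&name) {
        name = format!("{base}-{n}.md");
        n += 1;
    }
    name
}

fn main() {
    assert_eq!(kebab("Shared Memory: An Overview!"), "shared-memory-an-overview");
    assert_eq!(
        filename("2025-01-02", "My Note", |_| false),
        "2025-01-02-my-note.md"
    );
    // A collision on the base name appends -2.
    assert_eq!(
        filename("2025-01-02", "My Note", |n| n == "2025-01-02-my-note.md"),
        "2025-01-02-my-note-2.md"
    );
}
```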

Persistent Storage (memvid-core)

Search is backed by memvid-core’s .mv2 persistent storage. Each collection gets its own .mv2 file at <cache_dir>/<hash>-<name>.mv2.

  • Startup: opens existing .mv2 files, diffs content hashes against a sidecar .hashes file, and only re-ingests changed documents
  • Smart chunking: memvid-core’s structural chunker segments long documents so queries match specific sections, not entire files
  • Crash-safe WAL: writes go through a write-ahead log inside the .mv2
  • Deduplication: search results are deduplicated by URI — one result per document, highest-scoring chunk wins

The in-memory Index (a plain Vec of documents) continues to handle metadata operations (exact path lookup, frontmatter retrieval, section counting). Memvid is the search layer only — this two-layer design keeps the architecture simple while gaining persistent, incremental search.
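The URI deduplication rule, one result per document with the best chunk winning, can be sketched as follows. Types here are illustrative, not memvid-core’s actual structs.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug)]
struct Chunk {
    uri: String,
    score: f64,
}

/// One result per document URI; the highest-scoring chunk wins.
/// Illustrative sketch, not memvid-core's implementation.
fn dedup_by_uri(chunks: Vec<Chunk>) -> Vec<Chunk> {
    let mut best: HashMap<String, Chunk> = HashMap::new();
    for c in chunks {
        match best.get(&c.uri) {
            Some(prev) if prev.score >= c.score => {} // keep existing winner
            _ => {
                best.insert(c.uri.clone(), c);
            }
        }
    }
    // Re-sort the survivors by score, best first.
    let mut out: Vec<Chunk> = best.into_values().collect();
    out.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap());
    out
}

fn main() {
    let chunks = vec![
        Chunk { uri: "vault/a.md".into(), score: 0.4 },
        Chunk { uri: "vault/a.md".into(), score: 0.9 },
        Chunk { uri: "vault/b.md".into(), score: 0.7 },
    ];
    let out = dedup_by_uri(chunks);
    assert_eq!(out.len(), 2);
    assert_eq!(out[0].uri, "vault/a.md"); // the 0.9 chunk survived
}
```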

Hybrid Search (optional)

Enable with cargo build --features hybrid. Adds HNSW vector similarity alongside BM25 via memvid-core’s vec feature.

  • Ingest: LocalTextEmbedder (BGE-small-en-v1.5, 384 dims, local ONNX) generates embeddings at document ingest time via put_with_embedding()
  • Search: Memvid::ask(AskMode::Hybrid) runs BM25 + vector in parallel and fuses results via Reciprocal Rank Fusion (RRF, k=60)
  • Query-time: VecEmbedder adapter wraps the embedder for ask()
  • Feature-gated: All hybrid code behind #[cfg(feature = "hybrid")]. Default build stays BM25-only with no ONNX dependency.

The ONNX model (~34MB) must be present at ~/.cache/memvid/text-models/. In the container, it’s baked into the image at /opt/memvid/text-models/ and symlinked by the entrypoint.
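Reciprocal Rank Fusion itself is simple to state. A minimal sketch with k=60 follows; the real fusion happens inside memvid-core, so this is only an illustration of the formula.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over two ranked lists of doc ids:
/// score(d) = sum over lists of 1 / (k + rank), rank 1-based, k = 60.
/// Minimal sketch of the formula, not memvid-core's code.
fn rrf_fuse(bm25: &[u32], vector: &[u32], k: f64) -> Vec<u32> {
    let mut scores: HashMap<u32, f64> = HashMap::new();
    for list in [bm25, vector] {
        for (rank, &doc) in list.iter().enumerate() {
            *scores.entry(doc).or_default() += 1.0 / (k + (rank as f64 + 1.0));
        }
    }
    let mut fused: Vec<(u32, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused.into_iter().map(|(doc, _)| doc).collect()
}

fn main() {
    // Doc 2 is ranked well by both backends, so it fuses to the top
    // even though neither list puts it first overall.
    let fused = rrf_fuse(&[1, 2, 4], &[2, 3, 1], 60.0);
    assert_eq!(fused[0], 2);
}
```

This is why keyword precision is augmented rather than lost: a document only ranked by one backend still scores, but agreement between backends dominates.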

Development Tooling & Methodology

This project doubles as a reference implementation for AI-assisted development — it’s both the tool and a practical example of using it. Every feature was built through structured AI workflows, and the documentation captures how.

Tools in Use

Claude Code

Claude Code is Anthropic’s CLI for Claude. It’s the primary development interface — all code, config, documentation, and vault content were authored through Claude Code sessions.

Key patterns used in this project:

  • Parallel agent dispatch — Spawning research agents to investigate APIs and codebase patterns simultaneously before writing code
  • MCP server integration — kb-mcp registered in .mcp.json gives Claude Code direct access to search and query the vault during development
  • Plan mode — Designing architecture before writing code, then executing with task tracking
  • Memory system — Persistent context across sessions for project decisions and user preferences

Compound Engineering

Compound Engineering is a Claude Code plugin that structures development into a repeating cycle where each unit of work makes subsequent work easier.

Philosophy: 80% planning and review, 20% execution. Prevention over remediation.

Workflow cycle:

/ce:brainstorm → /ce:plan → /ce:work → /ce:review → /ce:compound
Command          Purpose
/ce:brainstorm   Explore requirements, approaches, and feasibility before committing
/ce:plan         Transform concepts into detailed, executable implementation strategies
/ce:work         Execute plans with feature branches, worktrees, and task tracking
/ce:review       Multi-agent code evaluation — security, architecture, performance, simplicity
/ce:compound     Capture learnings into docs/solutions/ so future work is faster

How we use it in this project:

  • /ce:brainstorm for exploring the memvid-core integration approach, container agent design, and hybrid search strategy
  • /ce:plan for translating brainstorms into phased implementation plans with acceptance criteria and open questions
  • /ce:work for executing plans with incremental commits and task tracking
  • Parallel research agents for investigating memvid-core’s API, ZeroClaw config format, and ONNX model delivery

The compounding part: Brainstorms and plans are preserved in docs/brainstorms/ and docs/plans/. Each document captures decisions, alternatives considered, and lessons learned — so future sessions start with context instead of rediscovering it.

kb-mcp (Dogfooding)

kb-mcp is both the product and a development tool. During Claude Code sessions, it’s registered as an MCP server in .mcp.json, giving the AI direct access to search the vault.

This means Claude Code can:

  • Search existing vault content before writing new docs (avoid duplicates)
  • Read document metadata via kb_context for token-efficient scanning
  • Verify that new content fits the vault structure
  • Check section coverage and identify gaps

This is the dogfooding principle — the same tool agents use in production is the tool we use during development. If it doesn’t work well for us, it won’t work well for anyone.

Development Workflow

The Brainstorm → Plan → Work Loop

Every significant feature follows this cycle:

1. Brainstorm (/ce:brainstorm)

Explore what to build through collaborative dialogue. Output is a brainstorm document in docs/brainstorms/ capturing decisions, rejected alternatives, and scope boundaries.

2. Plan (/ce:plan)

Transform the brainstorm into an implementation plan with:

  • Phased implementation steps
  • Acceptance criteria (checkboxes)
  • Open questions with defaults
  • API references from research

Output is a plan document in docs/plans/.

3. Work (/ce:work)

Execute the plan on a feature branch:

  • Create tasks from plan phases
  • Implement with incremental commits
  • Test continuously
  • Check off acceptance criteria as completed

4. Review (/ce:review)

Multi-agent code review examining security, architecture, performance, and simplicity. Used for complex or risky changes.

5. Compound (/ce:compound)

Capture what was learned into docs/solutions/ so the next time a similar problem arises, the solution is already documented.

Feature Branch Pattern

# Start from main
git checkout -b feat/feature-name

# Work with incremental commits
git commit -m "feat(scope): description"

# When done, squash merge back to main
git checkout main
git merge --squash feat/feature-name
git commit -m "feat: full description with Co-Authored-By"

# Clean up
git branch -D feat/feature-name
git push origin --delete feat/feature-name

Session Patterns

Starting a session:

# Claude Code has kb-mcp available via .mcp.json
# Search the vault to understand current state
kb-mcp list-sections
kb-mcp search --query "whatever you're working on"

Adding vault content:

  1. Search existing content for gaps
  2. Draft markdown with proper frontmatter (tags, created, updated, sources)
  3. Write via kb_write or directly to the filesystem
  4. Verify with kb-mcp search to confirm indexing

Research agent workflow:

# Build the researcher container
just agent-build

# Research a topic autonomously
just agent-research-topic "topic of interest"

# Review drafts on the host
ls vault/drafts/

# Promote approved drafts to vault sections
mv vault/drafts/good-entry.md vault/concepts/

Project History

This project was built in a single extended Claude Code session:

  1. v2 Standalone Crate — Brainstormed generalizing the in-repo kb-mcp into a standalone project. Scaffolded in sandbox/, ported all 6 tools, added RON config, pushed to GitHub.

  2. Persistent Storage — Replaced in-memory Tantivy with memvid-core .mv2 persistent files. Added incremental reindex via blake3 content hashing.

  3. Containerized Researcher Agent — Built a ZeroClaw container with kb-mcp + DuckDuckGo web search. Agent writes research drafts to vault/drafts/ for human review.

  4. Hybrid Search — Added opt-in BM25 + vector search via memvid-core vec feature. Local ONNX embeddings with RRF fusion.

Each phase followed the brainstorm → plan → work loop. All brainstorms and plans are in docs/brainstorms/ and docs/plans/.

Why This Matters

This project demonstrates that a single developer with Claude Code and structured workflows can build and maintain a complex system — Rust MCP server, persistent search engine, containerized agent, hybrid vector search, Obsidian vault, mdBook documentation — that would traditionally require a team and weeks of work.

The key enablers:

  1. Structured planning — Compound Engineering’s brainstorm/plan/work cycle prevents the “just start coding” trap
  2. Parallel research — Multiple agents investigate APIs, crate docs, and codebase patterns simultaneously
  3. MCP integration — The knowledge base is queryable during development, not just at runtime
  4. Dogfooding — Using the same tools in development that agents use in production catches design issues early
  5. Knowledge compounding — Every brainstorm, plan, and solution is preserved so future sessions start with context

Adding Tools

Steps

  1. Create crates/kb-mcp-server/src/tools/my_tool.rs
  2. Add pub(crate) mod my_tool; to crates/kb-mcp-server/src/tools/mod.rs
  3. Add + my_tool::router() to the combined_router() function
  4. Add a CLI subcommand in crates/kb-cli/src/main.rs
  5. If shared logic is needed, add it to crates/kb-core/src/
  6. Update server instructions in crates/kb-mcp-server/src/server.rs

Tool Template

use rmcp::handler::server::wrapper::Parameters;
use rmcp::model::CallToolResult;
use schemars::JsonSchema;
use serde::Deserialize;

use crate::server::KbMcpServer;

#[derive(Debug, Deserialize, JsonSchema)]
pub struct MyToolParams {
    /// Description shown in tool schema
    pub query: String,
}

pub(crate) fn router() -> rmcp::handler::server::router::tool::ToolRouter<KbMcpServer> {
    KbMcpServer::my_tool_router()
}

#[rmcp::tool_router(router = my_tool_router)]
impl KbMcpServer {
    #[rmcp::tool(
        name = "my_tool",
        description = "What this tool does."
    )]
    pub(crate) async fn my_tool(
        &self,
        Parameters(params): Parameters<MyToolParams>,
    ) -> Result<CallToolResult, rmcp::ErrorData> {
        // Call kb_core functions for shared logic
        Ok(CallToolResult::success(vec![
            rmcp::model::Content::text("result"),
        ]))
    }
}

Key Points

  • Params struct must derive Deserialize + JsonSchema
  • Use #[schemars(description = "...")] for field descriptions in the tool schema
  • Use #[serde(default)] for optional fields
  • Return CallToolResult::error(...) with actionable messages for user-facing errors
  • Every tool must have a corresponding CLI subcommand for testing parity
  • Shared logic (filtering, formatting, utilities) belongs in kb-core, not in tool files

Researcher Agent

A containerized agent that uses kb-mcp to discover new content about AI agent memory and curate it into the vault.

Prerequisites

  • Docker + Docker Compose
  • Host Ollama (dev) or Anthropic/OpenAI API key (prod)
  • Web search uses DuckDuckGo — no API key needed

Setup

# 1. Copy provider config
cp agents/researcher/config/config.toml.ollama.example agents/researcher/config/config.toml

# 2. (Optional) Create .env for cloud LLM provider
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env

# 3. Build the container
just agent-build

Usage

# Interactive research session
just agent-research

# Research a specific topic
just agent-research-topic "HNSW vector search performance"

# Check what the vault contains
just agent-vault-status

How It Works

  1. Agent receives a research topic (manual prompt)
  2. Searches existing vault via kb-mcp (avoids duplicates)
  3. Searches the web via DuckDuckGo (Earl template)
  4. Fetches and reads promising sources
  5. Synthesizes a vault entry with proper frontmatter + source citations
  6. Writes to the vault via kb_write
  7. You review the new entry on the host and git commit

Container Security

  • Read-only root filesystem
  • Non-root user (uid 1001)
  • Named volume for runtime workspace
  • IDENTITY.md and SOUL.md mounted read-only
  • tmpfs for temp files (64MB cap)
  • Localhost-only ports
  • API keys via environment variables (not baked into image)

Agent Identity

  • IDENTITY.md — defines the research domain, available tools, and boundaries
  • SOUL.md — quality principles: primary sources first, cite everything, flag uncertainty

Resources