
kb-mcp

Local, Rust-native MCP server for markdown knowledge bases.

AI agents work better when they have structured access to project knowledge. Markdown vaults — Obsidian, plain directories, skill libraries — are where this knowledge already lives. But agents can’t browse a vault. They need search, filtering, and token-efficient retrieval over content you’ve already written.

kb-mcp bridges that gap. Point it at your markdown directories, and it exposes them as MCP tools that any agent can query — no cloud services, no databases, no infrastructure.

Two Search Modes

kb-mcp ships with two search backends, chosen at build time:

BM25 keyword search (default) — Fast, lightweight, zero dependencies beyond the binary. Powered by Tantivy. Ideal for exact keyword queries like “PostgreSQL connection pool” or “rate limit error”.

Hybrid BM25 + vector search (--features hybrid) — Adds semantic similarity via a local ONNX embedding model (BGE-small-en-v1.5). Conceptual queries like “how do agents share state?” now match documents titled “Shared Memory” that keyword search alone would miss. Results are fused with Reciprocal Rank Fusion — keyword precision isn’t lost, it’s augmented. Still fully local, no API keys.

Where kb-mcp Fits

kb-mcp occupies a specific niche: local, zero-infrastructure knowledge base server for curated markdown. It’s not agent session memory, not a cloud service, not a code search tool.

| Need | kb-mcp | Alternatives |
|---|---|---|
| Serve markdown docs to agents via MCP | Yes — BM25 or hybrid search, token-efficient briefings, write-back | obsidian-web-mcp (remote HTTP, ripgrep) |
| Agent working memory (session state, preferences) | No — use a dedicated memory system | hipocampus, hmem |
| Cloud-hosted memory service | No — fully local, single binary | mengram |
| Semantic code search | No — markdown only | mnemex |

The memory-focused projects are complementary — an agent could use hmem for working memory and kb-mcp for its knowledge base. See the full Landscape survey for detailed comparisons.

Features

  • 10 MCP tools — list, search, get, context briefing, write, digest, query, export, health, reindex
  • CLI parity — every MCP tool works as a CLI subcommand
  • RON configuration — typed, Rust-native config with comments
  • Collection model — multiple collections with sections, descriptions, and writable flags
  • Token-efficient — kb_context returns frontmatter + summary without the full body
  • Write-back — kb_write creates notes with proper frontmatter in writable collections
  • ~1,700 lines of Rust — small, auditable, single-binary

Built With Itself

kb-mcp ships with everything you need to try it immediately. The included collections.example.ron indexes the project’s own documentation and an AI agent memory vault — so cloning the repo and running kb-mcp list-sections gives you a working knowledge base out of the box.

This isn’t a demo afterthought. The project is both the tool and a practical example of using it. The vault documents AI agent memory patterns. The Researcher Agent uses kb-mcp to research topics and curate findings back into the vault. Every feature gets consumed by the project itself — if it doesn’t work well for us, it won’t work well for anyone.

Point an AI agent at this repo with kb-mcp as an MCP server and it can search the docs, read architecture decisions, and understand how the project works — using the very tool the project builds.

Quick Start

# Install both binaries (CLI `kb` + MCP server `kb-mcp`)
just install

# Or install individually
cargo install --path crates/kb-cli          # installs `kb`
cargo install --path crates/kb-mcp-server   # installs `kb-mcp`

# Create config
cp collections.example.ron collections.ron
# Edit paths to point at your markdown directories

# Use as CLI
kb list-sections
kb search --query "your query"

# Use as MCP server (register in .mcp.json — binary name is still `kb-mcp`)

Goals

Why kb-mcp Exists

AI agents work better when they have structured access to project knowledge. Markdown vaults — Obsidian, plain directories, skill libraries — are where this knowledge lives. But agents can’t browse a vault. They need search, filtering, and token-efficient retrieval.

kb-mcp bridges this gap: index markdown collections, expose them as MCP tools, let agents query and contribute to the knowledge base.

Design Principles

Project-agnostic. The binary knows nothing about any specific vault, project, or directory structure. Everything comes from collections.ron. One binary serves any project with markdown files.

Configuration as data. Section descriptions, collection paths, writable flags — all RON config. No recompilation to change what gets indexed or how it’s described.

Token-efficient by default. kb_context exists because agents shouldn’t read 50 documents to find the 3 that matter. Frontmatter + summary first, full content on demand.

CLI parity. Every MCP tool works as a CLI subcommand. Testing, scripting, and debugging don’t require an MCP client.

Fresh reads over cached content. get_document reads from disk, not the index. Edits are visible immediately. The index is for search and lookup — not content serving.

Simple until proven insufficient. Start with the simplest approach that works. Persistent storage, vector search, and incremental reindex come when the simple approach hits real limits.

Dogfood everything. This project is both the tool and a practical example of using it. The vault documents AI agent memory. The collections.example.ron indexes the project’s own docs. The container agent uses kb-mcp to research and curate the vault. Every feature we build, we also consume — if it doesn’t work well for us, it won’t work well for anyone.

Non-Goals

These reflect the project’s current focus, not permanent boundaries. As kb-mcp matures and usage patterns emerge, any of these could become goals in a future phase.

Not a general-purpose search engine. kb-mcp indexes markdown files with YAML frontmatter. It does not index code, logs, databases, or arbitrary file formats.

Not a document editor. kb_write creates new files. It does not edit, rename, move, or delete existing documents. Vault management stays in the editor (Obsidian, VS Code, etc.).

Not a vector database. Even with hybrid search planned, the primary interface is BM25 keyword search. Vector similarity augments it — it doesn’t replace it.

Not a multi-user system. kb-mcp serves one agent session at a time (stdio is 1:1). HTTP mode may support multiple clients, but there is no authentication, permission model, or multi-tenancy.

Not a replacement for agent memory. kb-mcp serves project knowledge (docs, guides, skills). Agent memory (preferences, session state, learned behaviors) belongs in the agent’s own memory system.

Not a web application. No UI, no dashboard, no browser interface. CLI + MCP is sufficient. Agents are the primary consumers.

See Also

Roadmap

What’s been built, what’s next, and when each phase makes sense.

Completed

Standalone MCP Server (v2)

RON-configured binary with 10 MCP tools, full CLI parity, and zero hardcoded project-specific values. Open sourced at github.com/ttdonovan/kb-mcp.

Persistent Storage (memvid-core)

Replaced in-memory Tantivy with memvid-core .mv2 persistent files. Incremental reindex via blake3 content hashing. Smart markdown chunking. Crash-safe WAL.

Containerized Researcher Agent

ZeroClaw container with kb-mcp + DuckDuckGo web search. Writes research findings to vault/drafts/ for human review. IDENTITY.md/SOUL.md define agent personality and quality standards.

Hybrid Search (kb-mcp Phase 2)

BM25 + vector search via memvid-core vec feature. Local ONNX embeddings (BGE-small-en-v1.5), HNSW vector index in .mv2 files, RRF fusion. Opt-in via cargo build --features hybrid. Container supports just agent-build-hybrid.

Vault Intelligence Bundle

Three new MCP tools (kb_digest, kb_query, kb_export) plus transparent auto-reindex via directory mtime checks. Brings the tool count from 6 to 9. kb_write also gained optional directory and filename parameters for hierarchical collection structures.

Cargo Workspace Split

Reorganized from a single binary crate into a three-crate workspace: kb-core (shared library), kb-cli (binary kb), kb-mcp-server (binary kb-mcp). Eliminates CLI/MCP code duplication, guarantees behavioral parity through shared kb_core::format::* functions, and enables independent compilation with clean dependency isolation.

Up Next

Draft Reviewer Agent

A second container agent (or Claude Code sub-agent) that reviews drafts for quality, formatting consistency, source verification, and proper frontmatter — then promotes approved entries into the vault.

Why now: The researcher agent is producing drafts. Manual review is the bottleneck. Automating the quality gate completes the capture pipeline.

Scope:

  • Read drafts collection, check against vault conventions (SOUL.md standards)
  • Verify sources are reachable URLs
  • Ensure frontmatter has required fields (tags, created, updated, sources, target)
  • Promote approved drafts to the correct vault section
  • Flag issues for human attention rather than silently fixing

Heartbeat Scheduling

Add HEARTBEAT.md to the researcher agent for automated periodic research.

Why now: The researcher works well manually. Scheduling is a small addition that makes it run on autopilot for configured topics.

Scope:

  • HEARTBEAT.md defines research topics and frequency
  • ZeroClaw cron runs the research workflow on schedule
  • Digest report of what was added (file or notification)

Future

Knowledge Capture Tools (kb-mcp Phase 3)

Specialized write tools beyond free-form kb_write:

  • kb_capture_session — record debugging/coding sessions
  • kb_capture_fix — record bug fixes with symptom/cause/resolution
  • kb_classify — auto-tag unprocessed notes (type, tags, summary)

When: When agents are actively writing and would benefit from structured capture templates.

Gap Analyzer Agent

Reads the existing vault, identifies thin or missing topics, and feeds the researcher agent with specific research requests. The inward-facing complement to the outward-facing researcher.

When: When the vault is large enough that gaps aren’t obvious from browsing.

Knowledge Keeper Agent

Combines researcher + reviewer + gap analyzer into the full Knowledge Keeper pattern. Sweeps sessions, scores knowledge by usefulness, prunes stale entries. The most autonomous form of vault curation.

When: When all three component agents are proven individually.

HTTP Daemon Mode (kb-mcp Phase 4)

Add HTTP transport alongside stdio. Long-lived server eliminates MCP cold starts and enables network-based access (no volume mount needed for container agents).

When: When cold start latency becomes a problem (more likely now that hybrid search loads an ONNX model at startup).

Cross-Agent Knowledge Sharing (kb-mcp Phase 5)

Multiple projects share knowledge through federated .mv2 files. Agents in different repos contribute to and query from shared collections.

When: When multiple projects are actively using kb-mcp and would benefit from shared context.

Configuration

kb-mcp is configured via a collections.ron file that defines what markdown directories to index.

Config File

(
    // Optional: override cache directory (default: ~/.cache/kb-mcp)
    // cache_dir: "~/.cache/kb-mcp",
    collections: [
        (
            name: "docs",
            path: "docs",
            description: "Project documentation",
            writable: false,
            sections: [
                (prefix: "guides", description: "How-to guides"),
                (prefix: "reference", description: "API reference"),
            ],
        ),
        (
            name: "notes",
            path: "notes",
            description: "Working notes",
            writable: true,
            sections: [],
        ),
    ],
)

Fields

| Field | Type | Required | Description |
|---|---|---|---|
| cache_dir | String | No | Cache directory for index files (default: ~/.cache/kb-mcp) |
| collections | List | Yes | One or more collection definitions |

Collection Fields

| Field | Type | Required | Description |
|---|---|---|---|
| name | String | Yes | Unique identifier for the collection |
| path | String | Yes | Directory path (relative to config file) |
| description | String | Yes | Human-readable description |
| writable | Bool | No | Allow kb_write to create files (default: false) |
| sections | List | No | Section definitions for this collection |

Section Fields

| Field | Type | Required | Description |
|---|---|---|---|
| prefix | String | Yes | Directory prefix that identifies this section |
| description | String | Yes | Human-readable description |
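A section prefix maps each document's leading path component to a description. As an illustrative sketch (Python here for brevity; kb-mcp itself is Rust, and matching on only the first path component is an assumption based on the examples in this chapter):

```python
# Hypothetical prefix-to-description mapping, mirroring the example config above.
SECTIONS = {
    "guides": "How-to guides",
    "reference": "API reference",
}

def section_for(path):
    """Return the section description for a document path, or None.

    A document like "guides/setup.md" falls under the "guides" section;
    root-level documents match no section prefix.
    """
    prefix = path.split("/", 1)[0] if "/" in path else ""
    return SECTIONS.get(prefix)
```

Documents in directories without a defined prefix still index and search normally; they simply have no section description.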

Resolution Order

kb-mcp searches for configuration in this order:

  1. --config <path> CLI flag (explicit)
  2. KB_MCP_CONFIG environment variable
  3. ./collections.ron (current working directory)
  4. ~/.config/kb-mcp/collections.ron (user default)

Collection paths resolve relative to the config file’s parent directory.
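The four-step lookup can be sketched as a simple fall-through (illustrative Python, not kb-mcp's Rust implementation):

```python
import os

def resolve_config(cli_path=None, env=None, cwd="."):
    """Sketch of the documented resolution order for collections.ron."""
    env = env if env is not None else os.environ
    if cli_path:                                  # 1. --config <path> flag
        return cli_path
    if env.get("KB_MCP_CONFIG"):                  # 2. environment variable
        return env["KB_MCP_CONFIG"]
    local = os.path.join(cwd, "collections.ron")  # 3. current working directory
    if os.path.exists(local):
        return local
    # 4. user default
    return os.path.expanduser("~/.config/kb-mcp/collections.ron")
```

The CLI flag always wins, which makes per-invocation overrides easy to script.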

Cross-Project Use

Install kb-mcp globally, then point other projects at a specific config:

{
  "mcpServers": {
    "kb": {
      "command": "kb-mcp",
      "env": {
        "KB_MCP_CONFIG": "/path/to/project/collections.ron"
      },
      "args": []
    }
  }
}

MCP Server

kb-mcp runs as an MCP server when invoked with no arguments. It communicates over stdio using the JSON-RPC protocol.

Registration

Add to your project’s .mcp.json:

{
  "mcpServers": {
    "kb": {
      "command": "kb-mcp",
      "args": []
    }
  }
}

The binary must be in your PATH (install via just install-server or cargo install --path crates/kb-mcp-server).

Startup

On startup, kb-mcp:

  1. Loads collections.ron (see Configuration)
  2. Scans all collection directories for .md files
  3. Builds the BM25 search index in memory
  4. Starts the MCP stdio transport

Logs go to stderr. Startup typically takes <1s for ~200 documents.

Agent Workflow

A typical agent session:

  1. Call list_sections to see what’s available
  2. Call search to find relevant documents
  3. Call kb_context on promising results to scan frontmatter + summary
  4. Call get_document only on documents worth reading in full
  5. Call kb_write to capture new knowledge (writable collections only)
  6. Call reindex after creating new files mid-session

CLI

The kb binary provides CLI access to all MCP tools. Every MCP tool has a CLI equivalent.

Commands

# List all sections
kb list-sections

# Search
kb search --query "rate limits"
kb search --query "bevy" --collection skills
kb search --query "agents" --scope runtimes
kb search --query "MCP" --max-results 5

# Get full document
kb get-document --path "concepts/mcp-server-pattern.md"

# Token-efficient briefing
kb context --path "concepts/mcp-server-pattern.md"

# Write a note (writable collection only)
kb write --collection notes --title "My Note" --body "Content" --tags "tag1,tag2"

# Rebuild index
kb reindex

Global Options

| Flag | Description |
|---|---|
| --config <path> | Path to collections.ron config file |

Output

All commands output JSON to stdout. Errors go to stderr.

Hybrid Search

By default, kb-mcp uses BM25 keyword search via Tantivy. Enable the hybrid feature to add vector similarity search alongside BM25.

Why Hybrid?

BM25 is excellent for exact keyword queries (“PostgreSQL connection pool”, “rate limit error”). But it misses conceptual matches:

| Query | BM25 alone | Hybrid (BM25 + vector) |
|---|---|---|
| “how do agents share state?” | Misses “Shared Memory” doc | Matches via semantic similarity |
| “memory architecture” | Finds docs with those exact words | Also finds “Cognitive Memory Model” |
| “BM25 ranking” | Works perfectly | Works perfectly (BM25 still contributes) |

Hybrid search combines both signals using Reciprocal Rank Fusion (RRF), so keyword precision isn’t lost — it’s augmented.

How It Works

  1. At ingest time: Each document is embedded into a 384-dimensional vector using BGE-small-en-v1.5 (local ONNX, no cloud API)
  2. At query time: The query is embedded, then both BM25 and HNSW vector search run in parallel
  3. Fusion: Results are merged via RRF (score = Σ 1/(k + rank)) with k=60, combining keyword and semantic signals

The vector index is stored inside the .mv2 file alongside the Tantivy BM25 index — no separate database or service.
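The fusion step is compact enough to sketch. This is an illustrative Python version of RRF with the stated formula and k=60, not memvid-core's actual Rust code:

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked result lists with Reciprocal Rank Fusion.

    rankings: list of ranked doc-id lists, best match first.
    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so agreement between BM25 and vector search pushes a document up.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from the two backends:
bm25   = ["shared-memory.md", "agents.md"]
vector = ["agents.md", "state.md", "shared-memory.md"]
fused  = rrf_fuse([bm25, vector])
```

Because only ranks matter, RRF needs no score normalization between the BM25 and vector backends, which is why it is a common default for hybrid fusion.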

Setup

1. Install with hybrid feature

cargo install --path . --features hybrid

This pulls in ONNX Runtime and the HNSW library. Build time is longer than the default BM25-only build.

2. Download the embedding model

The BGE-small-en-v1.5 model files (~34MB total) must be present locally:

mkdir -p ~/.cache/memvid/text-models

curl -L -o ~/.cache/memvid/text-models/bge-small-en-v1.5.onnx \
  https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/onnx/model.onnx

curl -L -o ~/.cache/memvid/text-models/bge-small-en-v1.5_tokenizer.json \
  https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/tokenizer.json

3. Re-index to build vector index

Existing .mv2 files only have BM25 data. Delete the cache to force a full re-ingest with embeddings:

rm -rf ~/.cache/kb-mcp/   # or ~/Library/Caches/kb-mcp/ on macOS
kb-mcp list-sections      # triggers re-ingest with vectors

Usage

No changes to your queries. The search tool automatically uses hybrid mode when compiled with the feature. Agents get better results without knowing about search modes.

# These work the same — but with hybrid, conceptual matches are found
kb-mcp search --query "how do agents share state?"
kb-mcp search --query "memory architecture" --collection vault

Default Build (No Hybrid)

If you don’t need vector search, the default build stays lightweight:

cargo install --path .   # BM25 only, no ONNX dependency

The same search queries work — they just use BM25 keyword matching without the vector similarity signal.

Technical Details

  • Model: BGE-small-en-v1.5 (BAAI), 384 dimensions
  • Index: HNSW graph (brute-force below 1000 vectors)
  • Distance: L2 (Euclidean)
  • Fusion: Reciprocal Rank Fusion, k=60
  • Storage: Vectors stored inside .mv2 files alongside Tantivy
  • Embedding cache: LRU, 1000 entries, auto-unloads after 5min idle

Researcher

Tools

list_sections

List all collections and their sections with document counts and descriptions.

Parameters: None

Returns: JSON array of sections with name, description, doc_count, and collection.

search

Full-text search across the knowledge base using BM25 ranking.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| query | String | Yes | Search query (supports phrases and boolean operators) |
| collection | String | No | Filter by collection name |
| scope | String | No | Filter by section prefix |
| max_results | Number | No | Maximum results (default: 10) |

Returns: JSON with query, total, and results array. Each result has path, title, section, collection, score, and excerpt.

get_document

Retrieve a document by path or title. Content is read fresh from disk.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| path | String | Yes | Document path or title |

Returns: JSON with path, title, tags, section, collection, and content.

kb_context

Token-efficient document briefing. Returns frontmatter metadata and first paragraph summary without the full body.

Call this to survey relevance before using get_document for full content. Saves 90%+ tokens on retrieval-heavy workflows.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| path | String | Yes | Document path or title |

Returns: JSON with path, title, tags, section, collection, frontmatter (all fields), and summary (first paragraph).
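The shape of a briefing can be sketched as follows. This is an illustrative Python fragment, not kb-mcp's parser — in particular, the naive key: value frontmatter handling (no real YAML parsing) is a simplification:

```python
def kb_context_sketch(text):
    """Extract frontmatter fields and a first-paragraph summary from markdown."""
    frontmatter, body = {}, text
    if text.startswith("---\n"):
        header, _, body = text[4:].partition("\n---\n")
        for line in header.splitlines():
            key, _, value = line.partition(":")
            frontmatter[key.strip()] = value.strip()  # naive: no nested YAML
    # First paragraph only — the full body is deliberately left out.
    summary = body.strip().split("\n\n", 1)[0]
    return {"frontmatter": frontmatter, "summary": summary}
```

The token savings come from the last line: only the first paragraph travels back to the agent, with get_document reserved for documents that prove relevant.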

kb_write

Create a new document in a writable collection. Generates frontmatter with a date-prefixed filename by default. Use directory to write into subdirectories and filename to specify an exact name without date prefix.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| collection | String | Yes | Target collection (must be writable) |
| title | String | Yes | Document title |
| body | String | Yes | Document body (markdown) |
| tags | List | No | Tags for frontmatter |
| status | String | No | Status field for frontmatter |
| source | String | No | Source field for frontmatter |
| directory | String | No | Subdirectory within collection (e.g. “concepts/memory”). Created automatically. |
| filename | String | No | Exact filename (e.g. “cognitive-memory-model.md”). Skips date prefix when provided. |

Returns: JSON with path, collection, title, and tags.

Errors: Returns actionable error if collection is read-only, not found, or directory escapes the collection root.
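The naming behavior can be sketched like this — illustrative Python only; the exact slug rule (lowercase, hyphen-separated) is an assumption, while the date-prefix-by-default and exact-filename override match the documented behavior:

```python
import datetime
import re

def note_filename(title, directory=None, filename=None, today=None):
    """Build a relative path for a new note, date-prefixed unless overridden."""
    if filename is None:
        today = today or datetime.date.today().isoformat()
        # Assumed slug rule: lowercase, non-alphanumerics collapsed to hyphens.
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
        filename = f"{today}-{slug}.md"
    return f"{directory}/{filename}" if directory else filename
```

Passing filename sidesteps the date prefix entirely, which is what hierarchical, permanently-named vault entries want.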

kb_digest

Vault summary — shows collections, sections with topics, recent additions (last 7 days), and thin sections (fewer than 2 documents). Use this to understand what the knowledge base covers before searching.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| collection | String | No | Filter to a specific collection |

Returns: JSON with total_documents, total_sections, and collections array. Each collection has name, doc_count, sections (with topics and gap hints), and recent additions.

kb_query

Filter documents by frontmatter fields. Multiple filters combine with AND logic. Returns document metadata without body content.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| tag | String | No | Filter by tag |
| status | String | No | Filter by frontmatter status field |
| created_after | String | No | YYYY-MM-DD, returns docs created on or after |
| collection | String | No | Filter by collection name |
| has_sources | Boolean | No | Only docs with a sources field |

Returns: JSON with total and documents array. Each document has path, title, tags, section, and collection.
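The AND combination is straightforward to sketch (illustrative Python over dict-shaped documents, not kb-mcp's code). Note that ISO YYYY-MM-DD dates compare correctly as plain strings, which is what makes the created_after check a one-liner:

```python
def kb_query_sketch(docs, tag=None, status=None, created_after=None):
    """Keep only documents passing every supplied frontmatter filter."""
    out = []
    for doc in docs:
        if tag is not None and tag not in doc.get("tags", []):
            continue
        if status is not None and doc.get("status") != status:
            continue
        # ISO dates sort lexicographically, so string comparison suffices.
        if created_after is not None and doc.get("created", "") < created_after:
            continue
        out.append(doc)
    return out
```

An unset filter simply never rejects anything, so calling with no filters returns every document's metadata.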

kb_export

Export vault as a single markdown document. Concatenates documents with frontmatter headers, limited to max_documents to prevent unbounded output.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| collection | String | No | Collection to export (default: all) |
| max_documents | Number | No | Maximum documents to include (default: 200) |

Returns: Concatenated markdown with document separators and frontmatter metadata. Appends a truncation notice if the limit is hit.
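The concatenate-and-truncate shape can be sketched as follows — illustrative Python; the separator style and truncation-notice wording here are assumptions, the max_documents cap is the documented behavior:

```python
def kb_export_sketch(docs, max_documents=200):
    """Join documents into one markdown string, capped at max_documents."""
    parts = [f"## {d['title']}\n\n{d['body']}" for d in docs[:max_documents]]
    text = "\n\n---\n\n".join(parts)
    if len(docs) > max_documents:
        omitted = len(docs) - max_documents
        text += f"\n\n---\n\n(truncated: {omitted} documents omitted)"
    return text
```

The cap exists because an unbounded export of a large vault would blow past any agent's context window.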

kb_health

Vault health diagnostics — checks document quality across collections. Flags missing frontmatter dates, untagged docs, stale content, stub documents, orphaned notes (no inbound wiki-links), and broken wiki-links. Use kb_digest for coverage overview, kb_health for quality issues.

Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| collection | String | No | Filter to a specific collection |
| stale_days | Number | No | Days threshold for staleness (default: 90) |
| min_words | Number | No | Minimum word count for stub detection (default: 50) |

Returns: JSON with total_documents_checked, total_issues, and per-collection arrays for each check: missing_created, missing_updated, no_tags, stale, stubs, orphans, broken_links.
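Two of the checks — staleness and stub detection — are simple enough to sketch (illustrative Python, not kb-mcp's implementation; thresholds match the documented defaults):

```python
import datetime

def health_sketch(docs, today, stale_days=90, min_words=50):
    """Flag documents that are stale or too short to be useful."""
    issues = {"stale": [], "stubs": []}
    for doc in docs:
        updated = datetime.date.fromisoformat(doc["updated"])
        if (today - updated).days > stale_days:
            issues["stale"].append(doc["path"])
        if len(doc["body"].split()) < min_words:
            issues["stubs"].append(doc["path"])
    return issues
```

The orphan and broken-link checks additionally require building the vault's wiki-link graph, which is why they live in the tool rather than a one-liner.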

reindex

Rebuild the search index from all collections on disk. Use after editing documents mid-session. Note: search now auto-detects new files via directory mtime checks, so reindex is mainly needed after in-place content edits.

Parameters: None

Returns: Summary message with document and section counts.

RON Schema

The full collections.ron schema with all fields:

(
    // Cache directory for index files
    // Default: ~/.cache/kb-mcp
    // Supports ~ expansion
    cache_dir: "~/.cache/kb-mcp",

    collections: [
        (
            // Unique name — used in search filters and kb_write target
            name: "vault",

            // Path to markdown directory
            // Relative to this config file's location
            path: "ai-vault",

            // Description shown in list_sections output
            description: "Primary knowledge vault",

            // Allow kb_write to create files here
            // Default: false
            writable: false,

            // Section definitions — map directory prefixes to descriptions
            // Documents in subdirectories matching a prefix get that section's description
            // Sections without definitions still appear, just without descriptions
            sections: [
                (prefix: "concepts", description: "Cross-cutting concepts"),
                (prefix: "guides", description: "How-to guides"),
            ],
        ),
    ],
)

Rust Types

The RON file deserializes into these Rust types:

struct Config {
    cache_dir: Option<String>,
    collections: Vec<Collection>,
}

struct Collection {
    name: String,
    path: String,
    description: String,
    writable: bool,         // default: false
    sections: Vec<Section>, // default: []
}

struct Section {
    prefix: String,
    description: String,
}

Landscape

A survey of related projects in the AI agent memory and knowledge management space. Understanding what exists helps identify where kb-mcp fits, what patterns to adopt, and where to differentiate.

Quick Comparison

| Project | Type | Language | Search | MCP | Write | Local/Cloud |
|---|---|---|---|---|---|---|
| kb-mcp | Knowledge base server | Rust | BM25 + vector hybrid | stdio | Yes | Local |
| hipocampus | Agent memory harness | JavaScript | BM25 + vector (qmd) | No | Yes | Local |
| obsidian-web-mcp | Remote vault access | Python | ripgrep full-text | HTTP | Yes | Remote |
| mengram | Cloud memory service | Python + JS | Semantic (cloud) | Yes | Yes | Cloud |
| hmem | Hierarchical memory | TypeScript | Tree traversal | stdio | Yes | Local |
| mnemex | Semantic code search | TypeScript | BM25 + vector (LanceDB) | stdio | No | Local |
| cq | Agent knowledge commons | Python + TS | FTS5 + domain tags | stdio | Yes | Local + Team API |
| prism-mcp | Agent session memory | TypeScript | FTS5 + sqlite-vec + TurboQuant | stdio | Yes | Local or Cloud |

Source Code Metrics

Measured with tokei. Source-only — excludes eval/benchmark data, web frontends, generated files, and node_modules.

| Project | Language | Files | Code Lines | Comments | Total Lines |
|---|---|---|---|---|---|
| kb-mcp | Rust | 17 | 1,696 | 36 | 2,001 |
| hipocampus | JavaScript | 3 | 730 | 96 | 954 |
| obsidian-web-mcp | Python | 18 | 1,469 | 43 | 1,895 |
| mengram | Python (core) | 48 | 22,097 | 1,007 | 25,907 |
| hmem | TypeScript | 10 | 6,569 | 577 | 7,617 |
| mnemex | TypeScript + TSX (src/) | 388 | 86,919 | 20,113 | 120,031 |
| cq | Python + TypeScript | 81 | 10,129 | 1,667 | 14,378 |
| prism-mcp | TypeScript + SQL | 121 | 28,367 | 8,587 | 40,977 |

Notes: mengram includes cloud backend, SDKs, and integrations in its Python source. mnemex src/ includes the core engine, CLI, and MCP server — the full repo (329K lines) also contains eval benchmarks (91K), AI docs (15K), landing page, and VS Code extension.

Documentation Metrics

Markdown files as a proxy for documentation investment.

| Project | .md Files | Lines | Doc-to-Code Ratio |
|---|---|---|---|
| kb-mcp | 52 | 4,817 | 2.8x |
| hipocampus | 16 | 1,657 | 2.3x |
| obsidian-web-mcp | 1 | 198 | 0.1x |
| mengram | 17 | 1,871 | 0.1x |
| hmem | 11 | 2,710 | 0.4x |
| mnemex | 74 | 23,808 | 0.3x |
| cq | 12 | 2,280 | 0.2x |
| prism-mcp | 9 | 1,466 | 0.05x |

Observations:

  • kb-mcp has the highest doc-to-code ratio (2.8x) — more lines of documentation than source code. This reflects the project’s dual role as both a tool and a documented reference implementation.
  • hipocampus also documents heavily (2.3x) — its memory architecture and protocol are extensively explained in markdown.
  • obsidian-web-mcp has minimal docs (single README) despite a substantial codebase — the code is the documentation.
  • mnemex has extensive docs (74 files, 24K lines) but the ratio is low because the codebase is so large.

Where kb-mcp Fits

kb-mcp occupies a specific niche: local, Rust-native, zero-infrastructure knowledge base server for curated markdown. It’s not trying to be agent session memory (hipocampus, hmem), a cloud service (mengram), or a code search tool (mnemex).

The closest overlap is with obsidian-web-mcp (both serve markdown vaults via MCP), but they differ on transport (local stdio vs remote HTTP) and search quality (BM25-ranked vs ripgrep grep).

The memory-focused projects (hipocampus, mengram, hmem, prism-mcp) are complementary rather than competitive — they manage agent session memory, while kb-mcp serves reference knowledge. An agent could use hmem for working memory and kb-mcp for its knowledge base.

cq represents a third category: collective agent learning. Where kb-mcp serves curated human-authored knowledge and memory projects persist agent state, cq captures wisdom that emerges from agent sessions and shares it across agents. All three categories coexist naturally.

hipocampus

Source: github.com/kevin-hs-sohn/hipocampus | Language: JavaScript | Status: Active (26 stars)

Drop-in memory harness for AI agents with a 3-tier memory architecture and 5-level compaction tree.

What It Does

hipocampus manages agent session memory over time. Hot memory (~500 lines) is always loaded. Warm memory (daily logs, knowledge base, plans) is read on demand. Cold memory is searched via qmd hybrid search.

The key innovation is the 5-level compaction tree: raw daily logs get compressed into daily → weekly → monthly → root summaries via LLM-driven summarization. A ROOT.md topic index (~100 lines) gives agents O(1) awareness of what they know.

Key Features

  • 3-tier memory: Hot (always loaded), Warm (on-demand), Cold (search)
  • 5-level compaction tree with LLM-driven summarization
  • ROOT.md topic index for constant-time knowledge awareness
  • Hybrid search via qmd (BM25 + vector)
  • Claude Code plugin marketplace integration
  • Pre-compaction hooks for automatic memory preservation
  • File-based, no database

Comparison to kb-mcp

| Aspect | hipocampus | kb-mcp |
|---|---|---|
| Primary use | Agent session memory | Curated knowledge base |
| Data model | Daily logs → compacted summaries | Markdown collections indexed for search |
| Search | qmd (BM25 + vector) | memvid-core (BM25 + optional vector) |
| Write pattern | Continuous (daily logs, auto-compaction) | On-demand (kb_write, manual curation) |
| MCP support | No (skill-based) | Yes (stdio transport) |

Relationship: Complementary. hipocampus handles what the agent remembers from sessions; kb-mcp serves what the agent looks up in reference material.

Patterns Worth Adopting

  • Compaction tree — the 5-level summarization pattern is relevant for kb-mcp’s future Knowledge Keeper agent
  • ROOT.md topic index — a constant-cost “what do I know?” summary could complement list_sections

obsidian-web-mcp

Source: github.com/jimprosser/obsidian-web-mcp | Language: Python | Status: Active (71 stars)

Secure remote MCP server for Obsidian vaults with OAuth 2.0 auth, Cloudflare Tunnel, and atomic writes safe for Obsidian Sync.

What It Does

obsidian-web-mcp makes your Obsidian vault accessible from anywhere — Claude web, mobile, and desktop — via an HTTP MCP endpoint proxied through Cloudflare Tunnel with OAuth 2.0 PKCE authentication.

Key Features

  • 9 MCP tools: read, batch read, write, frontmatter update, search, frontmatter search, list, move, soft-delete
  • Remote access via Cloudflare Tunnel + OAuth 2.0 PKCE
  • Atomic writes (write-to-temp-then-rename) safe for Obsidian Sync
  • In-memory frontmatter index with filesystem watcher for auto-updates
  • ripgrep for full-text search (falls back to Python if unavailable)
  • Path traversal protection, safety limits (1MB/file, 20 files/batch)
  • launchd plist for macOS always-on deployment

Comparison to kb-mcp

| Aspect | obsidian-web-mcp | kb-mcp |
|---|---|---|
| Transport | HTTP (remote via Cloudflare) | stdio (local) |
| Search | ripgrep grep (no ranking) | BM25 ranked + optional vector |
| Vault operations | Rich (batch read, move, delete, frontmatter) | Focused (search, read, write, context) |
| Auth | OAuth 2.0 PKCE | None (local only) |
| Obsidian-specific | Yes (Sync-safe, .trash, frontmatter index) | No (any markdown) |
| Token efficiency | No equivalent | kb_context (frontmatter + summary) |

Relationship: Different problem domains. obsidian-web-mcp solves remote vault access; kb-mcp solves effective knowledge search. Both serve Obsidian vaults but from opposite directions.

Patterns Worth Adopting

  • Filesystem watcher — auto-reindex when files change (instead of manual reindex calls)
  • Frontmatter index — in-memory YAML index for structured queries beyond full-text search
  • HTTP transport — relevant for kb-mcp’s Phase 4 HTTP daemon mode

mengram

Source: github.com/alibaizhanov/mengram | Language: Python + JS SDKs | Status: Active (112 stars)

Human-like memory for AI agents with semantic, episodic, and procedural memory types — procedures evolve from failures.

What It Does

mengram is a cloud-hosted memory service that gives AI agents persistent, personalized memory across sessions. Its key differentiator is procedural memory — workflows that automatically evolve when they fail, creating an improvement loop.

Key Features

  • 3 memory types: Semantic (facts), Episodic (events), Procedural (evolving workflows)
  • Cognitive Profile — persistent user profile loaded at session start
  • Claude Code hooks: auto-save after responses, auto-recall on prompts
  • File upload (PDF, DOCX, TXT, MD) with vision AI extraction
  • Knowledge graph
  • Multi-user isolation
  • Import from ChatGPT / Obsidian
  • Python + JavaScript SDKs, REST API
  • LangChain, CrewAI, MCP integrations
  • Free tier available

Comparison to kb-mcp

| Aspect | mengram | kb-mcp |
|---|---|---|
| Hosting | Cloud (mengram.io) | Local (your machine) |
| Data control | Third-party cloud | On-disk, fully private |
| Memory model | 3-tier cognitive (semantic/episodic/procedural) | Document collections with sections |
| Search | Semantic (cloud API) | BM25 + optional local vector |
| Auto-capture | Yes (Claude Code hooks) | No (manual or agent-driven) |
| Dependencies | API key + network | Zero (Rust binary) |

Relationship: Different trust and deployment models. mengram is convenient (auto-save, cloud sync, SDKs) but sends your data to a third party. kb-mcp keeps everything local and private.

Patterns Worth Adopting

  • Procedural memory — workflows that evolve from failure analysis. Relevant to the vault’s Knowledge Keeper pattern.
  • Cognitive Profile — a structured “who is the user” document. Claude Code’s memory system does something similar.
  • Auto-save hooks — capturing knowledge without manual intervention. The researcher agent’s heartbeat scheduling aims at this.

hmem

Source: github.com/Bumblebiber/hmem | Language: TypeScript | Status: Active (9 stars)

MCP server with 5-level lazy-loaded SQLite memory modeled after human memory hierarchy — agents load only the detail level they need.

What It Does

hmem stores agent memories in a hierarchical tree with 5 levels of detail. Level 1 is a coarse summary (always loaded on agent spawn). Levels 2-5 provide progressively more detail, fetched on demand. This saves tokens by giving agents awareness without loading everything.
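A minimal sketch of the lazy-loading idea, with hypothetical types (not hmem's actual API):

```rust
/// Hypothetical sketch of 5-level lazy loading (not hmem's actual API):
/// level 1 is always loaded on spawn; deeper levels are fetched on demand.
struct Memory {
    levels: [&'static str; 5], // level 1 (coarse summary) .. level 5 (verbatim)
}

impl Memory {
    /// Return all detail up to `depth` (1-based, clamped to 1..=5).
    fn load(&self, depth: usize) -> Vec<&'static str> {
        self.levels[..depth.clamp(1, 5)].to_vec()
    }
}

fn main() {
    let m = Memory {
        levels: ["summary", "outline", "key points", "details", "verbatim"],
    };
    assert_eq!(m.load(1), vec!["summary"]); // spawn-time default
    assert_eq!(m.load(3).len(), 3);         // agent drills down on demand
}
```

The token saving comes from the default: an agent pays for level 1 only, and spends more context only on the subtrees it actually needs.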

Key Features

  • 5-level hierarchical memory (coarse → verbatim)
  • Tree structure with compound IDs (e.g., L0003.2.1)
  • Markers: favorite, pinned, obsolete, irrelevant, active, secret
  • Obsolete entries hidden from bulk reads but remain searchable
  • Session cache with Fibonacci decay (suppresses already-seen entries)
  • Access-count promotion (most-accessed entries auto-expand)
  • Import/export as Markdown or SQLite
  • Per-agent memory files (.hmem)
  • Curator agent concept for periodic maintenance
  • MCP over stdio (Claude Code, Gemini CLI, Cursor, Windsurf, OpenCode)
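hmem's exact decay formula isn't documented here; one plausible reading of "Fibonacci decay" — repeat sightings of the same entry are down-weighted along the Fibonacci sequence — can be sketched as:

```rust
use std::collections::HashMap;

/// Illustrative session cache — NOT hmem's documented algorithm.
/// Each repeat sighting of an entry divides its score by the next
/// Fibonacci number (1, 2, 3, 5, ...), so already-seen entries fade
/// quickly and fresh content surfaces instead.
struct SessionCache {
    seen: HashMap<String, u32>, // entry id -> times surfaced this session
}

/// fib(1) = 1, fib(2) = 2, fib(3) = 3, fib(4) = 5, ...
fn fib(n: u32) -> u64 {
    let (mut a, mut b) = (1u64, 1u64);
    for _ in 0..n {
        let c = a + b;
        a = b;
        b = c;
    }
    a
}

impl SessionCache {
    fn new() -> Self {
        Self { seen: HashMap::new() }
    }

    /// Dampen a raw relevance score based on how often the entry
    /// has already been shown this session.
    fn adjust(&mut self, id: &str, score: f64) -> f64 {
        let n = self.seen.entry(id.to_string()).or_insert(0);
        *n += 1;
        score / fib(*n) as f64
    }
}

fn main() {
    let mut cache = SessionCache::new();
    assert_eq!(cache.adjust("doc-1", 10.0), 10.0); // first sighting: full score
    assert_eq!(cache.adjust("doc-1", 10.0), 5.0);  // second: halved
    assert!(cache.adjust("doc-1", 10.0) < 3.4);    // third: /3, still shrinking
}
```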

Comparison to kb-mcp

| Aspect | hmem | kb-mcp |
|---|---|---|
| Data model | Hierarchical tree in SQLite | Flat markdown collections |
| Search | Tree traversal by ID (no ranking) | BM25 ranked + optional vector |
| Token efficiency | 5 detail levels, load only what's needed | kb_context (frontmatter + summary) |
| Storage | SQLite per agent | memvid-core .mv2 per collection |
| Write pattern | write/update/append memories | kb_write creates markdown files |
| Maintenance | Curator agent, Fibonacci decay, access promotion | Manual or researcher agent |

Relationship: Complementary. hmem excels at structured agent working memory (what am I doing, what did I decide). kb-mcp excels at reference knowledge search (what does the documentation say about X).

Patterns Worth Adopting

  • Lazy-loaded detail levels — the 5-level hierarchy is a powerful token-saving pattern. kb_context is a 2-level version of this (summary vs full document).
  • Obsolete-but-searchable — marking entries as outdated without deleting them. Useful for vault knowledge that may be superseded.
  • Access-count promotion — frequently accessed documents could be surfaced more prominently in search results.
  • Fibonacci decay — suppressing recently-seen results in repeated queries to surface new content.

mnemex

Source: github.com/MadAppGang/mnemex | Language: TypeScript | Status: Active (35 stars)

Local semantic code search for Claude Code — tree-sitter parsing, embedding-based vector search + BM25 hybrid, stored in LanceDB.

What It Does

mnemex indexes codebases using tree-sitter to understand code structure (functions, classes, methods), embeds chunks via configurable providers (OpenRouter, Ollama, custom), and serves hybrid BM25 + vector search over MCP. It’s code search, not document search.

Key Features

  • Hybrid search: BM25 + vector similarity via LanceDB
  • Tree-sitter code parsing (structure-aware, not naive line splits)
  • Embedding flexibility: OpenRouter (cloud), Ollama (local), custom
  • Embedding model benchmarking tool with NDCG scores
  • Auto-reindex on search (detects modified files)
  • Symbol graph with PageRank for importance ranking
  • mnemex pack — export codebase to a single AI-friendly file
  • 4 MCP tools: search_code, index_codebase, get_status, clear_index
  • Claude Code plugin, OpenCode plugin, VS Code autocomplete

Comparison to kb-mcp

| Aspect | mnemex | kb-mcp |
|---|---|---|
| Domain | Code search | Document/knowledge search |
| Parsing | Tree-sitter (code-aware chunks) | Markdown (frontmatter + heading structure) |
| Hybrid search | BM25 + vector (LanceDB) | BM25 + vector (memvid-core) |
| Embeddings | OpenRouter / Ollama / custom | Local ONNX (BGE-small-en-v1.5) |
| Write-back | No (read-only) | Yes (kb_write) |
| Auto-reindex | Yes (on search) | No (manual reindex or startup sync) |
| Unique feature | Symbol graph + PageRank | Token-efficient kb_context |

Relationship: Different domains (code vs knowledge). Both use hybrid BM25 + vector search but with different backends and parsing strategies.

Patterns Worth Adopting

  • Auto-reindex on search — detect file changes at query time instead of requiring explicit reindex calls. Low overhead for small vaults.
  • Embedding model benchmarking — a tool to evaluate search quality with different models and parameters.
  • Pack/export — exporting the full knowledge base as a single context-friendly file for LLM ingestion.

Ori-Mnemos

Source: github.com/aayoawoyemi/Ori-Mnemos | Language: TypeScript | Status: Active (62 stars)

Persistent cognitive memory system for AI agents — knowledge graph with learning layers, identity resources, and adaptive retrieval.

What It Does

Ori-Mnemos treats agent memory as a learning problem, not a lookup problem. It builds a knowledge graph from markdown files (wiki-links + learned co-occurrence edges), runs 4-signal retrieval (semantic + BM25 + PageRank + warmth), and continuously improves via three learning layers that reshape the graph with every interaction.

Agents get persistent identity (goals, methodology, reminders) and a memory system that decays, reinforces, and prunes like biological memory. All local — markdown + SQLite, no cloud dependencies.

Key Features

  • 4-signal RRF fusion: semantic embeddings, BM25, Personalized PageRank, associative warmth
  • 3 learning layers: Q-value reranking, co-occurrence edge learning (Hebbian/NPMI), stage meta-learning (LinUCB)
  • Knowledge graph: wiki-links + learned co-occurrence edges with homeostasis normalization
  • 3 memory zones: identity (slow decay), knowledge (1x), operations (fast decay)
  • Agent identity resources: personality, goals, methodology, daily context, reminders
  • 16 MCP tools + 5 identity resources + 16 CLI commands
  • Local embeddings: all-MiniLM-L6-v2 via Hugging Face transformers
  • Storage: markdown files + SQLite (indexes and learning state)
  • 579+ tests (vitest)

Notable Tools

| Tool | Purpose |
|---|---|
| ori_orient | Daily briefing — status, reminders, goals, vault health |
| ori_query_ranked | Full retrieval with Q-value reranking + stage meta-learning |
| ori_explore | Recursive graph exploration with sub-question decomposition |
| ori_warmth | Associative field showing resonant notes in context |
| ori_promote | Graduate inbox notes to typed notes with classification |
| ori_query_fading | Low-vitality candidates for archival |

Comparison to kb-mcp

| Aspect | Ori-Mnemos | kb-mcp |
|---|---|---|
| Primary use | Agentic memory with learning | Curated knowledge base retrieval |
| Language | TypeScript | Rust |
| Storage | Markdown + SQLite | Markdown + memvid-core .mv2 |
| Search | 4-signal RRF (semantic + BM25 + PageRank + warmth) | BM25 + optional vector (memvid-core) |
| Learning | 3 layers (Q-value, co-occurrence, stage meta) | None — static index |
| Graph | Wiki-links + learned edges + PageRank | Section hierarchy only |
| Identity | Goals, methodology, reminders, daily context | Not supported |
| Decay/vitality | 3 memory zones with configurable decay rates | None |
| Tools | 16 MCP + 5 resources | 10 MCP tools |
| CLI parity | Yes (dual-mode) | Yes (dual-mode) |
| Auto-reindex | Incremental embedding updates | Directory mtime detection |
| Recall@5 | 90% (HotpotQA multi-hop) | Baseline BM25 |
| Latency | ~120ms (full intelligence) | Sub-100ms (BM25) |

Relationship: Ori is a superset in ambition — it does everything kb-mcp does (markdown indexing, search, MCP tools) plus graph-based learning, identity management, and adaptive retrieval. kb-mcp is simpler, faster, and Rust-native. They solve overlapping but different problems: kb-mcp is a library you search; Ori is a brain that learns.

Patterns Worth Adopting

  • Q-value learning on retrieval — tracking which search results agents actually use (forward citations, re-recalls, dead-ends) to improve future ranking. Could inform a future kb-mcp relevance layer.
  • Co-occurrence edges — notes retrieved together form stronger associations. Lightweight to implement on top of existing search.
  • Identity resources — ori://identity, ori://goals, etc. give agents persistent context. kb-mcp’s kb_context serves a similar purpose but without the identity layer.
  • Memory zones with decay — different decay rates for identity vs operational knowledge. Relevant for kb-mcp’s future Knowledge Keeper.
  • Stage meta-learning — learning to skip expensive retrieval stages when they don’t contribute value. Relevant if kb-mcp adds more retrieval signals beyond BM25.
  • Vault health diagnostics — ori_health checks index freshness, orphan notes, dangling links. Could complement kb-mcp’s kb_digest.

cq

Source: github.com/mozilla-ai/cq | Language: Python + TypeScript | Status: v0.4.0, Active (Mozilla AI)

Shared knowledge commons for AI agents — collective learning so agents stop independently rediscovering the same failures. Built by Mozilla AI.

What It Does

cq captures “knowledge units” (KUs) that emerge from agent sessions and makes them queryable by other agents. Agents propose insights, confirm what works, flag what’s wrong, and reflect at session end. Knowledge graduates from local (private) to team (org-shared, human-reviewed) to global (public commons — not yet implemented).

The core thesis: agents worldwide burn tokens rediscovering the same failures. A shared commons eliminates redundant learning.

Key Features

  • 6 MCP tools: query, propose, confirm, flag, reflect, status
  • SQLite + FTS5 local store with domain tag Jaccard similarity scoring
  • Confidence scoring: confirmations boost (+0.1), flags penalize (-0.15)
  • Tiered graduation: local → team (human-in-the-loop review) → global
  • Team API (FastAPI) with React review dashboard
  • Post-error hook auto-queries commons before agent retries
  • Session-end reflect mines conversations for shareable insights
  • Claude Code plugin (SKILL.md behavioral protocol + hooks.json)
  • 69KB proposal document covering trust layers, DID identity, ZK proofs
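The confidence arithmetic can be sketched directly from the numbers above (the 0.5 starting value and [0, 1] clamping are assumptions, not cq's documented behavior):

```rust
/// Sketch of cq-style confidence scoring. The +0.10 / -0.15 deltas come
/// from the feature list above; the neutral prior and clamping bounds
/// are assumptions.
struct KnowledgeUnit {
    confidence: f64,
}

impl KnowledgeUnit {
    fn new() -> Self {
        Self { confidence: 0.5 } // assumed neutral prior
    }
    fn confirm(&mut self) {
        self.confidence = (self.confidence + 0.10).min(1.0);
    }
    fn flag(&mut self) {
        self.confidence = (self.confidence - 0.15).max(0.0);
    }
}

fn main() {
    let mut ku = KnowledgeUnit::new();
    ku.confirm(); // a peer agent confirms the insight works
    ku.confirm();
    ku.flag();    // another agent flags it as wrong
    // 0.5 + 0.1 + 0.1 - 0.15 = 0.55
    assert!((ku.confidence - 0.55).abs() < 1e-9);
}
```

The asymmetric deltas mean one flag roughly cancels one and a half confirmations — skepticism is weighted more heavily than agreement.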

Comparison to kb-mcp

| Aspect | cq | kb-mcp |
|---|---|---|
| Domain | Agent collective knowledge | Curated document search |
| Content source | Agent-generated (propose/confirm/flag) | Human-authored markdown |
| Search | FTS5 + domain tag Jaccard | BM25 (Tantivy) |
| Write model | Propose → confirm/flag loop | kb_write to writable collections |
| Storage | SQLite (local) + team API (cloud) | Tantivy index + disk reads |
| Unique feature | Confidence scoring via peer confirmation | Token-efficient kb_context |
| Tool count | 6 | 10 |

Relationship: Complementary. kb-mcp serves curated reference knowledge (“what does our API spec say?”); cq serves collective agent wisdom (“what gotchas have agents hit with this API?”). They’d coexist naturally.

Patterns Worth Adopting

  • Confirmation/flagging feedback — lightweight signals for “this document helped” or “this seems stale” could inform search ranking or kb_health.
  • Session reflection mining — a “reflect and write” workflow that mines a session for knowledge to capture via kb_write.
  • Post-error auto-lookup — hook-based auto-search when agents encounter unfamiliar territory.
  • Domain tag scoring — combining tag-based Jaccard similarity with text-based BM25 could improve kb_query relevance.

Source Metrics

| Component | Language | Files | Code Lines | Comments | Total Lines |
|---|---|---|---|---|---|
| MCP server + team API | Python | 27 | 5,453 | 98 | 6,564 |
| Dashboard | TypeScript + TSX | 21 | 1,574 | 11 | 1,735 |
| Docs | Markdown | 12 | 1,529 | — | 2,280 |
| Total | | 81 | 10,129 | 1,667 | 14,378 |

Doc-to-code ratio: 0.2x (lean docs relative to codebase, though the 69KB proposal document is the real design investment).

prism-mcp

Source: github.com/dcostenco/prism-mcp | Language: TypeScript | Status: v5.1.0, Very Early (solo author)

“Mind Palace” for AI agents — persistent session memory with behavioral learning, time travel, multi-agent sync, and a visual dashboard.

What It Does

prism-mcp gives agents persistent memory across conversations through three layers: an append-only session ledger (what happened), mutable handoff state with optimistic concurrency control (what’s current), and behavioral memory that learns from corrections (what to avoid). High-importance lessons can auto-graduate into .cursorrules / .clauderules.

Key Features

  • 30+ MCP tools across session, memory, search, and dashboard domains
  • Three-layer memory: session ledger, handoff state (OCC versioned), behavioral
  • Three-tier search: FTS5, sqlite-vec vectors, TurboQuant JS fallback
  • TurboQuant: pure-TS embedding compression (ICLR 2026) — 768-dim from 3,072 bytes to ~400 bytes (7x), >90% top-1 retrieval accuracy
  • Time travel via versioned handoff snapshots (memory_checkout)
  • Multi-agent hivemind with role isolation (dev/qa/pm)
  • Behavioral learning: corrections accumulate importance, auto-surface
  • Progressive context loading: quick/standard/deep tiers
  • Web dashboard at localhost:3000 (knowledge graph, timeline, health)
  • Morning briefings after 4+ hours of inactivity
  • SQLite (local) or Supabase (cloud) backends
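TurboQuant's actual algorithm isn't reproduced here, but the size arithmetic checks out for sub-byte scalar quantization: 768 dims × 4 bits = 384 bytes plus a small scale, versus 3,072 bytes of f32 — close to the ~400 bytes quoted above. A generic 4-bit sketch:

```rust
/// Generic 4-bit scalar quantization — NOT TurboQuant's actual algorithm,
/// just an illustration of how a 768-dim f32 vector (3,072 bytes) can
/// pack into 384 bytes plus a per-vector scale.
fn quantize_4bit(v: &[f32]) -> (Vec<u8>, f32) {
    let max = v.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    let scale = if max == 0.0 { 1.0 } else { max / 7.0 }; // map values onto [-7, 7]
    let q = |x: f32| ((x / scale).round().clamp(-7.0, 7.0) as i8 + 7) as u8; // 0..=14 nibble
    let mut packed = Vec::with_capacity((v.len() + 1) / 2);
    for pair in v.chunks(2) {
        let lo = q(pair[0]);
        let hi = if pair.len() == 2 { q(pair[1]) } else { 0 };
        packed.push(lo | (hi << 4)); // two 4-bit values per byte
    }
    (packed, scale)
}

fn main() {
    let embedding = vec![0.5f32; 768]; // stand-in for a real 768-dim embedding
    let (packed, _scale) = quantize_4bit(&embedding);
    assert_eq!(packed.len(), 384); // 8x smaller than 3,072 bytes of f32
}
```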

Comparison to kb-mcp

| Aspect | prism-mcp | kb-mcp |
|---|---|---|
| Domain | Agent session memory | Curated document search |
| Content source | Agent-generated session logs | Human-authored markdown |
| Search | FTS5 + sqlite-vec + TurboQuant | BM25 (Tantivy) |
| Write model | Append ledger + upsert handoff | kb_write to writable collections |
| Storage | SQLite or Supabase | Tantivy index + disk reads |
| Unique feature | Behavioral learning + time travel | Token-efficient kb_context |
| Tool count | 30+ | 10 |

Relationship: Different domains entirely. kb-mcp retrieves curated knowledge; prism-mcp persists agent session state. Complementary — an agent would use both simultaneously.

Patterns Worth Adopting

  • Progressive context loading — formalized quick/standard/deep tiers for kb_context could help agents pick the right depth.
  • Optimistic concurrency control — relevant if kb-mcp ever supports concurrent writers to the same collection.
  • Health check with auto-repair — extending kb_health to suggest or apply fixes, not just diagnose.

Source Metrics

| Component | Language | Files | Code Lines | Comments | Total Lines |
|---|---|---|---|---|---|
| Core server | TypeScript | 69 | 17,012 | 6,414 | 26,141 |
| Migrations | SQL | 14 | 2,227 | 670 | 3,207 |
| Tests | Python | 15 | 2,150 | 268 | 2,754 |
| Docs | Markdown | 9 | 1,009 | — | 1,466 |
| Total | | 121 | 28,367 | 8,587 | 40,977 |

Doc-to-code ratio: 0.05x. The codebase is large relative to documentation. Notable: single-author v1→v5 in 3 days with 96KB handler files suggests rapid feature accretion. The 30+ tool count is unusually high for an MCP server and may cause prompt bloat.

Agentic Design Patterns

Source: github.com/Mathews-Tom/Agentic-Design-Patterns | Type: Reference book (21 chapters) | Status: Active

Comprehensive open-source book covering agentic AI patterns — memory management, learning, MCP, RAG, multi-agent collaboration, tool use, planning, guardrails, and more. Grounded in Google ADK, LangChain, and LangGraph with hands-on code examples.

This page maps the book’s patterns against kb-mcp to identify what we’re doing well, where gaps exist, and what belongs in future work.

Relevant Chapters

| Chapter | Topic | Relevance to kb-mcp |
|---|---|---|
| Ch 5 | Tool Use (Function Calling) | Direct — kb-mcp’s 10 MCP tools follow this pattern |
| Ch 8 | Memory Management | Core — defines short-term vs long-term memory architecture |
| Ch 9 | Learning and Adaptation | Informs future agent memory project |
| Ch 10 | Model Context Protocol (MCP) | Direct — validates kb-mcp’s MCP implementation |
| Ch 14 | Knowledge Retrieval (RAG) | Direct — kb-mcp implements the RAG pattern |

What kb-mcp Gets Right

MCP Implementation (Ch 10)

The book describes MCP as a “universal adapter” with client-server architecture, tool discovery, and standardized communication. kb-mcp follows this exactly — stdio transport, #[rmcp::tool] with JsonSchema params, structured JSON output.

The book warns about wrapping legacy APIs without making them “agent-friendly” — returning formats agents can’t parse (like PDFs instead of markdown). kb-mcp avoids this: all output is structured JSON, tool descriptions guide agent behavior, and kb_context provides token-efficient previews before full retrieval.

RAG Pattern (Ch 14)

kb-mcp implements the core RAG pipeline the book describes:

  1. Chunking — smart markdown chunking via memvid-core
  2. Embeddings — optional BGE-small-en-v1.5 via hybrid feature
  3. Vector storage — persistent .mv2 files
  4. Retrieval — BM25 + semantic search with RRF fusion
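The RRF fusion in step 4 is simple to state: each ranked list contributes 1 / (k + rank) per document and the sums are merged. A self-contained sketch (k = 60 is the conventional constant; kb-mcp's internals may differ in detail):

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank)
/// per document, and the contributions are summed across lists.
fn rrf(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in rankings {
        for (rank, doc) in list.iter().enumerate() {
            // rank is 0-based here, so the top hit contributes 1/(k + 1)
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut fused: Vec<_> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    let bm25 = vec!["shared-memory.md", "queues.md", "agents.md"];
    let vector = vec!["agents.md", "shared-memory.md", "state.md"];
    let fused = rrf(&[bm25, vector], 60.0);
    // Ranked well by both lists, shared-memory.md wins the fusion.
    assert_eq!(fused[0].0, "shared-memory.md");
}
```

Because only ranks matter, RRF never has to reconcile BM25 scores with cosine similarities — which is why keyword precision survives the fusion.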

The book’s “Agentic RAG” pattern — where an agent reasons about retrieval quality — maps to how agents use kb-mcp’s progressive disclosure chain: kb_digest (vault overview) then search (find candidates) then kb_context (preview metadata) then get_document (full content). Each step lets the agent decide whether to go deeper.

Tool Use Pattern (Ch 5)

The book’s tool use lifecycle matches kb-mcp exactly:

  1. Tool definitions with descriptions and typed parameters
  2. LLM decides which tool to call based on the task
  3. Structured output (JSON) with the tool result
  4. LLM processes the result and decides next steps

kb-mcp’s “primitives over workflows” approach is validated — tools are composable building blocks, not opinionated workflows. An agent can combine search + kb_query + get_document in whatever order serves the task.

Where Gaps Exist

No Short-Term Memory / Session State (Ch 8)

The book’s primary memory pattern is the dual memory system:

  • Short-term — session context, recent interactions, task state
  • Long-term — persistent knowledge store, searchable repository

kb-mcp provides the long-term side (search, retrieval, export) but has zero session awareness. It doesn’t know what the agent searched for previously, which documents were already retrieved, or what the agent’s current goal is. Every tool call is stateless.

The book’s ADK framework solves this with Session (chat thread), State (temporary key-value data with scoped prefixes), and MemoryService (long-term searchable store). LangGraph uses InMemoryStore with namespaced keys.

Assessment: This is not a gap in kb-mcp — it’s a gap in the system architecture. Session state belongs in the agent framework (ADK, LangGraph, Claude Code), not the knowledge base. kb-mcp is the long-term memory store; the framework provides short-term context.

No Learning Loop (Ch 9)

The book describes agents that improve through:

  • Reinforcement learning — rewards for good outcomes
  • Memory-based learning — recalling past experiences
  • Self-modification — agents editing their own behavior

kb-mcp is completely static — search results don’t improve based on which documents agents actually use. The Q-value pattern from Ori-Mnemos maps to Chapter 9’s “Memory-Based Learning” category.

Assessment: Learning belongs in a future agent memory project, not kb-mcp. See the Ori-Mnemos analysis for the pattern evaluation.

No Memory Type Distinction (Ch 8)

The book identifies three types of long-term memory:

  • Semantic memory — facts and concepts (domain knowledge)
  • Episodic memory — past experiences (successful task patterns)
  • Procedural memory — rules and behaviors (system prompts)

kb-mcp treats all vault content as undifferentiated documents. The section-based organization (concepts/, patterns/, drafts/) is a weak form of semantic categorization, but there’s no support for episodic memory (session transcripts, successful patterns) or procedural memory (agent instructions that evolve).

Assessment: kb-mcp’s vault could map collections to memory types (e.g., a sessions/ collection for episodic, prompts/ for procedural), but the tool doesn’t enforce or leverage the distinction. This is an interesting pattern for the future agent memory project.

No Graph-Based Retrieval (Ch 14 — GraphRAG)

The book describes GraphRAG as superior for “complex questions that require synthesizing data from multiple sources.” kb-mcp has wiki-link parsing in kb_health but doesn’t use links for search ranking. Ori-Mnemos implements this with Personalized PageRank over wiki-link + co-occurrence edges.

Assessment: If kb-mcp ever needs better multi-hop retrieval, the wiki-link graph from kb_health could be reused as a search signal. Low priority — BM25 + vector is sufficient for most knowledge base queries.

What kb-mcp Should NOT Adopt

| Pattern | Reason |
|---|---|
| Session/State management | Agent framework’s job (ADK, LangGraph), not the knowledge base |
| Self-modification (SICA, Ch 9) | Far beyond scope — kb-mcp is a retrieval tool |
| Cloud memory services (Vertex, Ch 8) | kb-mcp is local-first by design |
| Complex learning pipelines (Ch 9) | Belongs in a separate project per the Ori-Mnemos brainstorm |

Key Insight: kb-mcp as Knowledge Retrieval, Not Memory

The book’s memory management chapter (Ch 8) defines a dual architecture:

Agent Framework (ADK / LangGraph / Claude Code)
├── Short-term: Session context, state, recent history
└── Long-term: Persistent knowledge store ← kb-mcp serves this role

kb-mcp is a knowledge base server that agents use for long-term knowledge retrieval — domain knowledge, reference material, documented solutions. It is not an agent memory system (as stated in GOALS.md: “Not a replacement for agent memory”). The distinction matters: agent memory includes session state, learned preferences, and identity — things that belong in the agent framework or a dedicated memory system.

This validates both kb-mcp’s focused scope and the conclusion from the Ori-Mnemos brainstorm: agent memory (session state, learning, identity) belongs in a separate project that could use kb-mcp as its knowledge retrieval layer.

One Pattern Worth Exploring

Chapter 9 describes “Knowledge Base Learning Agents” that “leverage RAG to maintain a dynamic knowledge base of problem descriptions and proven solutions.” This is exactly what kb-mcp’s docs/solutions/ directory does via the /ce:compound workflow — but manually. An agent could automate this: after solving a problem, write the solution to the vault via kb_write. The researcher agent already does something similar for external content.

This aligns with the Roadmap’s “Knowledge Capture Tools (Phase 3)” — specialized write tools like kb_capture_session and kb_capture_fix that would automate structured solution capture. The pattern stays within kb-mcp’s identity (it’s writing to a knowledge base, not managing agent state) while enabling the knowledge accumulation loop the book describes.

Landscape Review Process

The AI agent memory space moves fast. This landscape section needs periodic review to stay useful for research and feature planning.

How to Add a New Project

  1. Clone into sandbox/ — gitignored, won’t pollute the repo

    git clone --depth 1 https://github.com/org/project.git sandbox/project
    
  2. Run tokei for codebase metrics

    tokei sandbox/project/
    
  3. Create a book page at book/src/landscape/project-name.md with:

    • One-line description
    • Source URL and language
    • Key features list
    • Comparison table vs kb-mcp
    • Relationship (competitive, complementary, or different domain)
    • “Patterns Worth Adopting” — what we could learn from them
  4. Add to SUMMARY.md under the Landscape section

  5. Update the overview — add to the quick comparison table and codebase metrics table in landscape/overview.md

  6. Update vault/tools/retrieval-landscape.md if the project is an MCP-native retrieval tool

When to Review

  • Monthly: Quick scan for new projects — check GitHub trending, Reddit r/clawdbot, ClawHub, and Hacker News for new MCP memory tools
  • Before planning a new feature: Check if any landscape project already solved it — adopt patterns, don’t reinvent
  • After a major release: Update metrics and comparison tables

What to Look For

When evaluating a new project:

| Question | Why it matters |
|---|---|
| Is it MCP-native? | Direct comparison to kb-mcp’s tool surface |
| Local or cloud? | Trust model and deployment alignment |
| What search does it use? | BM25, vector, hybrid, ripgrep, or none |
| Does it support write-back? | Agent-driven knowledge capture |
| What’s the data model? | Files, SQLite, cloud API, knowledge graph |
| What language? | Ecosystem alignment (Rust, TypeScript, Python) |
| What’s unique? | Patterns worth adopting for our roadmap |

Regenerating Metrics

Requires tokei (brew install tokei).

# Clone all landscape projects (first time only)
cd sandbox
git clone --depth 1 https://github.com/kevin-hs-sohn/hipocampus.git
git clone --depth 1 https://github.com/jimprosser/obsidian-web-mcp.git
git clone --depth 1 https://github.com/alibaizhanov/mengram.git
git clone --depth 1 https://github.com/Bumblebiber/hmem.git
git clone --depth 1 https://github.com/MadAppGang/mnemex.git
cd ..

# Run comparison
just loc-landscape

Update the metrics table in landscape/overview.md with the new numbers.

Current Landscape (as of 2026-03-20)

| Project | GitHub | Status |
|---|---|---|
| hipocampus | kevin-hs-sohn/hipocampus | Active |
| obsidian-web-mcp | jimprosser/obsidian-web-mcp | Active |
| mengram | alibaizhanov/mengram | Active |
| hmem | Bumblebiber/hmem | Active |
| mnemex | MadAppGang/mnemex | Active |
| Ori-Mnemos | aayoawoyemi/Ori-Mnemos | Active |
| cq | mozilla-ai/cq | Active (v0.4.0) |
| prism-mcp | dcostenco/prism-mcp | Very Early (v5.1.0) |

Architecture

Overview

kb-mcp is a Cargo workspace with three crates:

  • kb-core (library) — types, config, indexing, search, formatting. No transport deps.
  • kb-cli (binary kb) — Clap subcommands, JSON to stdout
  • kb-mcp-server (binary kb-mcp) — MCP stdio server via rmcp

Both binaries share all logic through kb-core. The only difference is the transport layer.

flowchart TD
    CLI[kb-cli] --> CORE[kb-core]
    SRV[kb-mcp-server] --> CORE
    CORE --> FMT[format.rs — shared output]
    CORE --> IDX[index.rs — scanning]
    CORE --> SE[search.rs — BM25]
    CLI --> OUT[stdout — JSON]
    SRV --> MCP[rmcp stdio — JSON-RPC]

Startup Sequence

flowchart TD
    A[Load collections.ron] --> A1[Find config file]
    A1 --> A2[Parse RON]
    A2 --> A3[Resolve relative paths]
    A3 --> B[Build document index]
    B --> B1[Walk collection dirs]
    B1 --> B2[Parse YAML frontmatter]
    B2 --> B3[Extract titles + detect sections]
    B3 --> C[Build search engine]
    C --> C1[Create Tantivy schema]
    C1 --> C2[Index all documents in RAM]
    C2 --> C3[Commit index]
    C3 --> E{Enter mode}
    E -->|CLI| F[Parse args, execute, exit]
    E -->|MCP| G[serve stdio, wait for disconnect]

    style A fill:#4a9eff,color:#fff
    style B fill:#4a9eff,color:#fff
    style C fill:#4a9eff,color:#fff
    style E fill:#f59e0b,color:#fff

Module Map

crates/
├── kb-core/src/
│   ├── lib.rs       AppContext, init(), sync_stores()
│   ├── config.rs    RON config loading, path resolution, discovery chain
│   ├── types.rs     Core data types (Document, Section)
│   ├── index.rs     Filesystem scanning, frontmatter parsing, section building
│   ├── store.rs     .mv2 lifecycle, content hashing, incremental sync
│   ├── search.rs    BM25 search engine (memvid-core)
│   ├── format.rs    JSON output structs and serialization helpers
│   ├── query.rs     Frontmatter filtering logic (shared between CLI and MCP)
│   └── write.rs     slugify_title, find_available_path (shared utilities)
├── kb-cli/src/
│   └── main.rs      Clap parser and CLI command dispatch → kb_core::* calls
└── kb-mcp-server/src/
    ├── main.rs      MCP stdio server startup
    ├── server.rs    KbMcpServer struct, auto-reindex, ServerHandler impl
    └── tools/
        ├── mod.rs       Router composition (sections + documents + search + ...)
        ├── sections.rs  list_sections — collection/section inventory
        ├── documents.rs get_document — full content retrieval (fresh from disk)
        ├── search.rs    search — BM25 full-text with auto-reindex
        ├── context.rs   kb_context — frontmatter + summary (token-efficient)
        ├── write.rs     kb_write — create files in writable collections
        ├── reindex.rs   reindex — rebuild index from disk
        ├── digest.rs    kb_digest — vault summary with topics, recency, gap hints
        ├── query.rs     kb_query — frontmatter filtering (tag, status, date, sources)
        ├── export.rs    kb_export — concatenate vault into single markdown document
        └── health.rs    kb_health — vault health diagnostics (quality, orphans, broken links)

Data Model

erDiagram
    Config ||--o{ Collection : contains
    Collection ||--o{ SectionDef : defines
    Collection ||--o{ Document : indexes
    Document }o--o| SectionDef : "belongs to"

    Config {
        string cache_dir
    }
    Collection {
        string name
        string path
        string description
        bool writable
    }
    SectionDef {
        string prefix
        string description
    }
    Document {
        string path
        string title
        string body
        string section
        string collection
        list tags
        map frontmatter
    }
collections.ron
  └── Collection[]
        ├── name: String          unique identifier
        ├── path: String          directory (relative to config)
        ├── description: String   shown in list_sections
        ├── writable: bool        enables kb_write
        └── sections: SectionDef[]
              ├── prefix: String      matches first subdirectory
              └── description: String shown in list_sections

Document (in-memory, from scanning)
  ├── path: String            relative to collection root
  ├── title: String           from H1 heading or filename
  ├── tags: Vec<String>       from YAML frontmatter
  ├── body: String            content without frontmatter
  ├── section: String         first directory component
  ├── collection: String      owning collection name
  └── frontmatter: HashMap    all YAML fields (for kb_context)

Section (derived)
  ├── name: String            directory prefix
  ├── description: String     from RON config (or empty)
  ├── doc_count: usize        documents in this section
  └── collection: String      owning collection name

Config Resolution

The config discovery chain runs in order, first match wins:

1. --config <path>                    explicit CLI flag
2. $KB_MCP_CONFIG                     environment variable
3. ./collections.ron                  current working directory
4. ~/.config/kb-mcp/collections.ron   user default

Collection paths in the RON file resolve relative to the config file’s parent directory. This means the same binary works from any working directory as long as the config paths are correct relative to the config.
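The chain and the relative-path rule can be sketched as follows (helper names are hypothetical, not kb-core's actual API):

```rust
use std::path::{Path, PathBuf};

/// Sketch of the discovery chain — the first existing candidate wins.
fn discover_config(cli_flag: Option<PathBuf>) -> Option<PathBuf> {
    let mut candidates: Vec<PathBuf> = Vec::new();
    if let Some(p) = cli_flag {
        candidates.push(p); // 1. explicit --config flag
    }
    if let Ok(p) = std::env::var("KB_MCP_CONFIG") {
        candidates.push(PathBuf::from(p)); // 2. environment variable
    }
    candidates.push(PathBuf::from("collections.ron")); // 3. current working directory
    if let Some(home) = std::env::var_os("HOME") {
        candidates.push(PathBuf::from(home).join(".config/kb-mcp/collections.ron")); // 4. user default
    }
    candidates.into_iter().find(|p| p.exists())
}

/// Collection paths resolve relative to the config file's parent directory.
fn resolve_collection_path(config_file: &Path, collection_path: &str) -> PathBuf {
    config_file
        .parent()
        .unwrap_or(Path::new("."))
        .join(collection_path)
}

fn main() {
    let cfg = Path::new("/projects/kb/collections.ron");
    assert_eq!(
        resolve_collection_path(cfg, "vault"),
        PathBuf::from("/projects/kb/vault")
    );
    // An existing explicit flag always wins:
    assert_eq!(
        discover_config(Some(std::env::temp_dir())),
        Some(std::env::temp_dir())
    );
}
```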

Search Architecture

flowchart TD
    Q[Query string] --> QP[QueryParser]
    QP -->|title + body + tags| BM[Tantivy BM25 — in-RAM index]
    BM --> TD[TopDocs — limit × 5 if filtering]
    TD --> PF[Post-filter by collection / section]
    PF --> SG[SnippetGenerator — highlighted excerpts]
    SG --> SR["SearchResult[] { doc_index, score, excerpt }"]

    style Q fill:#4a9eff,color:#fff
    style SR fill:#10b981,color:#fff

The search index is built in RAM on startup. It contains all documents from all collections. Filtering by collection or section happens post-query because Tantivy’s STRING fields support exact match but not efficient pre-filtering in a single query. The 5× over-fetch compensates for post-filter reduction.
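The over-fetch heuristic is small enough to show inline (a sketch, not the actual search.rs code):

```rust
/// When a post-query filter will discard hits, request 5x the caller's
/// limit from the ranked search so enough survivors remain.
fn fetch_limit(limit: usize, has_filter: bool) -> usize {
    if has_filter { limit * 5 } else { limit }
}

struct Hit {
    collection: String,
    score: f32,
}

/// Keep only hits from the requested collection, then truncate back to
/// the caller's original limit.
fn post_filter(hits: Vec<Hit>, collection: Option<&str>, limit: usize) -> Vec<Hit> {
    hits.into_iter()
        .filter(|h| collection.map_or(true, |c| h.collection == c))
        .take(limit)
        .collect()
}

fn main() {
    assert_eq!(fetch_limit(10, false), 10);
    assert_eq!(fetch_limit(10, true), 50); // 5x over-fetch when filtering

    let hits = vec![
        Hit { collection: "vault".into(), score: 2.1 },
        Hit { collection: "notes".into(), score: 1.7 },
        Hit { collection: "vault".into(), score: 0.9 },
    ];
    let kept = post_filter(hits, Some("vault"), 10);
    assert_eq!(kept.len(), 2);
    assert!(kept.iter().all(|h| h.collection == "vault" && h.score > 0.0));
}
```

The trade-off: a filtered query ranks five times as many candidates, which is cheap for an in-RAM index but would matter at larger scale.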

Tool Pattern

Each tool follows an identical structure:

// 1. Params struct — derives Deserialize + JsonSchema
#[derive(Deserialize, JsonSchema)]
pub struct MyParams { ... }

// 2. Router function — returns ToolRouter<KbMcpServer>
pub(crate) fn router() -> ToolRouter<KbMcpServer> {
    KbMcpServer::my_router()
}

// 3. Tool implementation — #[rmcp::tool] on an impl block
#[rmcp::tool_router(router = my_router)]
impl KbMcpServer {
    #[rmcp::tool(name = "my_tool", description = "...")]
    pub(crate) async fn my_tool(
        &self,
        Parameters(params): Parameters<MyParams>,
    ) -> Result<CallToolResult, rmcp::ErrorData> { ... }
}

Routers are composed in tools/mod.rs using the + operator:

sections::router() + documents::router() + search::router() + ...

Adding a tool = one new file in kb-mcp-server/src/tools/ + one line in mod.rs + one CLI subcommand in kb-cli/src/main.rs.

State Management

graph LR
    subgraph KbMcpServer
        IDX["index: Arc&lt;RwLock&lt;Index&gt;&gt;"]
        SE["search_engine: Arc&lt;SearchEngine&gt;"]
        COL["collections: Arc&lt;Vec&lt;...&gt;&gt;"]
    end

    R[search / get_document / kb_context] -->|read| IDX
    W[reindex / kb_write] -->|write| IDX
    W -->|rebuild| SE
    R -->|query| SE
    R -->|lookup path| COL

    style IDX fill:#f59e0b,color:#fff
    style SE fill:#4a9eff,color:#fff
    style COL fill:#10b981,color:#fff

  • Index behind RwLock for metadata reads (most tools) with exclusive writes during reindex/kb_write.
  • SearchEngine holds per-collection Memvid handles behind an internal Mutex. Search requires &mut self on Memvid even for reads.
  • get_document reads fresh from disk via server.rs::read_fresh() — the index is only used for path/title lookup, not content serving. Edits are visible immediately without reindex.
  • kb_write creates the file, syncs the collection’s .mv2, and rebuilds the in-memory Index.

Fresh-Read Design

get_document does not return content from the search index. It:

  1. Looks up the document by path or title in the index
  2. Finds the owning collection’s resolved path
  3. Reads the file fresh from disk
  4. Strips frontmatter and returns the body

This ensures content is never stale. The tradeoff is one filesystem read per get_document call, which is negligible for the expected workload.
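The frontmatter-stripping step (4) can be illustrated with a minimal helper. This is an assumption about the shape of the logic; the real read_fresh() in server.rs may differ.

```rust
/// Strip a leading YAML frontmatter block delimited by `---` lines.
/// A minimal sketch, not the actual read_fresh() implementation.
fn strip_frontmatter(raw: &str) -> &str {
    let Some(rest) = raw.strip_prefix("---\n") else {
        return raw; // no frontmatter at all
    };
    match rest.find("\n---\n") {
        Some(end) => &rest[end + 5..], // body after the closing fence
        None => raw,                   // unterminated block: leave as-is
    }
}

fn main() {
    let doc = "---\ntags: [memory]\n---\n# Title\nBody text";
    assert_eq!(strip_frontmatter(doc), "# Title\nBody text");
    // Documents without frontmatter pass through unchanged.
    assert_eq!(strip_frontmatter("plain body"), "plain body");
}
```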

Write Path

kb_write creates files in writable collections:

flowchart TD
    A[kb_write called] --> B{Collection exists?}
    B -->|no| ERR1[Error: collection not found]
    B -->|yes| C{Writable?}
    C -->|no| ERR2[Error: collection is read-only]
    C -->|yes| D[Generate filename: YYYY-MM-DD-kebab-title.md]
    D --> E{File exists?}
    E -->|yes| F[Append suffix: -2, -3, ...]
    E -->|no| G[Generate YAML frontmatter]
    F --> G
    G --> H[Write file to collection dir]
    H --> I[Rebuild search index]
    I --> J[Return path + metadata as JSON]

    style A fill:#4a9eff,color:#fff
    style J fill:#10b981,color:#fff
    style ERR1 fill:#ef4444,color:#fff
    style ERR2 fill:#ef4444,color:#fff
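The filename steps in the diagram can be sketched like this. The kebab-casing rules and the `exists` callback are paraphrased assumptions; helper names are hypothetical.

```rust
/// Hypothetical sketch of the kb_write naming scheme shown above:
/// YYYY-MM-DD-kebab-title.md, with a -2, -3, ... suffix on collision.
fn kebab(title: &str) -> String {
    let mut out = String::new();
    for c in title.chars() {
        if c.is_ascii_alphanumeric() {
            out.push(c.to_ascii_lowercase());
        } else if !out.is_empty() && !out.ends_with('-') {
            out.push('-'); // collapse runs of punctuation/whitespace
        }
    }
    out.trim_end_matches('-').to_string()
}

/// `exists` stands in for a filesystem check in the collection dir.
fn filename(date: &str, title: &str, exists: impl Fn(&str) -> bool) -> String {
    let base = format!("{date}-{}", kebab(title));
    let mut name = format!("{base}.md");
    let mut n = 2;
    while exists(&name) {
        name = format!("{base}-{n}.md");
        n += 1;
    }
    name
}

fn main() {
    assert_eq!(kebab("Shared Memory: An Overview!"), "shared-memory-an-overview");
    assert_eq!(
        filename("2025-01-02", "My Note", |_| false),
        "2025-01-02-my-note.md"
    );
    // A collision on the base name appends -2.
    assert_eq!(
        filename("2025-01-02", "My Note", |n| n == "2025-01-02-my-note.md"),
        "2025-01-02-my-note-2.md"
    );
}
```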

Persistent Storage (memvid-core)

Search is backed by memvid-core’s .mv2 persistent storage. Each collection gets its own .mv2 file at <cache_dir>/<hash>-<name>.mv2.

  • Startup: opens existing .mv2 files, diffs content hashes against a sidecar .hashes file, and only re-ingests changed documents
  • Smart chunking: memvid-core’s structural chunker segments long documents so queries match specific sections, not entire files
  • Crash-safe WAL: writes go through a write-ahead log inside the .mv2
  • Deduplication: search results are deduplicated by URI — one result per document, highest-scoring chunk wins

The in-memory Index (a plain Vec of documents) continues to handle metadata operations (exact path lookup, frontmatter retrieval, section counting). Memvid is the search layer only — this two-layer design keeps the architecture simple while gaining persistent, incremental search.
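The URI deduplication rule, one result per document with the best chunk winning, can be sketched as follows. Types here are illustrative, not memvid-core’s actual structs.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug)]
struct Chunk {
    uri: String,
    score: f64,
}

/// One result per document URI; the highest-scoring chunk wins.
/// Illustrative sketch, not memvid-core's implementation.
fn dedup_by_uri(chunks: Vec<Chunk>) -> Vec<Chunk> {
    let mut best: HashMap<String, Chunk> = HashMap::new();
    for c in chunks {
        match best.get(&c.uri) {
            Some(prev) if prev.score >= c.score => {} // keep existing winner
            _ => {
                best.insert(c.uri.clone(), c);
            }
        }
    }
    // Re-sort the survivors by score, best first.
    let mut out: Vec<Chunk> = best.into_values().collect();
    out.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap());
    out
}

fn main() {
    let chunks = vec![
        Chunk { uri: "vault/a.md".into(), score: 0.4 },
        Chunk { uri: "vault/a.md".into(), score: 0.9 },
        Chunk { uri: "vault/b.md".into(), score: 0.7 },
    ];
    let out = dedup_by_uri(chunks);
    assert_eq!(out.len(), 2);
    assert_eq!(out[0].uri, "vault/a.md"); // the 0.9 chunk survived
}
```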

Hybrid Search (optional)

Enable with cargo build --features hybrid. Adds HNSW vector similarity alongside BM25 via memvid-core’s vec feature.

  • Ingest: LocalTextEmbedder (BGE-small-en-v1.5, 384 dims, local ONNX) generates embeddings at document ingest time via put_with_embedding()
  • Search: Memvid::ask(AskMode::Hybrid) runs BM25 + vector in parallel and fuses results via Reciprocal Rank Fusion (RRF, k=60)
  • Query-time: VecEmbedder adapter wraps the embedder for ask()
  • Feature-gated: All hybrid code behind #[cfg(feature = "hybrid")]. Default build stays BM25-only with no ONNX dependency.

The ONNX model (~34MB) must be present at ~/.cache/memvid/text-models/. In the container, it’s baked into the image at /opt/memvid/text-models/ and symlinked by the entrypoint.
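Reciprocal Rank Fusion itself is simple to state. A minimal sketch with k=60 follows; the real fusion happens inside memvid-core, so this is only an illustration of the formula.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over two ranked lists of doc ids:
/// score(d) = sum over lists of 1 / (k + rank), rank 1-based, k = 60.
/// Minimal sketch of the formula, not memvid-core's code.
fn rrf_fuse(bm25: &[u32], vector: &[u32], k: f64) -> Vec<u32> {
    let mut scores: HashMap<u32, f64> = HashMap::new();
    for list in [bm25, vector] {
        for (rank, &doc) in list.iter().enumerate() {
            *scores.entry(doc).or_default() += 1.0 / (k + (rank as f64 + 1.0));
        }
    }
    let mut fused: Vec<(u32, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused.into_iter().map(|(doc, _)| doc).collect()
}

fn main() {
    // Doc 2 is ranked well by both backends, so it fuses to the top
    // even though neither list puts it first overall.
    let fused = rrf_fuse(&[1, 2, 4], &[2, 3, 1], 60.0);
    assert_eq!(fused[0], 2);
}
```

This is why keyword precision is augmented rather than lost: a document only ranked by one backend still scores, but agreement between backends dominates.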

Development Tooling & Methodology

This project doubles as a reference implementation for AI-assisted development — it’s both the tool and a practical example of using it. Every feature was built through structured AI workflows, and the documentation captures how.

Tools in Use

Claude Code

Claude Code is Anthropic’s CLI for Claude. It’s the primary development interface — all code, config, documentation, and vault content were authored through Claude Code sessions.

Key patterns used in this project:

  • Parallel agent dispatch — Spawning research agents to investigate APIs and codebase patterns simultaneously before writing code
  • MCP server integration — kb-mcp registered in .mcp.json gives Claude Code direct access to search and query the vault during development
  • Plan mode — Designing architecture before writing code, then executing with task tracking
  • Memory system — Persistent context across sessions for project decisions and user preferences

Compound Engineering

Compound Engineering is a Claude Code plugin that structures development into a repeating cycle where each unit of work makes subsequent work easier.

Philosophy: 80% planning and review, 20% execution. Prevention over remediation.

Workflow cycle:

/ce:brainstorm → /ce:plan → /ce:work → /ce:review → /ce:compound
Command          Purpose
/ce:brainstorm   Explore requirements, approaches, and feasibility before committing
/ce:plan         Transform concepts into detailed, executable implementation strategies
/ce:work         Execute plans with feature branches, worktrees, and task tracking
/ce:review       Multi-agent code evaluation — security, architecture, performance, simplicity
/ce:compound     Capture learnings into docs/solutions/ so future work is faster

How we use it in this project:

  • /ce:brainstorm for exploring the memvid-core integration approach, container agent design, and hybrid search strategy
  • /ce:plan for translating brainstorms into phased implementation plans with acceptance criteria and open questions
  • /ce:work for executing plans with incremental commits and task tracking
  • Parallel research agents for investigating memvid-core’s API, ZeroClaw config format, and ONNX model delivery

The compounding part: Brainstorms and plans are preserved in docs/brainstorms/ and docs/plans/. Each document captures decisions, alternatives considered, and lessons learned — so future sessions start with context instead of rediscovering it.

kb-mcp (Dogfooding)

kb-mcp is both the product and a development tool. During Claude Code sessions, it’s registered as an MCP server in .mcp.json, giving the AI direct access to search the vault.

This means Claude Code can:

  • Search existing vault content before writing new docs (avoid duplicates)
  • Read document metadata via kb_context for token-efficient scanning
  • Verify that new content fits the vault structure
  • Check section coverage and identify gaps

This is the dogfooding principle — the same tool agents use in production is the tool we use during development. If it doesn’t work well for us, it won’t work well for anyone.

Development Workflow

The Brainstorm → Plan → Work Loop

Every significant feature follows this cycle:

1. Brainstorm (/ce:brainstorm)

Explore what to build through collaborative dialogue. Output is a brainstorm document in docs/brainstorms/ capturing decisions, rejected alternatives, and scope boundaries.

2. Plan (/ce:plan)

Transform the brainstorm into an implementation plan with:

  • Phased implementation steps
  • Acceptance criteria (checkboxes)
  • Open questions with defaults
  • API references from research

Output is a plan document in docs/plans/.

3. Work (/ce:work)

Execute the plan on a feature branch:

  • Create tasks from plan phases
  • Implement with incremental commits
  • Test continuously
  • Check off acceptance criteria as completed

4. Review (/ce:review)

Multi-agent code review examining security, architecture, performance, and simplicity. Used for complex or risky changes.

5. Compound (/ce:compound)

Capture what was learned into docs/solutions/ so the next time a similar problem arises, the solution is already documented.

Feature Branch Pattern

# Start from main
git checkout -b feat/feature-name

# Work with incremental commits
git commit -m "feat(scope): description"

# When done, squash merge back to main
git checkout main
git merge --squash feat/feature-name
git commit -m "feat: full description with Co-Authored-By"

# Clean up
git branch -D feat/feature-name
git push origin --delete feat/feature-name

Session Patterns

Starting a session:

# Claude Code has kb-mcp available via .mcp.json
# Search the vault to understand current state
kb-mcp list-sections
kb-mcp search --query "whatever you're working on"

Adding vault content:

  1. Search existing content for gaps
  2. Draft markdown with proper frontmatter (tags, created, updated, sources)
  3. Write via kb_write or directly to the filesystem
  4. Verify with kb-mcp search to confirm indexing

Research agent workflow:

# Build the researcher container
just agent-build

# Research a topic autonomously
just agent-research-topic "topic of interest"

# Review drafts on the host
ls vault/drafts/

# Promote approved drafts to vault sections
mv vault/drafts/good-entry.md vault/concepts/

Project History

This project was built in a single extended Claude Code session:

  1. v2 Standalone Crate — Brainstormed generalizing the in-repo kb-mcp into a standalone project. Scaffolded in sandbox/, ported all 6 tools, added RON config, pushed to GitHub.

  2. Persistent Storage — Replaced in-memory Tantivy with memvid-core .mv2 persistent files. Added incremental reindex via blake3 content hashing.

  3. Containerized Researcher Agent — Built a ZeroClaw container with kb-mcp + DuckDuckGo web search. Agent writes research drafts to vault/drafts/ for human review.

  4. Hybrid Search — Added opt-in BM25 + vector search via memvid-core vec feature. Local ONNX embeddings with RRF fusion.

Each phase followed the brainstorm → plan → work loop. All brainstorms and plans are in docs/brainstorms/ and docs/plans/.

Why This Matters

This project demonstrates that a single developer with Claude Code and structured workflows can build and maintain a complex system — Rust MCP server, persistent search engine, containerized agent, hybrid vector search, Obsidian vault, mdBook documentation — that would traditionally require a team and weeks of work.

The key enablers:

  1. Structured planning — Compound Engineering’s brainstorm/plan/work cycle prevents the “just start coding” trap
  2. Parallel research — Multiple agents investigate APIs, crate docs, and codebase patterns simultaneously
  3. MCP integration — The knowledge base is queryable during development, not just at runtime
  4. Dogfooding — Using the same tools in development that agents use in production catches design issues early
  5. Knowledge compounding — Every brainstorm, plan, and solution is preserved so future sessions start with context

Adding Tools

Steps

  1. Create crates/kb-mcp-server/src/tools/my_tool.rs
  2. Add pub(crate) mod my_tool; to crates/kb-mcp-server/src/tools/mod.rs
  3. Add + my_tool::router() to the combined_router() function
  4. Add a CLI subcommand in crates/kb-cli/src/main.rs
  5. If shared logic is needed, add it to crates/kb-core/src/
  6. Update server instructions in crates/kb-mcp-server/src/server.rs

Tool Template

use rmcp::handler::server::wrapper::Parameters;
use rmcp::model::CallToolResult;
use schemars::JsonSchema;
use serde::Deserialize;

use crate::server::KbMcpServer;

#[derive(Debug, Deserialize, JsonSchema)]
pub struct MyToolParams {
    /// Description shown in tool schema
    pub query: String,
}

pub(crate) fn router() -> rmcp::handler::server::router::tool::ToolRouter<KbMcpServer> {
    KbMcpServer::my_tool_router()
}

#[rmcp::tool_router(router = my_tool_router)]
impl KbMcpServer {
    #[rmcp::tool(
        name = "my_tool",
        description = "What this tool does."
    )]
    pub(crate) async fn my_tool(
        &self,
        Parameters(params): Parameters<MyToolParams>,
    ) -> Result<CallToolResult, rmcp::ErrorData> {
        // Call kb_core functions for shared logic
        Ok(CallToolResult::success(vec![
            rmcp::model::Content::text("result"),
        ]))
    }
}

Key Points

  • Params struct must derive Deserialize + JsonSchema
  • Use #[schemars(description = "...")] for field descriptions in the tool schema
  • Use #[serde(default)] for optional fields
  • Return CallToolResult::error(...) with actionable messages for user-facing errors
  • Every tool must have a corresponding CLI subcommand for testing parity
  • Shared logic (filtering, formatting, utilities) belongs in kb-core, not in tool files

Researcher Agent

A containerized agent that uses kb-mcp to discover new content about AI agent memory and curate it into the vault.

Prerequisites

  • Docker + Docker Compose
  • Host Ollama (dev) or Anthropic/OpenAI API key (prod)
  • Web search uses DuckDuckGo — no API key needed

Setup

# 1. Copy provider config
cp agents/researcher/config/config.toml.ollama.example agents/researcher/config/config.toml

# 2. (Optional) Create .env for cloud LLM provider
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env

# 3. Build the container
just agent-build

Usage

# Interactive research session
just agent-research

# Research a specific topic
just agent-research-topic "HNSW vector search performance"

# Check what the vault contains
just agent-vault-status

How It Works

  1. Agent receives a research topic (manual prompt)
  2. Searches existing vault via kb-mcp (avoids duplicates)
  3. Searches the web via DuckDuckGo (Earl template)
  4. Fetches and reads promising sources
  5. Synthesizes a vault entry with proper frontmatter + source citations
  6. Writes to the vault via kb_write
  7. You review the new entry on the host and git commit

Container Security

  • Read-only root filesystem
  • Non-root user (uid 1001)
  • Named volume for runtime workspace
  • IDENTITY.md and SOUL.md mounted read-only
  • tmpfs for temp files (64MB cap)
  • Localhost-only ports
  • API keys via environment variables (not baked into image)

Agent Identity

  • IDENTITY.md — defines the research domain, available tools, and boundaries
  • SOUL.md — quality principles: primary sources first, cite everything, flag uncertainty

Resources