Architecture

Overview

kb-mcp is a Cargo workspace with three crates:

kb-core (library) — types, config, indexing, search, formatting. No transport deps.
kb-cli (binary kb) — Clap subcommands, JSON to stdout
kb-mcp-server (binary kb-mcp) — MCP stdio server via rmcp

Both binaries share all logic through kb-core. The only difference is the transport layer.

flowchart TD
    CLI[kb-cli] --> CORE[kb-core]
    SRV[kb-mcp-server] --> CORE
    CORE --> FMT[format.rs — shared output]
    CORE --> IDX[index.rs — scanning]
    CORE --> SE[search.rs — BM25]
    CLI --> OUT[stdout — JSON]
    SRV --> MCP[rmcp stdio — JSON-RPC]

Startup Sequence

flowchart TD
    A[Load collections.ron] --> A1[Find config file]
    A1 --> A2[Parse RON]
    A2 --> A3[Resolve relative paths]
    A3 --> B[Build document index]
    B --> B1[Walk collection dirs]
    B1 --> B2[Parse YAML frontmatter]
    B2 --> B3[Extract titles + detect sections]
    B3 --> C[Build search engine]
    C --> C1[Create Tantivy schema]
    C1 --> C2[Index all documents in RAM]
    C2 --> C3[Commit index]
    C3 --> E{Enter mode}
    E -->|CLI| F[Parse args, execute, exit]
    E -->|MCP| G[serve stdio, wait for disconnect]

    style A fill:#4a9eff,color:#fff
    style B fill:#4a9eff,color:#fff
    style C fill:#4a9eff,color:#fff
    style E fill:#f59e0b,color:#fff

Module Map

crates/
├── kb-core/src/
│   ├── lib.rs       AppContext, init(), sync_stores()
│   ├── config.rs    RON config loading, path resolution, discovery chain
│   ├── types.rs     Core data types (Document, Section)
│   ├── index.rs     Filesystem scanning, frontmatter parsing, section building
│   ├── store.rs     .mv2 lifecycle, content hashing, incremental sync
│   ├── search.rs    BM25 search engine (memvid-core)
│   ├── format.rs    JSON output structs and serialization helpers
│   ├── query.rs     Frontmatter filtering logic (shared between CLI and MCP)
│   └── write.rs     slugify_title, find_available_path (shared utilities)
├── kb-cli/src/
│   └── main.rs      Clap parser and CLI command dispatch → kb_core::* calls
└── kb-mcp-server/src/
    ├── main.rs      MCP stdio server startup
    ├── server.rs    KbMcpServer struct, auto-reindex, ServerHandler impl
    └── tools/
        ├── mod.rs       Router composition (sections + documents + search + ...)
        ├── sections.rs  list_sections — collection/section inventory
        ├── documents.rs get_document — full content retrieval (fresh from disk)
        ├── search.rs    search — BM25 full-text with auto-reindex
        ├── context.rs   kb_context — frontmatter + summary (token-efficient)
        ├── write.rs     kb_write — create files in writable collections
        ├── reindex.rs   reindex — rebuild index from disk
        ├── digest.rs    kb_digest — vault summary with topics, recency, gap hints
        ├── query.rs     kb_query — frontmatter filtering (tag, status, date, sources)
        ├── export.rs    kb_export — concatenate vault into single markdown document
        └── health.rs    kb_health — vault health diagnostics (quality, orphans, broken links)

Data Model

erDiagram
    Config ||--o{ Collection : contains
    Collection ||--o{ SectionDef : defines
    Collection ||--o{ Document : indexes
    Document }o--o| SectionDef : "belongs to"

    Config {
        string cache_dir
    }
    Collection {
        string name
        string path
        string description
        bool writable
    }
    SectionDef {
        string prefix
        string description
    }
    Document {
        string path
        string title
        string body
        string section
        string collection
        list tags
        map frontmatter
    }

collections.ron
  └── Collection[]
        ├── name: String          unique identifier
        ├── path: String          directory (relative to config)
        ├── description: String   shown in list_sections
        ├── writable: bool        enables kb_write
        └── sections: SectionDef[]
              ├── prefix: String      matches first subdirectory
              └── description: String shown in list_sections

Document (in-memory, from scanning)
  ├── path: String            relative to collection root
  ├── title: String           from H1 heading or filename
  ├── tags: Vec<String>       from YAML frontmatter
  ├── body: String            content without frontmatter
  ├── section: String         first directory component
  ├── collection: String      owning collection name
  └── frontmatter: HashMap    all YAML fields (for kb_context)

Section (derived)
  ├── name: String            directory prefix
  ├── description: String     from RON config (or empty)
  ├── doc_count: usize        documents in this section
  └── collection: String      owning collection name
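
As a rough illustration of how the derived Section records could be computed from scanned documents, here is a minimal sketch. The field names follow the listing above, but the function names, the trimmed-down Document struct, the tuple shape of `defs`, and the choice of an empty section name for root-level files are all assumptions made for the example, not the crate's actual code.

```rust
use std::collections::BTreeMap;

/// Trimmed-down stand-in for the scanned Document (only the fields used here).
pub struct Document {
    pub path: String,       // relative to collection root
    pub collection: String, // owning collection name
}

#[derive(Debug, PartialEq)]
pub struct Section {
    pub name: String,
    pub description: String,
    pub doc_count: usize,
    pub collection: String,
}

/// First directory component of a relative path.
/// Assumption: files at the collection root get an empty section name.
pub fn section_of(path: &str) -> &str {
    match path.find('/') {
        Some(i) => &path[..i],
        None => "",
    }
}

/// Counts documents per (collection, section) and attaches the description
/// from the config when one is defined. `defs` holds hypothetical
/// (collection, prefix, description) triples.
pub fn derive_sections(docs: &[Document], defs: &[(&str, &str, &str)]) -> Vec<Section> {
    let mut counts: BTreeMap<(String, String), usize> = BTreeMap::new();
    for doc in docs {
        let key = (doc.collection.clone(), section_of(&doc.path).to_string());
        *counts.entry(key).or_insert(0) += 1;
    }
    counts
        .into_iter()
        .map(|((collection, name), doc_count)| {
            let description = defs
                .iter()
                .find(|d| d.0 == collection && d.1 == name)
                .map(|d| d.2.to_string())
                .unwrap_or_default();
            Section { name, description, doc_count, collection }
        })
        .collect()
}
```

Sections without a matching SectionDef keep an empty description, matching the "(or empty)" note in the listing.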

Config Resolution

The config discovery chain runs in order, first match wins:

1. --config <path>                    explicit CLI flag
2. $KB_MCP_CONFIG                     environment variable
3. ./collections.ron                  current working directory
4. ~/.config/kb-mcp/collections.ron   user default

Collection paths in the RON file resolve relative to the config file’s parent directory. This means the same binary works from any working directory, as long as each collection path is correct relative to the config file.
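
The discovery chain and the path resolution rule can be sketched as follows. This is an illustrative version under stated assumptions: the function names and signatures are hypothetical, the existence check is injected as a closure so the logic can be exercised without touching the filesystem, and a candidate that does not exist is simply skipped (the real code may instead report an error for a missing explicit --config path).

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns the first candidate that exists,
/// checked in the documented order.
pub fn discover_config(
    cli_flag: Option<&Path>,  // 1. --config <path>
    env_value: Option<&Path>, // 2. $KB_MCP_CONFIG
    cwd: &Path,               // 3. ./collections.ron
    home: Option<&Path>,      // 4. ~/.config/kb-mcp/collections.ron
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    let candidates = vec![
        cli_flag.map(Path::to_path_buf),
        env_value.map(Path::to_path_buf),
        Some(cwd.join("collections.ron")),
        home.map(|h| h.join(".config/kb-mcp/collections.ron")),
    ];
    // Skip unset candidates, then take the first one that exists.
    candidates.into_iter().flatten().find(|p| exists(p))
}

/// Collection paths resolve against the config file's parent directory,
/// not the current working directory.
pub fn resolve_collection_path(config_file: &Path, collection_path: &str) -> PathBuf {
    let base = config_file.parent().unwrap_or_else(|| Path::new("."));
    base.join(collection_path)
}
```

In real use the `exists` argument would typically be something like `|p| p.is_file()`.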

Search Architecture

flowchart TD
    Q[Query string] --> QP[QueryParser]
    QP -->|title + body + tags| BM[Tantivy BM25 — in-RAM index]
    BM --> TD[TopDocs — limit × 5 if filtering]
    TD --> PF[Post-filter by collection / section]
    PF --> SG[SnippetGenerator — highlighted excerpts]
    SG --> SR["SearchResult[] { doc_index, score, excerpt }"]

    style Q fill:#4a9eff,color:#fff
    style SR fill:#10b981,color:#fff

The search index is built in RAM on startup. It contains all documents from all collections. Filtering by collection or section happens post-query because Tantivy’s STRING fields support exact match but not efficient pre-filtering in a single query. The 5× over-fetch compensates for post-filter reduction.
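
The over-fetch and post-filter steps can be illustrated with a small sketch. The `Hit` struct and both function names are hypothetical stand-ins. The sketch assumes the hits arrive already sorted by score, and it does not call the search library at all.

```rust
/// Hypothetical search hit carrying the fields needed for post-filtering.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub collection: String,
    pub section: String,
    pub score: f32,
}

/// How many results to request from the engine: 5x the limit when a
/// collection or section filter will discard some of them afterwards.
pub fn fetch_limit(limit: usize, filtering: bool) -> usize {
    if filtering {
        limit.saturating_mul(5)
    } else {
        limit
    }
}

/// Keeps only hits matching the optional filters, preserving score order,
/// and truncates to the caller's limit.
pub fn post_filter(
    hits: Vec<Hit>,
    collection: Option<&str>,
    section: Option<&str>,
    limit: usize,
) -> Vec<Hit> {
    hits.into_iter()
        .filter(|h| collection.map_or(true, |c| h.collection == c))
        .filter(|h| section.map_or(true, |s| h.section == s))
        .take(limit)
        .collect()
}
```

A fixed multiplier is a heuristic: if more than four fifths of the top hits fall outside the filter, the caller still receives fewer than `limit` results.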

Tool Pattern

Each tool follows an identical structure:

// 1. Params struct: derives Deserialize + JsonSchema
#[derive(Deserialize, JsonSchema)]
pub struct MyParams { ... }

// 2. Router function: returns ToolRouter<KbMcpServer>
pub(crate) fn router() -> ToolRouter<KbMcpServer> {
    KbMcpServer::my_router()
}

// 3. Tool implementation: #[rmcp::tool_router] on the impl block,
//    #[rmcp::tool] on each tool method
#[rmcp::tool_router(router = my_router)]
impl KbMcpServer {
    #[rmcp::tool(name = "my_tool", description = "...")]
    pub(crate) async fn my_tool(
        &self,
        Parameters(params): Parameters<MyParams>,
    ) -> Result<CallToolResult, rmcp::ErrorData> { ... }
}

Routers are composed in tools/mod.rs using the + operator:

sections::router() + documents::router() + search::router() + ...

Adding a tool takes three changes: one new file in kb-mcp-server/src/tools/, one line in mod.rs, and one CLI subcommand in kb-cli/src/main.rs.

State Management

graph LR
    subgraph KbMcpServer
        IDX["index: Arc&lt;RwLock&lt;Index&gt;&gt;"]
        SE["search_engine: Arc&lt;SearchEngine&gt;"]
        COL["collections: Arc&lt;Vec&lt;...&gt;&gt;"]
    end

    R[search / get_document / kb_context] -->|read| IDX
    W[reindex / kb_write] -->|write| IDX
    W -->|rebuild| SE
    R -->|query| SE
    R -->|lookup path| COL

    style IDX fill:#f59e0b,color:#fff
    style SE fill:#4a9eff,color:#fff
    style COL fill:#10b981,color:#fff

The Index sits behind an RwLock: most tools take a shared read lock for metadata, while reindex and kb_write take an exclusive write lock.
SearchEngine holds per-collection Memvid handles behind an internal Mutex, because Memvid search requires &mut self even for reads.
get_document reads fresh from disk via server.rs::read_fresh(). The index is used only for path/title lookup, not for serving content, so edits are visible immediately without a reindex.
kb_write creates the file, syncs the collection’s .mv2, and rebuilds the in-memory Index.
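
The locking arrangement described above can be sketched with standard-library primitives. Everything here is a stand-in: `FakeHandle` plays the role of a Memvid handle whose query method needs `&mut self`, and the struct layouts, method names, and the use of `std::sync` (rather than an async-aware lock) are assumptions for illustration only.

```rust
use std::sync::{Arc, Mutex, RwLock};

/// Stand-in for the metadata index.
#[derive(Default)]
pub struct Index {
    pub titles: Vec<String>,
}

/// Stand-in for a search handle whose query method needs `&mut self`.
pub struct FakeHandle {
    docs: Vec<String>,
    queries_run: u64,
}

impl FakeHandle {
    fn query(&mut self, term: &str) -> usize {
        self.queries_run += 1; // internal state changes even on a read
        self.docs.iter().filter(|d| d.contains(term)).count()
    }
}

/// The Mutex lets `search` take `&self` while the handle needs `&mut`.
pub struct SearchEngine {
    handle: Mutex<FakeHandle>,
}

impl SearchEngine {
    pub fn new(docs: Vec<String>) -> Self {
        SearchEngine { handle: Mutex::new(FakeHandle { docs, queries_run: 0 }) }
    }

    pub fn search(&self, term: &str) -> usize {
        self.handle.lock().expect("search mutex poisoned").query(term)
    }
}

pub struct Server {
    pub index: Arc<RwLock<Index>>,
    pub search_engine: Arc<SearchEngine>,
}

impl Server {
    /// Shared read lock: many readers may hold it at once.
    pub fn title_count(&self) -> usize {
        self.index.read().expect("index lock poisoned").titles.len()
    }

    /// Exclusive write lock: blocks readers while the index changes.
    pub fn add_title(&self, title: &str) {
        self.index.write().expect("index lock poisoned").titles.push(title.to_string());
    }
}
```

The design point is that the Mutex serializes searches, while metadata reads on the Index can still proceed concurrently.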

Fresh-Read Design

get_document does not return content from the search index. It:

1. Looks up the document by path or title in the index
2. Finds the owning collection’s resolved path
3. Reads the file fresh from disk
4. Strips frontmatter and returns the body

This ensures content is never stale. The tradeoff is one filesystem read per get_document call, which is negligible for the expected workload.
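
The final step, stripping frontmatter, might look like the sketch below. It is a simplified assumption of how that step could work, not the crate's code: the function name is hypothetical, and it handles only LF line endings and a `---` delimiter on the very first line.

```rust
/// Returns the body of a Markdown file with a leading YAML frontmatter
/// block removed. Content without a complete frontmatter block is
/// returned unchanged.
pub fn strip_frontmatter(content: &str) -> &str {
    // Frontmatter must open on the very first line.
    let rest = match content.strip_prefix("---\n") {
        Some(rest) => rest,
        None => return content,
    };
    // Empty frontmatter: the closing delimiter follows immediately.
    if let Some(body) = rest.strip_prefix("---\n") {
        return body;
    }
    // Otherwise look for the closing delimiter on its own line.
    let closing = "\n---\n";
    match rest.find(closing) {
        Some(end) => &rest[end + closing.len()..],
        None => content, // unterminated block: treat it all as body
    }
}
```

Returning a borrowed slice avoids copying the body. A caller would read the file with `std::fs::read_to_string` and pass the result in.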

Write Path

kb_write creates files in writable collections:

flowchart TD
    A[kb_write called] --> B{Collection exists?}
    B -->|no| ERR1[Error: collection not found]
    B -->|yes| C{Writable?}
    C -->|no| ERR2[Error: collection is read-only]
    C -->|yes| D[Generate filename: YYYY-MM-DD-kebab-title.md]
    D --> E{File exists?}
    E -->|yes| F[Append suffix: -2, -3, ...]
    E -->|no| G[Generate YAML frontmatter]
    F --> G
    G --> H[Write file to collection dir]
    H --> I[Rebuild search index]
    I --> J[Return path + metadata as JSON]

    style A fill:#4a9eff,color:#fff
    style J fill:#10b981,color:#fff
    style ERR1 fill:#ef4444,color:#fff
    style ERR2 fill:#ef4444,color:#fff
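
The filename steps in the flow above (date prefix, kebab-case title, numeric suffix on collision) can be sketched as below. write.rs is documented as providing slugify_title and find_available_path, but their real signatures are not shown in this document. The bodies here, the `find_available_name` helper, and the injected `exists` closure are assumptions for illustration.

```rust
/// Lowercases ASCII letters and digits and collapses every other run of
/// characters into a single hyphen (a guess at what slugify_title does).
pub fn slugify_title(title: &str) -> String {
    let mut slug = String::new();
    for ch in title.chars() {
        if ch.is_ascii_alphanumeric() {
            slug.push(ch.to_ascii_lowercase());
        } else if !slug.is_empty() && !slug.ends_with('-') {
            slug.push('-');
        }
    }
    slug.trim_end_matches('-').to_string()
}

/// Hypothetical helper: builds `YYYY-MM-DD-kebab-title.md` and appends
/// `-2`, `-3`, ... until `exists` reports the name as free.
/// `date` is expected to be preformatted as YYYY-MM-DD.
pub fn find_available_name(date: &str, title: &str, exists: impl Fn(&str) -> bool) -> String {
    let base = format!("{}-{}", date, slugify_title(title));
    let first = format!("{}.md", base);
    if !exists(&first) {
        return first;
    }
    let mut n = 2;
    loop {
        let candidate = format!("{}-{}.md", base, n);
        if !exists(&candidate) {
            return candidate;
        }
        n += 1;
    }
}
```

Checking for existence and then writing is not atomic. If two writers could race, opening the file with `OpenOptions::new().create_new(true)` would close that gap.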

Persistent Storage (memvid-core)

Search is backed by memvid-core’s .mv2 persistent storage. Each collection gets its own .mv2 file at <cache_dir>/<hash>-<name>.mv2.

Startup: opens existing .mv2 files, diffs content hashes against a sidecar .hashes file, and only re-ingests changed documents
Smart chunking: memvid-core’s structural chunker segments long documents so queries match specific sections, not entire files
Crash-safe WAL: writes go through a write-ahead log inside the .mv2
Deduplication: search results are deduplicated by URI — one result per document, highest-scoring chunk wins
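
The startup diff against stored content hashes can be pictured with the following sketch. The function names, the in-memory map standing in for the sidecar .hashes file, and the choice of hasher are all assumptions. In particular, `DefaultHasher` is used only to keep the example dependency-free: its output is not guaranteed stable across Rust releases, so a persisted sidecar would need a fixed hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Illustrative content hash (see the caveat about stability above).
pub fn content_hash(body: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    body.hash(&mut hasher);
    hasher.finish()
}

/// Returns the URIs that are new or whose content hash differs from the
/// stored one, sorted for deterministic output. `current` holds
/// (uri, body) pairs from the fresh scan. Deleted documents are not
/// handled in this sketch.
pub fn changed_docs(stored: &HashMap<String, u64>, current: &[(String, String)]) -> Vec<String> {
    let mut changed: Vec<String> = current
        .iter()
        .filter(|entry| stored.get(&entry.0) != Some(&content_hash(&entry.1)))
        .map(|entry| entry.0.clone())
        .collect();
    changed.sort();
    changed
}
```

Only the returned URIs would need re-ingesting. Unchanged documents keep their existing entries in the .mv2 file.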

The Index (Vec) continues to handle metadata operations (exact path lookup, frontmatter retrieval, section counting). Memvid is the search layer only — this two-layer design keeps the architecture simple while gaining persistent, incremental search.

Hybrid Search (optional)

Enable with cargo build --features hybrid. Adds HNSW vector similarity alongside BM25 via memvid-core’s vec feature.

Ingest: LocalTextEmbedder (BGE-small-en-v1.5, 384 dims, local ONNX) generates embeddings at document ingest time via put_with_embedding()
Search: Memvid::ask(AskMode::Hybrid) runs BM25 + vector in parallel and fuses results via Reciprocal Rank Fusion (RRF, k=60)
Query-time: VecEmbedder adapter wraps the embedder for ask()
Feature-gated: All hybrid code behind #[cfg(feature = "hybrid")]. Default build stays BM25-only with no ONNX dependency.
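
For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the standard formulation: each document scores the sum of 1 / (k + rank) over every ranking it appears in, with ranks starting at 1. It is a standalone illustration with a hypothetical function name. Whether memvid-core computes ranks and ties in exactly this way is an assumption.

```rust
use std::collections::HashMap;

/// Fuses several ranked lists of document IDs with Reciprocal Rank Fusion.
/// Returns (doc, score) pairs sorted by descending score, with ties
/// broken by document ID for deterministic output.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc) in ranking.iter().enumerate() {
            let rank = (i + 1) as f64; // 1-based rank
            *scores.entry((*doc).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| {
        b.1.partial_cmp(&a.1)
            .expect("scores are finite")
            .then_with(|| a.0.cmp(&b.0))
    });
    fused
}
```

With k = 60, a document ranked near the top of both lists beats one that is first in only a single list, which is why fusion favors agreement between BM25 and vector results.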

The ONNX model (~34MB) must be present at ~/.cache/memvid/text-models/. In the container, it’s baked into the image at /opt/memvid/text-models/ and symlinked by the entrypoint.
