Opencode Personal Knowledge

Opencode Personal Knowledge is a dual-interface knowledge management system designed for AI agents and power users. It combines a robust CLI for manual management with a Model Context Protocol (MCP) server that exposes your knowledge base to AI assistants (like Claude or Opencode Agents), enabling them to semantically search and retrieve information from your personal library.

NOTE: This tool uses hybrid search, combining standard SQLite text matching with LanceDB vector embeddings (via FastEmbed) for semantic understanding, ensuring high recall for both exact keywords and conceptual queries.

Installation

Prerequisites

  • Bun (v1.0.0 or higher) - see the Bun installation guide
  • Linux/macOS (Windows support is experimental via WSL)

Option 1: Install from Source

Clone the repository and install dependencies:

git clone https://github.com/NocturnLabs/opencode-personal-knowledge.git
cd opencode-personal-knowledge
bun install

Build the project:

bun run build

Link the binary globally:

bun link

Option 2: Running via bunx

You can run commands directly without installation:

bunx opencode-personal-knowledge --help

Getting Started

1. Initialize & Add Entry

Start by adding your first knowledge entry. The database is automatically initialized on the first write.

# Add a simple text entry
pk add "TypeScript interface vs type" "Interfaces are better for objects, types for unions." --tags typescript,tips

# Add with source reference
pk add "React useEffect" "Runs after render. Cleanup function runs before next effect." --source "https://react.dev"

2. Semantic Search

Once you have entries, vector embeddings are generated automatically, so you can search by meaning, not just keywords.

# Search for concepts (even if words don't match exactly)
pk search "difference between types in TS"

3. Start MCP Server

To let AI agents access your knowledge, run the MCP server.

# Start the server (usually done by your AI client config)
pk mcp
# OR directly:
bun run src/mcp-server.ts
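
Most MCP clients are configured with a small JSON block that tells them how to launch the server. Below is a minimal sketch for clients that follow the common mcpServers convention (e.g., Claude Desktop); the server name "personal-knowledge" is arbitrary, and the exact file name and location depend on your client:

{
  "mcpServers": {
    "personal-knowledge": {
      "command": "pk",
      "args": ["mcp"]
    }
  }
}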

Core Features

Hybrid Storage Architecture

The system uses a two-layer storage approach to maximize reliability and searchability:

  1. Primary Storage (SQLite):
    • Engine: bun:sqlite
    • Location: ~/.local/share/opencode-personal-knowledge/knowledge.db
    • Purpose: Stores the canonical text, metadata, tags, and timestamps. It serves as the source of truth.
    • Querying: Used for get, list, stats, and text (keyword) search.
  2. Vector Index (LanceDB):
    • Engine: LanceDB (Embedded)
    • Location: ~/.local/share/opencode-personal-knowledge/vectors
    • Purpose: Stores high-dimensional vector embeddings of your content.
    • Model: BGESmallENV15 (FlagEmbedding), running locally on CPU.
    • Querying: Used for semantic search to find "nearest neighbors" in meaning.

Automatic Syncing

When you add, update, or delete an entry via the CLI or MCP, the service automatically coordinates between SQLite and LanceDB.

  • Writes: Data is immediately written to SQLite. Then, an embedding is generated asynchronously and written to LanceDB.
  • Failures: If vector embedding fails (e.g., model download error), the data remains safe in SQLite. You get a warning, but no data is lost.
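
For illustration, here is a minimal TypeScript sketch of that write path. All names below (Store, VectorIndex, saveEntry, embed) are hypothetical and do not reflect the project's actual internals:

// SQLite-first, vector-second write path (sketch with assumed shapes).
type NewEntry = { title: string; content: string; tags?: string[] };

interface Store {
  // wraps bun:sqlite; returns the new row id
  insertRow(entry: NewEntry): number;
}

interface VectorIndex {
  // wraps LanceDB
  add(row: { id: number; vector: number[]; content_preview: string }): Promise<void>;
}

async function saveEntry(
  db: Store,
  vectors: VectorIndex,
  embed: (text: string) => Promise<number[]>, // local FastEmbed wrapper
  entry: NewEntry,
): Promise<number> {
  // 1. Write the canonical record to SQLite first; it is the source of truth.
  const id = db.insertRow(entry);
  try {
    // 2. Generate the embedding and index it in LanceDB.
    const vector = await embed(`${entry.title}\n${entry.content}`);
    await vectors.add({ id, vector, content_preview: entry.content.slice(0, 500) });
  } catch (err) {
    // 3. If embedding fails (e.g. model download error), the SQLite row survives;
    //    `pk vectors convert` can backfill the index later.
    console.warn(`Vector indexing failed for entry #${id}:`, err);
  }
  return id;
}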

Specification & Data Formats

Knowledge Entry Schema

Each entry in the database follows this strict schema:

  • id (Integer, required): Unique auto-incrementing identifier.
  • title (String, required): Short summary or headline.
  • content (String, required): Full markdown or plain text body.
  • source (String, optional): URL, filepath, or citation string.
  • tags (Array, optional): List of categorization tags.
  • created_at (ISO Date, required): Creation timestamp.
  • updated_at (ISO Date, required): Last modification timestamp.
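
In TypeScript terms, an entry looks roughly like this (a sketch based on the fields above, not the project's actual type declaration):

type KnowledgeEntry = {
  id: number;          // unique auto-incrementing identifier
  title: string;       // short summary or headline
  content: string;     // full markdown or plain text body
  source?: string;     // URL, filepath, or citation string
  tags?: string[];     // list of categorization tags
  created_at: string;  // ISO 8601 creation timestamp
  updated_at: string;  // ISO 8601 last-modification timestamp
};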

Vector Record Schema

The vector database stores a subset of data optimized for search:

  • vector: 384-dimensional float array (embedding).
  • content_preview: First 500 characters of content.
  • _distance: Internal metric for similarity ranking.
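
Sketched the same way (the stored row presumably also carries the entry id so results can be joined back to SQLite; that field is an assumption here):

type VectorRecord = {
  id: number;               // assumed link back to the SQLite entry
  vector: number[];         // 384-dimensional embedding
  content_preview: string;  // first 500 characters of content
  _distance?: number;       // similarity metric, populated on search results
};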

Configuration & Security

Environment Variables

The tool is zero-config by default, adhering to XDG Base Directory standards. However, you can override defaults:

  • OPENCODE_PK_DATA_DIR (default: ~/.local/share/opencode-personal-knowledge): Directory path for storing knowledge.db and vectors/.

IMPORTANT: Changing OPENCODE_PK_DATA_DIR will result in an empty database unless you migrate existing files manually.
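
For example, to keep the knowledge base in a custom location (the path below is just an illustration):

# knowledge.db and vectors/ will be created under this directory
export OPENCODE_PK_DATA_DIR="$HOME/sync/opencode-pk"
pk stats   # now operates on the (initially empty) database at the new location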

Security Model

  • Local Execution: All data stays on your machine. No external API calls are made for embeddings (FastEmbed runs locally).
  • MCP Permissions: When used as an MCP server, it exposes read/write access to your knowledge base. Ensure you trust the AI client connecting to it.
  • Dependencies: Only protobufjs and @biomejs/biome are declared as trustedDependencies, which controls which packages are allowed to run lifecycle (install) scripts under Bun.

CLI Reference

The CLI is accessed via the pk alias (if linked) or bun run src/index.ts.

pk add

Add a new knowledge entry.

Usage:

pk add <title> <content> [options]

Options:

  • -s, --source <source>: Source URL or reference.
  • -t, --tags <tags>: Comma-separated tags (e.g., "coding,tips").

Example:

$ pk add "Vim Save" "Type :w to save" -t vim,editors
✅ Added entry #42: "Vim Save"
📊 Indexed for semantic search

pk search

Search knowledge entries. Defaults to semantic search.

Usage:

pk search <query> [options]

Options:

  • -t, --text: Force keyword-only text search (bypasses vector DB).
  • -l, --limit <number>: Max results (default: 5).

Example:

$ pk search "how to write files in terminal editor"
Found 1 similar entries:

[42] Vim Save (89% similar)
    Type :w to save...
    Tags: vim, editors
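
The flags can be combined, for example to force a keyword-only lookup with a larger result cap:

# Bypass the vector index and match keywords only, returning up to 10 results
pk search "vim" --text --limit 10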

pk get

Retrieve a specific entry by ID.

Usage:

pk get <id>

Example:

$ pk get 42
# Vim Save

ID: 42
Created: 2024-12-14T10:00:00.000Z
Updated: 2024-12-14T10:00:00.000Z
Tags: vim, editors

Type :w to save

pk update

Modify an existing entry.

Usage:

pk update <id> [options]

Options:

  • --title <text>
  • --content <text>
  • -s, --source <text>
  • -t, --tags <text>

Example:

$ pk update 42 --content "Type :w to save, :wq to save and quit"
✅ Updated entry #42
📊 Re-indexed

pk list

List all entries chronologically.

Usage:

pk list [options]

Options:

  • -l, --limit <number>: Items per page (default: 20).
  • -o, --offset <number>: Pagination offset.
  • -t, --tags <tags>: Filter by tags.
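
Example (output omitted):

# First page of entries tagged "typescript"
pk list --tags typescript --limit 10

# Next page
pk list --tags typescript --limit 10 --offset 10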

pk stats

View database health and tag distribution.

Example:

$ pk stats
📊 Knowledge Base Stats

Total Entries: 156
Vectors Indexed: 156
Oldest: 2024-01-15T08:30:00.000Z
Newest: 2024-12-14T10:00:00.000Z

Top Tags:
  typescript: 45
  linux: 30
  ideas: 12

pk vectors

Direct management of the vector database.

Subcommands:

  • convert: Scans SQLite and ensures all items are indexed in LanceDB. Useful after importing bulk data or if indexing failed.
  • stats: Show vector specific stats.
  • clear: Wipe the vector index (does not delete SQLite data).

Example:

$ pk vectors convert
Converting entries to vectors...
Progress: 156/156
✅ Converted 156 entries (0 skipped)

MCP Tools Reference

When running as an MCP server (pk mcp), the following tools are exposed to AI agents.

Knowledge Tools

  • store_knowledge: Store a new knowledge entry with optional tags.
  • search_knowledge: Semantic similarity search.
  • search_knowledge_text: Keyword-based text search.
  • get_knowledge: Retrieve an entry by ID.
  • update_knowledge: Update an existing entry.
  • delete_knowledge: Delete an entry.
  • list_knowledge: List entries with filters.
  • get_knowledge_stats: Database statistics.
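
At the protocol level, a client invokes these via a standard MCP tools/call request. Here is a sketch of a search_knowledge call; the argument names (query, limit) are illustrative, so check the input schema the server advertises:

{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "search_knowledge",
    "arguments": { "query": "JWT token expiry", "limit": 5 }
  }
}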

Session Memory Tools

Log and search across entire OpenCode conversations.

  • start_logging_session: Begin logging a session.
  • log_message: Log a user/agent message to the session.
  • search_session: Semantic search within a session.
  • search_all_sessions: Search across ALL logged sessions.
  • list_sessions: List all sessions.
  • get_session: Get session details and messages.
  • end_session: End a session with an optional summary.

TIP: Session memory enables semantic search across past conversations. Start a session when debugging a complex issue so you can review the conversation later.

MCP Usage Example

User: "Start logging this session, call it 'auth debugging'"

Agent: Calls start_logging_session and logs exchanges:

✅ Started session #1: "auth debugging"

Later...

User: "Search this session for JWT"

Agent: Calls search_session:

Found 2 matches in session #1:

### 1. [user] (92% match)
The JWT token expires too fast...

### 2. [agent] (88% match)
The TTL is set to 60 instead of 3600...