OpenBrain MCP Server

High-performance vector memory for AI agents

OpenBrain is a Model Context Protocol (MCP) server that provides AI agents with a persistent, semantic memory system. It uses local ONNX-based embeddings and PostgreSQL with pgvector for efficient similarity search.

Features

  • 🧠 Semantic Memory: Store and retrieve memories using vector similarity search
  • 🏠 Local Embeddings: No external API calls - uses ONNX runtime with all-MiniLM-L6-v2
  • 🐘 PostgreSQL + pgvector: Production-grade vector storage with HNSW indexing
  • 🔌 MCP Protocol: Streamable HTTP plus legacy HTTP+SSE compatibility
  • 🔐 Multi-Agent Support: Isolated memory namespaces per agent
  • ♻️ Deduplicated Ingest: Near-duplicate facts are merged instead of stored repeatedly
  • ⚡ High Performance: Rust implementation with async I/O

MCP Tools

| Tool | Description |
|------|-------------|
| store | Store a memory with automatic embedding generation, optional TTL, and automatic deduplication |
| batch_store | Store 1-50 memories atomically in a single call with the same deduplication rules |
| query | Search memories by semantic similarity |
| purge | Delete memories by agent ID or time range |

Quick Start

Prerequisites

  • Rust 1.75+
  • PostgreSQL 14+ with pgvector extension
  • ONNX model files (all-MiniLM-L6-v2)

Database Setup

CREATE ROLE openbrain_svc LOGIN PASSWORD 'change-me';
CREATE DATABASE openbrain OWNER openbrain_svc;
\c openbrain
CREATE EXTENSION IF NOT EXISTS vector;

Use the same PostgreSQL role for the app and for migrations. Do not create the memories table manually as postgres (or any other owner) and then run OpenBrain as openbrain_svc: later ALTER TABLE migrations will fail with the error "must be owner of table memories".
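If the table already ended up under the wrong owner, ownership can be checked and repaired before re-running migrations. This sketch assumes the role name from the setup above and must be run by a superuser or the current table owner:

```sql
-- Check who currently owns the memories table
SELECT tableowner FROM pg_tables WHERE tablename = 'memories';

-- Transfer ownership to the service role OpenBrain runs as
ALTER TABLE memories OWNER TO openbrain_svc;
```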

Configuration

cp .env.example .env
# Edit .env with your database credentials

Build & Run

cargo build --release
./target/release/openbrain-mcp migrate
./target/release/openbrain-mcp

Database Migrations

This project uses refinery with embedded SQL migrations in migrations/.

Run pending migrations explicitly before starting or restarting the service:

./target/release/openbrain-mcp migrate

If you use the deploy script or CI workflow in .gitea/deploy.sh and .gitea/workflows/ci-cd.yaml, they already run this for you.

E2E Test Modes

The end-to-end test suite supports two modes:

  • Local mode: default. Assumes the test process can manage schema setup against a local PostgreSQL instance and, for one auth-only test, spawn a local openbrain-mcp child process.
  • Remote mode: set OPENBRAIN_E2E_REMOTE=true and point OPENBRAIN_E2E_BASE_URL at a deployed server such as http://76.13.116.52:3100 or https://ob.ingwaz.work. In this mode the suite does not try to create schema locally and skips the local process auth smoke test.

Recommended env for VPS-backed runs:

OPENBRAIN_E2E_REMOTE=true
OPENBRAIN_E2E_BASE_URL=https://ob.ingwaz.work
OPENBRAIN__AUTH__ENABLED=true

TTL / Expiry

Transient facts can be stored with an optional ttl string: pass it directly to store, or, for batch_store, set it on the batch as a whole or on individual entries.

Supported units:

  • s – seconds
  • m – minutes
  • h – hours
  • d – days
  • w – weeks

Examples:

  • 30s
  • 15m
  • 1h
  • 7d

Expired memories are filtered out of query results immediately, even before the background cleanup loop physically deletes them. The cleanup interval is configured with OPENBRAIN__TTL__CLEANUP_INTERVAL_SECONDS and defaults to 300 seconds.
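The ttl grammar is small; as an illustration (not the server's actual Rust parser), the suffix-to-duration mapping can be sketched as:

```python
import re
from datetime import timedelta

# Seconds per unit for the ttl suffixes documented above.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

def parse_ttl(ttl: str) -> timedelta:
    """Parse a ttl string such as '30s', '15m', '1h', '7d', or '2w'."""
    match = re.fullmatch(r"(\d+)([smhdw])", ttl)
    if match is None:
        raise ValueError(f"invalid ttl: {ttl!r}")
    value, unit = match.groups()
    return timedelta(seconds=int(value) * _UNITS[unit])

# At insert time the server would compute expires_at = now() + the parsed duration.
```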

CI and API Keys

The CI workflow runs the e2e suite in remote mode after deploys to main, so e2e coverage validates the VPS deployment rather than the local runner host. It generates a random per-run e2e key, temporarily appends it to the deployed OPENBRAIN__AUTH__API_KEYS, runs the suite, then removes the key and restarts the service.

For live deployments, keep OPENBRAIN__AUTH__API_KEYS for persistent non-test access only. The server accepts a comma-separated key list, so a practical split is:

  • prod_live_key for normal agent traffic
  • smoke_test_key for ad hoc diagnostics

In Gitea Actions, that means:

  • repo secret OPENBRAIN__AUTH__API_KEYS=prod_live_key,smoke_test_key

If you want prod e2e coverage without leaving a standing CI key on the server, the workflow-generated ephemeral key handles that automatically.

Deduplication on Ingest

OpenBrain checks every store and batch_store write for an existing memory in the same agent_id namespace whose vector similarity meets the configured dedup threshold.

Default behavior:

  • deduplication is always on
  • only same-agent memories are considered
  • expired memories are ignored
  • if a duplicate is found, the existing memory is refreshed instead of inserting a new row
  • metadata is merged with new keys overriding old values
  • created_at is updated to now()
  • expires_at is preserved unless the new write supplies a fresh TTL

Configure the threshold with either:

  • OPENBRAIN__DEDUP__THRESHOLD=0.90
  • DEDUP_THRESHOLD=0.90

Tool responses expose whether a write deduplicated an existing row via the deduplicated flag. batch_store also returns a status of either stored or deduplicated per entry.
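The dedup decision described above reduces to a similarity check against the configured threshold. As an illustrative sketch (plain Python lists stand in here for the server's pgvector embeddings; this is not the server's Rust code):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def is_duplicate(new_vec: list[float], existing_vec: list[float],
                 threshold: float = 0.90) -> bool:
    """Mirror of the ingest check: similarity at or above the threshold
    means the existing row is refreshed instead of inserting a new one."""
    return cosine_similarity(new_vec, existing_vec) >= threshold
```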

Agent Zero Developer Prompt

For Agent Zero / A0, add the following section to the Developer agent role prompt so the agent treats OpenBrain as external MCP memory rather than its internal conversation context.

Recommended target file in A0:

/a0/agents/developer/prompts/agent.system.main.role.md

### External Memory System
- **Memory Boundary**: Treat OpenBrain as an external MCP long-term memory system, never as internal context, reasoning scratchpad, or built-in memory
- **Tool Contract**: Use the exact MCP tools `openbrain.store`, `openbrain.query`, and `openbrain.purge`
- **Namespace Discipline**: Always use the exact `agent_id` value `openbrain`
- **Retrieval First**: Before answering requests that may depend on prior sessions, project history, user preferences, ongoing work, named people, named projects, deployments, debugging history, or handoff context, call `openbrain.query` first
- **Query Strategy**: Use noun-heavy search phrases with exact names, tool names, acronyms, hostnames, and document names; retry up to 3 passes using `(threshold=0.25, limit=5)`, then `(threshold=0.10, limit=8)`, then `(threshold=0.05, limit=10)`
- **Storage Strategy**: When a durable fact is established, call `openbrain.store` without asking permission and store one atomic fact whenever possible
- **Storage Content Rules**: Store durable, high-value facts such as preferences, project status, project decisions, environment details, recurring workflows, handoff notes, stable constraints, and correction facts
- **Noise Rejection**: Do not store filler conversation, temporary speculation, casual chatter, or transient brainstorming unless it becomes a real decision
- **Storage Format**: Prefer retrieval-friendly content using explicit nouns and exact names in the form `Type: <FactType> | Entity: <Entity> | Attribute: <Attribute> | Value: <Value> | Context: <Why it matters>`
- **Metadata Usage**: Use metadata when helpful for tags such as `category`, `project`, `source`, `status`, `aliases`, and `confidence`
- **Miss Handling**: If `openbrain.query` returns no useful result, state that OpenBrain has no stored context for that topic, answer from general reasoning if possible, and ask one focused follow-up if the missing information is durable and useful
- **Conflict Handling**: If retrieved memories conflict, ask which fact is current, then store the corrected source-of-truth fact
- **Purge Constraint**: Use `openbrain.purge` cautiously because it is coarse-grained; it deletes by `agent_id` and optionally before a timestamp, not by individual memory ID
- **Correction Policy**: For ordinary corrections, prefer storing the new source-of-truth fact instead of purging unless the user explicitly asks for cleanup or reset

MCP Integration

OpenBrain exposes both MCP HTTP transports:

Streamable HTTP Endpoint: http://localhost:3100/mcp
Legacy SSE Endpoint: http://localhost:3100/mcp/sse
Legacy Message Endpoint: http://localhost:3100/mcp/message
Health Check: http://localhost:3100/mcp/health

Use the streamable HTTP endpoint for modern clients such as Codex. Keep the legacy SSE endpoints for older MCP clients that still use the deprecated 2024-11-05 HTTP+SSE transport.

Header roles:

  • X-Agent-ID is the memory namespace. Keep this stable if multiple clients should share the same OpenBrain memories.
  • X-Agent-Type is an optional client profile label for logging and config clarity, such as agent-zero or codex.

Example: Codex Configuration

[mcp_servers.openbrain]
url = "https://ob.ingwaz.work/mcp"
http_headers = { "X-API-Key" = "YOUR_OPENBRAIN_API_KEY", "X-Agent-ID" = "openbrain", "X-Agent-Type" = "codex" }

Example: Agent Zero Configuration

{
  "mcpServers": {
    "openbrain": {
      "url": "https://ob.ingwaz.work/mcp/sse",
      "headers": {
        "X-API-Key": "YOUR_OPENBRAIN_API_KEY",
        "X-Agent-ID": "openbrain",
        "X-Agent-Type": "agent-zero"
      }
    }
  }
}

Agent Zero should keep using the legacy HTTP+SSE transport until its client runtime supports streamable HTTP. Codex should use /mcp.

Example: Store a Memory

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "store",
    "arguments": {
      "content": "The user prefers dark mode and uses vim keybindings",
      "agent_id": "assistant-1",
      "ttl": "7d",
      "metadata": {"source": "preferences"}
    }
  }
}
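The same call can be issued from any HTTP client. A minimal Python sketch that builds this payload and posts it to the streamable HTTP endpoint; the URL, API key, and Accept header are assumptions to adjust for your deployment:

```python
import json
import urllib.request

def build_store_request(content, agent_id, ttl=None, metadata=None, req_id=1):
    """Build the JSON-RPC tools/call payload for the store tool."""
    arguments = {"content": content, "agent_id": agent_id}
    if ttl is not None:
        arguments["ttl"] = ttl
    if metadata is not None:
        arguments["metadata"] = metadata
    return {
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "tools/call",
        "params": {"name": "store", "arguments": arguments},
    }

def send(payload, url="http://localhost:3100/mcp", api_key="YOUR_OPENBRAIN_API_KEY"):
    """POST the payload with the auth and namespace headers described above."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # Some streamable-HTTP servers require accepting both content types.
            "Accept": "application/json, text/event-stream",
            "X-API-Key": api_key,
            "X-Agent-ID": "assistant-1",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```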

Example: Query Memories

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "query",
    "arguments": {
      "query": "What are the user's editor preferences?",
      "agent_id": "assistant-1",
      "limit": 5,
      "threshold": 0.6
    }
  }
}

Example: Batch Store Memories

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "batch_store",
    "arguments": {
      "agent_id": "assistant-1",
      "entries": [
        {
          "content": "The user prefers dark mode",
          "ttl": "24h",
          "metadata": {"category": "preference"}
        },
        {
          "content": "The user uses vim keybindings",
          "metadata": {"category": "preference"}
        }
      ]
    }
  }
}

Architecture

┌─────────────────────────────────────────────────────────┐
│                    AI Agent                              │
└─────────────────────┬───────────────────────────────────┘
                      │ MCP Protocol (Streamable HTTP / Legacy SSE)
┌─────────────────────▼───────────────────────────────────┐
│              OpenBrain MCP Server                        │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐      │
│  │   store     │  │   query     │  │   purge     │      │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘      │
│         │                │                │              │
│  ┌──────▼────────────────▼────────────────▼──────┐      │
│  │           Embedding Engine (ONNX)              │      │
│  │           all-MiniLM-L6-v2 (384d)              │      │
│  └──────────────────────┬────────────────────────┘      │
│                         │                                │
│  ┌──────────────────────▼────────────────────────┐      │
│  │         PostgreSQL + pgvector                  │      │
│  │         HNSW Index for fast search             │      │
│  └────────────────────────────────────────────────┘      │
└─────────────────────────────────────────────────────────┘

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| OPENBRAIN__SERVER__HOST | 0.0.0.0 | Server bind address |
| OPENBRAIN__SERVER__PORT | 3100 | Server port |
| OPENBRAIN__DATABASE__HOST | localhost | PostgreSQL host |
| OPENBRAIN__DATABASE__PORT | 5432 | PostgreSQL port |
| OPENBRAIN__DATABASE__NAME | openbrain | Database name |
| OPENBRAIN__DATABASE__USER | - | Database user |
| OPENBRAIN__DATABASE__PASSWORD | - | Database password |
| OPENBRAIN__EMBEDDING__MODEL_PATH | models/all-MiniLM-L6-v2 | ONNX model path |
| OPENBRAIN__AUTH__ENABLED | false | Enable API key auth |
| OPENBRAIN__AUTH__API_KEYS | - | Comma-separated list of accepted API keys |
| OPENBRAIN__DEDUP__THRESHOLD | - | Vector similarity threshold for dedup on ingest |
| OPENBRAIN__TTL__CLEANUP_INTERVAL_SECONDS | 300 | Interval of the expired-memory cleanup loop |
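The double underscore in these names is a nesting separator, so OPENBRAIN__DATABASE__PORT corresponds to the database.port setting. As an illustration of that convention (not the server's actual config loader):

```python
def nested_config(environ: dict, prefix: str = "OPENBRAIN__") -> dict:
    """Fold OPENBRAIN__A__B=val env vars into a nested dict {'a': {'b': 'val'}}."""
    config = {}
    for key, value in environ.items():
        if not key.startswith(prefix):
            continue  # ignore unrelated environment variables
        path = [part.lower() for part in key[len(prefix):].split("__")]
        node = config
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return config
```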

License

MIT
