# OpenBrain MCP Server

**High-performance vector memory for AI agents**

OpenBrain is a Model Context Protocol (MCP) server that provides AI agents with a persistent, semantic memory system. It uses local ONNX-based embeddings and PostgreSQL with pgvector for efficient similarity search.

## Features

- 🧠 **Semantic Memory**: Store and retrieve memories using vector similarity search
- 🏠 **Local Embeddings**: No external API calls - uses ONNX runtime with all-MiniLM-L6-v2
- 🐘 **PostgreSQL + pgvector**: Production-grade vector storage with HNSW indexing
- 🔌 **MCP Protocol**: Streamable HTTP plus legacy HTTP+SSE compatibility
- 🔐 **Multi-Agent Support**: Isolated memory namespaces per agent
- ♻️ **Deduplicated Ingest**: Near-duplicate facts are merged instead of stored repeatedly
- ⚡ **High Performance**: Rust implementation with async I/O

## MCP Tools

| Tool | Description |
|------|-------------|
| `store` | Store a memory with automatic embedding generation, optional TTL, and automatic deduplication |
| `batch_store` | Store 1-50 memories atomically in a single call with the same deduplication rules |
| `query` | Search memories by semantic similarity |
| `purge` | Delete memories by agent ID or time range |

## Quick Start

### Prerequisites

- Rust 1.75+
- PostgreSQL 14+ with the pgvector extension
- ONNX model files (all-MiniLM-L6-v2)

### Database Setup

```sql
CREATE ROLE openbrain_svc LOGIN PASSWORD 'change-me';
CREATE DATABASE openbrain OWNER openbrain_svc;
\c openbrain
CREATE EXTENSION IF NOT EXISTS vector;
```

Use the same PostgreSQL role for the app and for migrations. Do not create the `memories` table manually as `postgres` or another owner and then run OpenBrain as `openbrain_svc`, because later `ALTER TABLE` migrations will fail with `must be owner of table memories`.
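If you already hit that ownership error, one way to repair it (a hedged fix, assuming the table and role names used in this README) is to hand the table back to the service role before re-running migrations:

```sql
-- Run as a superuser or as the current owner of the table.
-- Assumes the `memories` table and `openbrain_svc` role from this setup.
ALTER TABLE memories OWNER TO openbrain_svc;
```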
### Configuration

```bash
cp .env.example .env
# Edit .env with your database credentials
```

### Build & Run

```bash
cargo build --release
./target/release/openbrain-mcp migrate
./target/release/openbrain-mcp
```

### Database Migrations

This project uses `refinery` with embedded SQL migrations in `migrations/`. Run pending migrations explicitly before starting or restarting the service:

```bash
./target/release/openbrain-mcp migrate
```

If you use the deploy script or CI workflow in [`.gitea/deploy.sh`](.gitea/deploy.sh) and [`.gitea/workflows/ci-cd.yaml`](.gitea/workflows/ci-cd.yaml), they already run this for you.

### E2E Test Modes

The end-to-end test suite supports two modes:

- **Local mode** (default): assumes the test process can manage schema setup against a local PostgreSQL instance and, for one auth-only test, spawn a local `openbrain-mcp` child process.
- **Remote mode**: set `OPENBRAIN_E2E_REMOTE=true` and point `OPENBRAIN_E2E_BASE_URL` at a deployed server such as `http://76.13.116.52:3100` or `https://ob.ingwaz.work`. In this mode the suite does not try to create schema locally and skips the local-process auth smoke test.

Recommended env for VPS-backed runs:

```bash
OPENBRAIN_E2E_REMOTE=true
OPENBRAIN_E2E_BASE_URL=https://ob.ingwaz.work
OPENBRAIN__AUTH__ENABLED=true
```

### TTL / Expiry

Transient facts can be stored with an optional `ttl` string on `store`, or on either the batch itself or individual entries for `batch_store`. Supported units:

- `s` seconds
- `m` minutes
- `h` hours
- `d` days
- `w` weeks

Examples: `30s`, `15m`, `1h`, `7d`

Expired memories are filtered from `query` immediately, even before the background cleanup loop deletes them physically. The cleanup interval is configured with `OPENBRAIN__TTL__CLEANUP_INTERVAL_SECONDS` and defaults to 300.
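As a quick illustration of the unit table above, a TTL string maps to a duration in seconds like this. This is a minimal sketch, not the server's actual parser, which may differ in validation and error handling:

```rust
/// Illustrative TTL parsing: "30s" -> 30, "15m" -> 900, "7d" -> 604_800.
/// Sketch only; the real parser in the server may handle edge cases differently.
fn ttl_to_seconds(ttl: &str) -> Option<u64> {
    if ttl.len() < 2 {
        return None;
    }
    // Split off the trailing unit character, e.g. "15m" -> ("15", "m").
    let (num, unit) = ttl.split_at(ttl.len() - 1);
    let n: u64 = num.parse().ok()?;
    let factor = match unit {
        "s" => 1,
        "m" => 60,
        "h" => 3_600,
        "d" => 86_400,
        "w" => 604_800,
        _ => return None,
    };
    Some(n * factor)
}

fn main() {
    assert_eq!(ttl_to_seconds("30s"), Some(30));
    assert_eq!(ttl_to_seconds("15m"), Some(900));
    assert_eq!(ttl_to_seconds("7d"), Some(604_800));
    assert_eq!(ttl_to_seconds("10x"), None);
    println!("ok");
}
```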
The CI workflow uses this remote mode after `main` deploys, so e2e coverage validates the VPS deployment rather than the local runner host. It generates a random per-run e2e key, temporarily appends it to the deployed `OPENBRAIN__AUTH__API_KEYS`, runs the suite, then removes the key and restarts the service.

For live deployments, keep `OPENBRAIN__AUTH__API_KEYS` for persistent non-test access only. The server accepts a comma-separated key list, so a practical split is:

- `prod_live_key` for normal agent traffic
- `smoke_test_key` for ad hoc diagnostics

In Gitea Actions, that means a repo secret `OPENBRAIN__AUTH__API_KEYS=prod_live_key,smoke_test_key`. If you want prod e2e coverage without leaving a standing CI key on the server, the workflow-generated ephemeral key handles that automatically.

### Deduplication on Ingest

OpenBrain checks every `store` and `batch_store` write for an existing memory in the same `agent_id` namespace whose vector similarity meets the configured dedup threshold. Default behavior:

- deduplication is always on
- only same-agent memories are considered
- expired memories are ignored
- if a duplicate is found, the existing memory is refreshed instead of inserting a new row
- metadata is merged, with new keys overriding old values
- `created_at` is updated to `now()`
- `expires_at` is preserved unless the new write supplies a fresh TTL

Configure the threshold with either:

- `OPENBRAIN__DEDUP__THRESHOLD=0.90`
- `DEDUP_THRESHOLD=0.90`

Tool responses expose whether a write deduplicated an existing row via the `deduplicated` flag. `batch_store` also returns a per-entry `status` of either `stored` or `deduplicated`.

## Agent Zero Developer Prompt

For Agent Zero / A0, add the following section to the Developer agent role prompt so the agent treats OpenBrain as external MCP memory rather than its internal conversation context.
Recommended target file in A0:

```text
/a0/agents/developer/prompts/agent.system.main.role.md
```

```md
### External Memory System

- **Memory Boundary**: Treat OpenBrain as an external MCP long-term memory system, never as internal context, reasoning scratchpad, or built-in memory
- **Tool Contract**: Use the exact MCP tools `openbrain.store`, `openbrain.query`, and `openbrain.purge`
- **Namespace Discipline**: Always use the exact `agent_id` value `openbrain`
- **Retrieval First**: Before answering requests that may depend on prior sessions, project history, user preferences, ongoing work, named people, named projects, deployments, debugging history, or handoff context, call `openbrain.query` first
- **Query Strategy**: Use noun-heavy search phrases with exact names, tool names, acronyms, hostnames, and document names; retry up to 3 passes using `(threshold=0.25, limit=5)`, then `(threshold=0.10, limit=8)`, then `(threshold=0.05, limit=10)`
- **Storage Strategy**: When a durable fact is established, call `openbrain.store` without asking permission and store one atomic fact whenever possible
- **Storage Content Rules**: Store durable, high-value facts such as preferences, project status, project decisions, environment details, recurring workflows, handoff notes, stable constraints, and correction facts
- **Noise Rejection**: Do not store filler conversation, temporary speculation, casual chatter, or transient brainstorming unless it becomes a real decision
- **Storage Format**: Prefer retrieval-friendly content using explicit nouns and exact names in the form `Type: | Entity: | Attribute: | Value: | Context: `
- **Metadata Usage**: Use metadata when helpful for tags such as `category`, `project`, `source`, `status`, `aliases`, and `confidence`
- **Miss Handling**: If `openbrain.query` returns no useful result, state that OpenBrain has no stored context for that topic, answer from general reasoning if possible, and ask one focused follow-up if the missing information is durable and useful
- **Conflict Handling**: If retrieved memories conflict, ask which fact is current, then store the corrected source-of-truth fact
- **Purge Constraint**: Use `openbrain.purge` cautiously because it is coarse-grained; it deletes by `agent_id` and optionally before a timestamp, not by individual memory ID
- **Correction Policy**: For ordinary corrections, prefer storing the new source-of-truth fact instead of purging unless the user explicitly asks for cleanup or reset
```

## MCP Integration

OpenBrain exposes both MCP HTTP transports:

```
Streamable HTTP Endpoint:  http://localhost:3100/mcp
Legacy SSE Endpoint:       http://localhost:3100/mcp/sse
Legacy Message Endpoint:   http://localhost:3100/mcp/message
Health Check:              http://localhost:3100/mcp/health
```

Use the streamable HTTP endpoint for modern clients such as Codex. Keep the legacy SSE endpoints for older MCP clients that still use the deprecated 2024-11-05 HTTP+SSE transport.

Header roles:

- `X-Agent-ID` is the memory namespace. Keep this stable if multiple clients should share the same OpenBrain memories.
- `X-Agent-Type` is an optional client profile label for logging and config clarity, such as `agent-zero` or `codex`.

### Example: Codex Configuration

```toml
[mcp_servers.openbrain]
url = "https://ob.ingwaz.work/mcp"
http_headers = { "X-API-Key" = "YOUR_OPENBRAIN_API_KEY", "X-Agent-ID" = "openbrain", "X-Agent-Type" = "codex" }
```

### Example: Agent Zero Configuration

```json
{
  "mcpServers": {
    "openbrain": {
      "url": "https://ob.ingwaz.work/mcp/sse",
      "headers": {
        "X-API-Key": "YOUR_OPENBRAIN_API_KEY",
        "X-Agent-ID": "openbrain",
        "X-Agent-Type": "agent-zero"
      }
    }
  }
}
```

Agent Zero should keep using the legacy HTTP+SSE transport unless and until its client runtime supports streamable HTTP. Codex should use `/mcp`.
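The 3-pass query strategy from the Developer prompt above can be sketched as a simple widening loop. This is a hypothetical Rust fragment: `query` here is a stand-in for a real `openbrain.query` tools/call over HTTP, and the canned result exists only to exercise the loop.

```rust
/// Stand-in for an openbrain.query MCP call returning matching memory contents.
/// A real client would issue a tools/call request to the server instead.
fn query(_q: &str, threshold: f32, _limit: usize) -> Vec<String> {
    // Pretend only the loosest thresholds find anything, to exercise all passes.
    if threshold <= 0.10 {
        vec!["User prefers dark mode".to_string()]
    } else {
        Vec::new()
    }
}

/// Widen threshold and limit across up to three passes, stopping on first hit.
fn retrieve(q: &str) -> Vec<String> {
    for (threshold, limit) in [(0.25, 5), (0.10, 8), (0.05, 10)] {
        let hits = query(q, threshold, limit);
        if !hits.is_empty() {
            return hits;
        }
    }
    Vec::new()
}

fn main() {
    let hits = retrieve("editor preferences vim dark mode");
    assert_eq!(hits, vec!["User prefers dark mode".to_string()]);
    println!("ok");
}
```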
### Example: Store a Memory

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "store",
    "arguments": {
      "content": "The user prefers dark mode and uses vim keybindings",
      "agent_id": "assistant-1",
      "ttl": "7d",
      "metadata": {"source": "preferences"}
    }
  }
}
```

### Example: Query Memories

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "query",
    "arguments": {
      "query": "What are the user's editor preferences?",
      "agent_id": "assistant-1",
      "limit": 5,
      "threshold": 0.6
    }
  }
}
```

### Example: Batch Store Memories

```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "batch_store",
    "arguments": {
      "agent_id": "assistant-1",
      "entries": [
        {
          "content": "The user prefers dark mode",
          "ttl": "24h",
          "metadata": {"category": "preference"}
        },
        {
          "content": "The user uses vim keybindings",
          "metadata": {"category": "preference"}
        }
      ]
    }
  }
}
```

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                        AI Agent                         │
└─────────────────────┬───────────────────────────────────┘
                      │ MCP Protocol (Streamable HTTP / Legacy SSE)
┌─────────────────────▼───────────────────────────────────┐
│                  OpenBrain MCP Server                   │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐      │
│  │    store    │  │    query    │  │    purge    │      │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘      │
│         │                │                │             │
│  ┌──────▼────────────────▼────────────────▼──────┐      │
│  │            Embedding Engine (ONNX)            │      │
│  │            all-MiniLM-L6-v2 (384d)            │      │
│  └──────────────────────┬────────────────────────┘      │
│                         │                               │
│  ┌──────────────────────▼────────────────────────┐      │
│  │             PostgreSQL + pgvector             │      │
│  │           HNSW Index for fast search          │      │
│  └───────────────────────────────────────────────┘      │
└─────────────────────────────────────────────────────────┘
```

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `OPENBRAIN__SERVER__HOST` | `0.0.0.0` | Server bind address |
| `OPENBRAIN__SERVER__PORT` | `3100` | Server port |
| `OPENBRAIN__DATABASE__HOST` | `localhost` | PostgreSQL host |
| `OPENBRAIN__DATABASE__PORT` | `5432` | PostgreSQL port |
| `OPENBRAIN__DATABASE__NAME` | `openbrain` | Database name |
| `OPENBRAIN__DATABASE__USER` | - | Database user |
| `OPENBRAIN__DATABASE__PASSWORD` | - | Database password |
| `OPENBRAIN__EMBEDDING__MODEL_PATH` | `models/all-MiniLM-L6-v2` | ONNX model path |
| `OPENBRAIN__AUTH__ENABLED` | `false` | Enable API key auth |

## License

MIT