# OpenBrain MCP Server

High-performance vector memory for AI agents.

OpenBrain is a Model Context Protocol (MCP) server that provides AI agents with a persistent, semantic memory system. It uses local ONNX-based embeddings and PostgreSQL with pgvector for efficient similarity search.
## Features
- 🧠 Semantic Memory: Store and retrieve memories using vector similarity search
- 🏠 Local Embeddings: No external API calls - uses ONNX runtime with all-MiniLM-L6-v2
- 🐘 PostgreSQL + pgvector: Production-grade vector storage with HNSW indexing
- 🔌 MCP Protocol: Streamable HTTP plus legacy HTTP+SSE compatibility
- 🔐 Multi-Agent Support: Isolated memory namespaces per agent
- ♻️ Deduplicated Ingest: Near-duplicate facts are merged instead of stored repeatedly
- ⚡ High Performance: Rust implementation with async I/O
## MCP Tools

| Tool | Description |
|---|---|
| `store` | Store a memory with automatic embedding generation, optional TTL, and automatic deduplication |
| `batch_store` | Store 1-50 memories atomically in a single call with the same deduplication rules |
| `query` | Search memories by semantic similarity |
| `purge` | Delete memories by agent ID or time range |
## Quick Start

### Prerequisites
- Rust 1.75+
- PostgreSQL 14+ with pgvector extension
- ONNX model files (all-MiniLM-L6-v2)
### Database Setup

```sql
CREATE ROLE openbrain_svc LOGIN PASSWORD 'change-me';
CREATE DATABASE openbrain OWNER openbrain_svc;
\c openbrain
CREATE EXTENSION IF NOT EXISTS vector;
```

Use the same PostgreSQL role for the app and for migrations. Do not create the
`memories` table manually as `postgres` or another owner and then run
OpenBrain as `openbrain_svc`, because later `ALTER TABLE` migrations will fail
with `must be owner of table memories`.
### Configuration

```bash
cp .env.example .env
# Edit .env with your database credentials
```
### Build & Run

```bash
cargo build --release
./target/release/openbrain-mcp migrate
./target/release/openbrain-mcp
```
## Database Migrations

This project uses refinery with embedded SQL migrations in `migrations/`.
Run pending migrations explicitly before starting or restarting the service:

```bash
./target/release/openbrain-mcp migrate
```

If you use the deploy script or CI workflow in `.gitea/deploy.sh` and `.gitea/workflows/ci-cd.yaml`, they already run this step for you.
## E2E Test Modes

The end-to-end test suite supports two modes:

- **Local mode** (default): assumes the test process can manage schema setup against a local PostgreSQL instance and, for one auth-only test, spawn a local `openbrain-mcp` child process.
- **Remote mode**: set `OPENBRAIN_E2E_REMOTE=true` and point `OPENBRAIN_E2E_BASE_URL` at a deployed server such as `http://76.13.116.52:3100` or `https://ob.ingwaz.work`. In this mode the suite does not try to create schema locally and skips the local-process auth smoke test.

Recommended env for VPS-backed runs:

```bash
OPENBRAIN_E2E_REMOTE=true
OPENBRAIN_E2E_BASE_URL=https://ob.ingwaz.work
OPENBRAIN__AUTH__ENABLED=true
```
The CI workflow uses this remote mode after main deploys, so e2e coverage validates the VPS deployment rather than the local runner host. It generates a random per-run e2e key, temporarily appends it to the deployed `OPENBRAIN__AUTH__API_KEYS`, runs the suite, then removes the key and restarts the service.

For live deployments, keep `OPENBRAIN__AUTH__API_KEYS` for persistent non-test access only. The server accepts a comma-separated key list, so a practical split is:

- `prod_live_key` for normal agent traffic
- `smoke_test_key` for ad hoc diagnostics

In Gitea Actions, that means a repo secret:

```
OPENBRAIN__AUTH__API_KEYS=prod_live_key,smoke_test_key
```

If you want prod e2e coverage without leaving a standing CI key on the server, the workflow-generated ephemeral key handles that automatically.

## TTL / Expiry

Transient facts can be stored with an optional `ttl` string on `store`, or on
either the batch itself or individual entries for `batch_store`.

Supported units:

- `s`: seconds
- `m`: minutes
- `h`: hours
- `d`: days
- `w`: weeks

Examples: `30s`, `15m`, `1h`, `7d`

Expired memories are filtered from `query` immediately, even before the
background cleanup loop deletes them physically. The cleanup interval is
configured with `OPENBRAIN__TTL__CLEANUP_INTERVAL_SECONDS` and defaults to 300.
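As a reference for the TTL grammar above, here is a minimal Python sketch of a parser for the supported unit suffixes. The function name `ttl_to_seconds` is hypothetical; the real parsing happens inside the Rust server.

```python
import re

# Seconds per unit, matching the documented suffixes s/m/h/d/w.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

def ttl_to_seconds(ttl: str) -> int:
    """Parse a TTL string like '30s', '15m', '1h', or '7d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhdw])", ttl.strip())
    if not match:
        raise ValueError(f"invalid TTL: {ttl!r}")
    value, unit = match.groups()
    return int(value) * _UNITS[unit]
```

For example, `ttl_to_seconds("7d")` yields 604800 seconds, which would become the memory's expiry offset from its write time.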
## Deduplication on Ingest

OpenBrain checks every `store` and `batch_store` write for an existing memory in
the same `agent_id` namespace whose vector similarity meets the configured
dedup threshold.

Default behavior:

- deduplication is always on
- only same-agent memories are considered
- expired memories are ignored
- if a duplicate is found, the existing memory is refreshed instead of inserting a new row
- metadata is merged, with new keys overriding old values
- `created_at` is updated to `now()`
- `expires_at` is preserved unless the new write supplies a fresh TTL

Configure the threshold with either:

```
OPENBRAIN__DEDUP__THRESHOLD=0.90
DEDUP_THRESHOLD=0.90
```

Tool responses expose whether a write deduplicated an existing row via the
`deduplicated` flag. `batch_store` also returns a per-entry `status` of either
`stored` or `deduplicated`.
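Conceptually, the dedup decision is a similarity comparison against the threshold. The sketch below illustrates the idea with plain cosine similarity in Python; the actual comparison runs inside PostgreSQL via pgvector, and the function names here are hypothetical. The 0.90 threshold matches the configuration example above.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def is_duplicate(candidate: list[float], existing: list[float],
                 threshold: float = 0.90) -> bool:
    """A write deduplicates an existing row when similarity meets the threshold."""
    return cosine_similarity(candidate, existing) >= threshold
```

Identical vectors score 1.0 and always deduplicate; orthogonal vectors score 0.0 and always insert a new row.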
## Agent Zero Developer Prompt

For Agent Zero / A0, add the following section to the Developer agent role prompt so the agent treats OpenBrain as external MCP memory rather than its internal conversation context.

Recommended target file in A0: `/a0/agents/developer/prompts/agent.system.main.role.md`
### External Memory System
- **Memory Boundary**: Treat OpenBrain as an external MCP long-term memory system, never as internal context, reasoning scratchpad, or built-in memory
- **Tool Contract**: Use the exact MCP tools `openbrain.store`, `openbrain.query`, and `openbrain.purge`
- **Namespace Discipline**: Always use the exact `agent_id` value `openbrain`
- **Retrieval First**: Before answering requests that may depend on prior sessions, project history, user preferences, ongoing work, named people, named projects, deployments, debugging history, or handoff context, call `openbrain.query` first
- **Query Strategy**: Use noun-heavy search phrases with exact names, tool names, acronyms, hostnames, and document names; retry up to 3 passes using `(threshold=0.25, limit=5)`, then `(threshold=0.10, limit=8)`, then `(threshold=0.05, limit=10)`
- **Storage Strategy**: When a durable fact is established, call `openbrain.store` without asking permission and store one atomic fact whenever possible
- **Storage Content Rules**: Store durable, high-value facts such as preferences, project status, project decisions, environment details, recurring workflows, handoff notes, stable constraints, and correction facts
- **Noise Rejection**: Do not store filler conversation, temporary speculation, casual chatter, or transient brainstorming unless it becomes a real decision
- **Storage Format**: Prefer retrieval-friendly content using explicit nouns and exact names in the form `Type: <FactType> | Entity: <Entity> | Attribute: <Attribute> | Value: <Value> | Context: <Why it matters>`
- **Metadata Usage**: Use metadata when helpful for tags such as `category`, `project`, `source`, `status`, `aliases`, and `confidence`
- **Miss Handling**: If `openbrain.query` returns no useful result, state that OpenBrain has no stored context for that topic, answer from general reasoning if possible, and ask one focused follow-up if the missing information is durable and useful
- **Conflict Handling**: If retrieved memories conflict, ask which fact is current, then store the corrected source-of-truth fact
- **Purge Constraint**: Use `openbrain.purge` cautiously because it is coarse-grained; it deletes by `agent_id` and optionally before a timestamp, not by individual memory ID
- **Correction Policy**: For ordinary corrections, prefer storing the new source-of-truth fact instead of purging unless the user explicitly asks for cleanup or reset
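The three-pass query schedule from the prompt above can be sketched as a small loop. Here `query_fn` is a hypothetical stand-in for whatever client call invokes `openbrain.query`:

```python
def query_with_retries(query_fn, text: str) -> list:
    """Run the retry schedule from the prompt: progressively lower the
    similarity threshold and raise the result limit until something matches."""
    passes = [(0.25, 5), (0.10, 8), (0.05, 10)]
    for threshold, limit in passes:
        results = query_fn(text, threshold=threshold, limit=limit)
        if results:
            return results
    return []  # a memory miss: report that OpenBrain has no stored context
```

An empty return corresponds to the miss-handling rule: state that no context was found rather than inventing one.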
## MCP Integration

OpenBrain exposes both MCP HTTP transports:

- Streamable HTTP endpoint: `http://localhost:3100/mcp`
- Legacy SSE endpoint: `http://localhost:3100/mcp/sse`
- Legacy message endpoint: `http://localhost:3100/mcp/message`
- Health check: `http://localhost:3100/mcp/health`

Use the streamable HTTP endpoint for modern clients such as Codex. Keep the legacy SSE endpoints for older MCP clients that still use the deprecated 2024-11-05 HTTP+SSE transport.

Header roles:

- `X-Agent-ID` is the memory namespace. Keep this stable if multiple clients should share the same OpenBrain memories.
- `X-Agent-Type` is an optional client profile label for logging and config clarity, such as `agent-zero` or `codex`.
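To make the header roles concrete, here is a minimal Python sketch that assembles the headers and a JSON-RPC `tools/call` body for a raw HTTP client. The helper name is hypothetical and the key value is a placeholder; only the header names and payload shape come from this document.

```python
import json

def build_store_request(api_key: str, agent_id: str, agent_type: str, content: str):
    """Assemble headers and a JSON-RPC body for a raw 'store' call."""
    headers = {
        "Content-Type": "application/json",
        "X-API-Key": api_key,        # required only when auth is enabled
        "X-Agent-ID": agent_id,      # memory namespace; keep stable to share memories
        "X-Agent-Type": agent_type,  # optional client profile label
    }
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": "store",
                   "arguments": {"content": content, "agent_id": agent_id}},
    })
    return headers, body
```

Two clients that send the same `X-Agent-ID` read and write the same memory namespace, regardless of their `X-Agent-Type` labels.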
### Example: Codex Configuration

```toml
[mcp_servers.openbrain]
url = "https://ob.ingwaz.work/mcp"
http_headers = { "X-API-Key" = "YOUR_OPENBRAIN_API_KEY", "X-Agent-ID" = "openbrain", "X-Agent-Type" = "codex" }
```
### Example: Agent Zero Configuration

```json
{
  "mcpServers": {
    "openbrain": {
      "url": "https://ob.ingwaz.work/mcp/sse",
      "headers": {
        "X-API-Key": "YOUR_OPENBRAIN_API_KEY",
        "X-Agent-ID": "openbrain",
        "X-Agent-Type": "agent-zero"
      }
    }
  }
}
```
Agent Zero should keep using the legacy HTTP+SSE transport unless and until its
client runtime supports streamable HTTP. Codex should use /mcp.
### Example: Store a Memory

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "store",
    "arguments": {
      "content": "The user prefers dark mode and uses vim keybindings",
      "agent_id": "assistant-1",
      "ttl": "7d",
      "metadata": {"source": "preferences"}
    }
  }
}
```
### Example: Query Memories

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "query",
    "arguments": {
      "query": "What are the user's editor preferences?",
      "agent_id": "assistant-1",
      "limit": 5,
      "threshold": 0.6
    }
  }
}
```
### Example: Batch Store Memories

```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "batch_store",
    "arguments": {
      "agent_id": "assistant-1",
      "entries": [
        {
          "content": "The user prefers dark mode",
          "ttl": "24h",
          "metadata": {"category": "preference"}
        },
        {
          "content": "The user uses vim keybindings",
          "metadata": {"category": "preference"}
        }
      ]
    }
  }
}
```
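A client can enforce the documented 1-50 entry limit before sending a batch. The sketch below is illustrative and the helper name is hypothetical; only the payload shape and the limit come from this document.

```python
def make_batch_store_params(agent_id: str, entries: list[dict]) -> dict:
    """Build the params for a batch_store tools/call, enforcing the
    documented 1-50 entry limit client-side before sending."""
    if not 1 <= len(entries) <= 50:
        raise ValueError(f"batch_store accepts 1-50 entries, got {len(entries)}")
    return {
        "name": "batch_store",
        "arguments": {"agent_id": agent_id, "entries": entries},
    }
```

Validating locally saves a round trip; larger sets can be split into multiple calls, at the cost of losing single-call atomicity across the whole set.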
## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                        AI Agent                         │
└─────────────────────┬───────────────────────────────────┘
                      │ MCP Protocol (Streamable HTTP / Legacy SSE)
┌─────────────────────▼───────────────────────────────────┐
│                 OpenBrain MCP Server                    │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐      │
│  │    store    │  │    query    │  │    purge    │      │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘      │
│         │                │                │             │
│  ┌──────▼────────────────▼────────────────▼──────┐      │
│  │            Embedding Engine (ONNX)            │      │
│  │            all-MiniLM-L6-v2 (384d)            │      │
│  └──────────────────────┬────────────────────────┘      │
│                         │                               │
│  ┌──────────────────────▼────────────────────────┐      │
│  │            PostgreSQL + pgvector              │      │
│  │          HNSW Index for fast search           │      │
│  └───────────────────────────────────────────────┘      │
└─────────────────────────────────────────────────────────┘
```
## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `OPENBRAIN__SERVER__HOST` | `0.0.0.0` | Server bind address |
| `OPENBRAIN__SERVER__PORT` | `3100` | Server port |
| `OPENBRAIN__DATABASE__HOST` | `localhost` | PostgreSQL host |
| `OPENBRAIN__DATABASE__PORT` | `5432` | PostgreSQL port |
| `OPENBRAIN__DATABASE__NAME` | `openbrain` | Database name |
| `OPENBRAIN__DATABASE__USER` | - | Database user |
| `OPENBRAIN__DATABASE__PASSWORD` | - | Database password |
| `OPENBRAIN__EMBEDDING__MODEL_PATH` | `models/all-MiniLM-L6-v2` | ONNX model path |
| `OPENBRAIN__AUTH__ENABLED` | `false` | Enable API key auth |
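For reference, a minimal `.env` assembled from the table above. The values shown are illustrative placeholders (the role and password echo the Database Setup example); replace them with your own credentials.

```bash
OPENBRAIN__SERVER__HOST=0.0.0.0
OPENBRAIN__SERVER__PORT=3100
OPENBRAIN__DATABASE__HOST=localhost
OPENBRAIN__DATABASE__PORT=5432
OPENBRAIN__DATABASE__NAME=openbrain
OPENBRAIN__DATABASE__USER=openbrain_svc
OPENBRAIN__DATABASE__PASSWORD=change-me
OPENBRAIN__EMBEDDING__MODEL_PATH=models/all-MiniLM-L6-v2
OPENBRAIN__AUTH__ENABLED=false
```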
## License

MIT