# OpenBrain MCP Server

**High-performance vector memory for AI agents**

OpenBrain is a Model Context Protocol (MCP) server that provides AI agents with a persistent, semantic memory system. It uses local ONNX-based embeddings and PostgreSQL with pgvector for efficient similarity search.

## Features

- 🧠 **Semantic Memory**: Store and retrieve memories using vector similarity search
- 🏠 **Local Embeddings**: No external API calls - uses ONNX runtime with all-MiniLM-L6-v2
- 🐘 **PostgreSQL + pgvector**: Production-grade vector storage with HNSW indexing
- 🔌 **MCP Protocol**: Streamable HTTP plus legacy HTTP+SSE compatibility
- 🔐 **Multi-Agent Support**: Isolated memory namespaces per agent
- ♻️ **Deduplicated Ingest**: Near-duplicate facts are merged instead of stored repeatedly
- ⚡ **High Performance**: Rust implementation with async I/O

## MCP Tools

| Tool | Description |
|------|-------------|
| `store` | Store a memory with automatic embedding generation, optional TTL, and automatic deduplication |
| `batch_store` | Store 1-50 memories atomically in a single call with the same deduplication rules |
| `query` | Search memories by semantic similarity |
| `purge` | Delete memories by agent ID or time range |

## Quick Start

### Prerequisites

- Rust 1.75+
- PostgreSQL 14+ with pgvector extension
- ONNX model files (all-MiniLM-L6-v2)

### Database Setup

```sql
CREATE ROLE openbrain_svc LOGIN PASSWORD 'change-me';
CREATE DATABASE openbrain OWNER openbrain_svc;
\c openbrain
CREATE EXTENSION IF NOT EXISTS vector;
```

Use the same PostgreSQL role for the app and for migrations. Do not create the `memories` table manually as `postgres` or another owner and then run OpenBrain as `openbrain_svc`, because later `ALTER TABLE` migrations will fail with `must be owner of table memories`.

### Configuration

```bash
cp .env.example .env
# Edit .env with your database credentials
```

### Build & Run

```bash
cargo build --release
./target/release/openbrain-mcp migrate
./target/release/openbrain-mcp
```

### Database Migrations

This project uses `refinery` with embedded SQL migrations in `migrations/`.

Run pending migrations explicitly before starting or restarting the service:

```bash
./target/release/openbrain-mcp migrate
```

If you use the deploy script or CI workflow in `.gitea/deploy.sh` and `.gitea/workflows/ci-cd.yaml`, they already run this for you.

### E2E Test Modes

The end-to-end test suite supports two modes:

- Local mode: the default. Assumes the test process can manage schema setup against a local PostgreSQL instance and, for one auth-only test, spawn a local `openbrain-mcp` child process.
- Remote mode: set `OPENBRAIN_E2E_REMOTE=true` and point `OPENBRAIN_E2E_BASE_URL` at a deployed server such as `http://your-server.example.com:3100` or `https://memory.example.com`. In this mode the suite does not try to create the schema locally and skips the local-process auth smoke test.

Recommended env for VPS-backed runs:

```bash
OPENBRAIN_E2E_REMOTE=true
OPENBRAIN_E2E_BASE_URL=https://memory.example.com
OPENBRAIN__AUTH__ENABLED=true
```

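The mode switch boils down to a boolean environment check. A minimal sketch of how a harness might resolve it (the helper name and the local default URL are illustrative assumptions, not the actual test code):

```python
import os

def e2e_config(env=os.environ):
    """Resolve e2e mode and target from the environment (hypothetical helper)."""
    remote = env.get("OPENBRAIN_E2E_REMOTE", "").strip().lower() == "true"
    # Local default assumed from the server's default port (3100).
    base_url = env.get("OPENBRAIN_E2E_BASE_URL", "http://127.0.0.1:3100")
    return {"remote": remote, "base_url": base_url}

cfg = e2e_config({"OPENBRAIN_E2E_REMOTE": "true",
                  "OPENBRAIN_E2E_BASE_URL": "https://memory.example.com"})
```

Anything other than the literal string `true` (including an unset variable) leaves the suite in local mode.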
### TTL / Expiry

Transient facts can be stored with an optional `ttl` string on `store`, or on either the batch itself or individual entries for `batch_store`.

Supported units:

- `s` seconds
- `m` minutes
- `h` hours
- `d` days
- `w` weeks

Examples:

- `30s`
- `15m`
- `1h`
- `7d`

Expired memories are filtered from `query` immediately, even before the background cleanup loop deletes them physically. The cleanup interval is configured with `OPENBRAIN__TTL__CLEANUP_INTERVAL_SECONDS` and defaults to 300.
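
The unit grammar is simple enough to mirror client-side, e.g. to predict an expiry timestamp before storing. An illustrative parser sketch (not the server's actual implementation):

```python
import re

# Seconds per supported unit, matching the table above.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

def ttl_to_seconds(ttl: str) -> int:
    """Convert a TTL string like '7d' into a number of seconds."""
    m = re.fullmatch(r"(\d+)([smhdw])", ttl.strip())
    if not m:
        raise ValueError(f"invalid ttl: {ttl!r}")
    value, unit = m.groups()
    return int(value) * _UNIT_SECONDS[unit]
```

For example, `ttl_to_seconds("7d")` yields 604800 seconds.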

### CI Auth Keys for Remote E2E

The CI workflow uses this remote mode after `main` deploys, so e2e coverage validates the VPS deployment rather than the local runner host. It generates a random per-run e2e key, temporarily appends it to the deployed `OPENBRAIN__AUTH__API_KEYS`, runs the suite, then removes the key and restarts the service.

For live deployments, keep `OPENBRAIN__AUTH__API_KEYS` for persistent non-test access only. The server accepts a comma-separated key list, so a practical split is:

- `prod_live_key` for normal agent traffic
- `smoke_test_key` for ad hoc diagnostics

In Gitea Actions, that means:

- repo secret `OPENBRAIN__AUTH__API_KEYS=prod_live_key,smoke_test_key`

If you want prod e2e coverage without leaving a standing CI key on the server, the workflow-generated ephemeral key handles that automatically.

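The ephemeral-key step amounts to list manipulation on that comma-separated value. A sketch of the idea (illustrative only, not the actual workflow script; the `e2e_` prefix is an assumption):

```python
import secrets

def with_ephemeral_key(keys_csv: str) -> tuple[str, str]:
    """Append a random per-run key to a comma-separated API key list."""
    run_key = "e2e_" + secrets.token_hex(16)  # random, unguessable per run
    keys = [k for k in keys_csv.split(",") if k]
    return run_key, ",".join(keys + [run_key])

run_key, patched = with_ephemeral_key("prod_live_key,smoke_test_key")
```

After the suite finishes, the workflow restores the original value and restarts the service, so `run_key` never persists.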
### Deduplication on Ingest

OpenBrain checks every `store` and `batch_store` write for an existing memory in the same `agent_id` namespace whose vector similarity meets the configured dedup threshold.

Default behavior:

- deduplication is always on
- only same-agent memories are considered
- expired memories are ignored
- if a duplicate is found, the existing memory is refreshed instead of inserting a new row
- metadata is merged with new keys overriding old values
- `created_at` is updated to `now()`
- `expires_at` is preserved unless the new write supplies a fresh TTL

Configure the threshold with either:

- `OPENBRAIN__DEDUP__THRESHOLD=0.90`
- `DEDUP_THRESHOLD=0.90`

Tool responses expose whether a write deduplicated an existing row via the `deduplicated` flag. `batch_store` also returns a `status` of either `stored` or `deduplicated` per entry.

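Conceptually, the dedup check compares the new embedding against existing same-agent embeddings and reuses the closest row when similarity meets the threshold. A simplified cosine-similarity sketch (in practice the comparison runs inside PostgreSQL via pgvector, not in application code):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def find_duplicate(new_vec, existing, threshold=0.90):
    """Return the index of the closest match at/above threshold, else None."""
    best_idx, best_sim = None, threshold
    for i, vec in enumerate(existing):
        sim = cosine(new_vec, vec)
        if sim >= best_sim:
            best_idx, best_sim = i, sim
    return best_idx
```

When `find_duplicate` returns an index, the write refreshes that row instead of inserting; when it returns `None`, a new memory is stored.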
## Agent Zero Developer Prompt

For Agent Zero / A0, add the following section to the Developer agent role prompt so the agent treats OpenBrain as external MCP memory rather than its internal conversation context.

Recommended target file in A0:

```text
/a0/agents/developer/prompts/agent.system.main.role.md
```

```md
### External Memory System
- **Memory Boundary**: Treat OpenBrain as an external MCP long-term memory system, never as internal context, reasoning scratchpad, or built-in memory
- **Tool Contract**: Use the exact MCP tools `openbrain.store`, `openbrain.query`, and `openbrain.purge`
- **Namespace Discipline**: Always use the exact `agent_id` value `openbrain`
- **EXTRAS First**: Before calling `openbrain.query`, check the `[EXTRAS]` section for pre-loaded memories or handoff facts related to the same topic. If the needed context is already present, do not query OpenBrain again.
- **Session Cache**: If the same topic was already queried earlier in the current conversation and the result is still in context, reuse that result instead of querying again unless the user references new external information or the prior result is clearly insufficient.
- **Retrieval First**: Before answering requests that may depend on prior sessions, project history, user preferences, ongoing work, named people, named projects, deployments, debugging history, or handoff context, call `openbrain.query` only when `[EXTRAS]` and the current conversation do not already provide the needed context.
- **Query Strategy**: Use noun-heavy search phrases with exact names, tool names, acronyms, hostnames, and document names; query first with `(threshold=0.15, limit=8)`, then retry once with `(threshold=0.05, limit=10)` only if the first pass returns zero useful results
- **Storage Strategy**: When a durable fact is established, call `openbrain.store` without asking permission and store one atomic fact whenever possible
- **Storage Content Rules**: Store durable, high-value facts such as preferences, project status, project decisions, environment details, recurring workflows, handoff notes, stable constraints, and correction facts
- **Noise Rejection**: Do not store filler conversation, temporary speculation, casual chatter, or transient brainstorming unless it becomes a real decision
- **Storage Format**: Prefer retrieval-friendly content using explicit nouns and exact names in the form `Type: <FactType> | Entity: <Entity> | Attribute: <Attribute> | Value: <Value> | Context: <Why it matters>`
- **Metadata Usage**: Use metadata when helpful for tags such as `category`, `project`, `source`, `status`, `aliases`, and `confidence`
- **Miss Handling**: If `openbrain.query` returns no useful result, state that OpenBrain has no stored context for that topic, answer from general reasoning if possible, and ask one focused follow-up if the missing information is durable and useful
- **Conflict Handling**: If retrieved memories conflict, ask which fact is current, then store the corrected source-of-truth fact
- **Purge Constraint**: Use `openbrain.purge` cautiously because it is coarse-grained; it deletes by `agent_id` and optionally before a timestamp, not by individual memory ID
- **Correction Policy**: For ordinary corrections, prefer storing the new source-of-truth fact instead of purging unless the user explicitly asks for cleanup or reset
```

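The two-pass query strategy in the prompt above can be sketched as follows; `query_fn` is a hypothetical stand-in for the real `openbrain.query` MCP call, and the stub data is invented for illustration:

```python
def retrieve(query_fn, text):
    """Two-pass retrieval: strict threshold first, looser retry only on a miss."""
    results = query_fn(text, threshold=0.15, limit=8)
    if results:
        return results
    return query_fn(text, threshold=0.05, limit=10)

# Hypothetical stub standing in for the MCP call, returning
# (content, similarity) pairs above the requested threshold.
def fake_query(text, threshold, limit):
    memories = [("User uses vim keybindings", 0.12)]
    return [(c, s) for c, s in memories if s >= threshold][:limit]

hits = retrieve(fake_query, "editor preferences")
```

Here the first pass at `0.15` misses, and the retry at `0.05` recovers the low-similarity memory; a second miss ends the search rather than looping.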
## MCP Integration

OpenBrain exposes both MCP HTTP transports:

```
Streamable HTTP Endpoint: http://localhost:3100/mcp
Legacy SSE Endpoint:      http://localhost:3100/mcp/sse
Legacy Message Endpoint:  http://localhost:3100/mcp/message
Health Check:             http://localhost:3100/mcp/health
```

Use the streamable HTTP endpoint for modern clients such as Codex. Keep the legacy SSE endpoints for older MCP clients that still use the deprecated 2024-11-05 HTTP+SSE transport.

Header roles:

- `X-Agent-ID` is the memory namespace. Keep this stable if multiple clients should share the same OpenBrain memories.
- `X-Agent-Type` is an optional client profile label for logging and config clarity, such as `agent-zero` or `codex`.

### Example: Codex Configuration

```toml
[mcp_servers.openbrain]
url = "https://memory.example.com/mcp"
http_headers = { "X-API-Key" = "YOUR_OPENBRAIN_API_KEY", "X-Agent-ID" = "openbrain", "X-Agent-Type" = "codex" }
```

### Example: Agent Zero Configuration

```json
{
  "mcpServers": {
    "openbrain": {
      "url": "https://memory.example.com/mcp/sse",
      "headers": {
        "X-API-Key": "YOUR_OPENBRAIN_API_KEY",
        "X-Agent-ID": "openbrain",
        "X-Agent-Type": "agent-zero"
      }
    }
  }
}
```

Agent Zero should keep using the legacy HTTP+SSE transport unless and until its client runtime supports streamable HTTP. Codex should use `/mcp`.

### Example: Store a Memory

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "store",
    "arguments": {
      "content": "The user prefers dark mode and uses vim keybindings",
      "agent_id": "assistant-1",
      "ttl": "7d",
      "metadata": {"source": "preferences"}
    }
  }
}
```

### Example: Query Memories

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "query",
    "arguments": {
      "query": "What are the user's editor preferences?",
      "agent_id": "assistant-1",
      "limit": 5,
      "threshold": 0.6
    }
  }
}
```

### Example: Batch Store Memories

```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "batch_store",
    "arguments": {
      "agent_id": "assistant-1",
      "entries": [
        {
          "content": "The user prefers dark mode",
          "ttl": "24h",
          "metadata": {"category": "preference"}
        },
        {
          "content": "The user uses vim keybindings",
          "metadata": {"category": "preference"}
        }
      ]
    }
  }
}
```

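### Example: Purge Memories

Purge is coarse-grained: it deletes by `agent_id` and optionally before a timestamp, not by individual memory ID. A minimal call scoped to one agent (the optional before-timestamp argument is omitted here because its exact field name is not documented above):

```json
{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "tools/call",
  "params": {
    "name": "purge",
    "arguments": {
      "agent_id": "assistant-1"
    }
  }
}
```

For ordinary corrections, prefer storing a new source-of-truth fact over purging.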
## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                        AI Agent                         │
└─────────────────────┬───────────────────────────────────┘
                      │ MCP Protocol (Streamable HTTP / Legacy SSE)
┌─────────────────────▼───────────────────────────────────┐
│                 OpenBrain MCP Server                    │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐      │
│  │   store     │  │   query     │  │   purge     │      │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘      │
│         │                │                │             │
│  ┌──────▼────────────────▼────────────────▼──────┐      │
│  │           Embedding Engine (ONNX)             │      │
│  │           all-MiniLM-L6-v2 (384d)             │      │
│  └──────────────────────┬────────────────────────┘      │
│                         │                               │
│  ┌──────────────────────▼────────────────────────┐      │
│  │            PostgreSQL + pgvector              │      │
│  │          HNSW Index for fast search           │      │
│  └───────────────────────────────────────────────┘      │
└─────────────────────────────────────────────────────────┘
```

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `OPENBRAIN__SERVER__HOST` | `0.0.0.0` | Server bind address |
| `OPENBRAIN__SERVER__PORT` | `3100` | Server port |
| `OPENBRAIN__DATABASE__HOST` | `localhost` | PostgreSQL host |
| `OPENBRAIN__DATABASE__PORT` | `5432` | PostgreSQL port |
| `OPENBRAIN__DATABASE__NAME` | `openbrain` | Database name |
| `OPENBRAIN__DATABASE__USER` | - | Database user |
| `OPENBRAIN__DATABASE__PASSWORD` | - | Database password |
| `OPENBRAIN__EMBEDDING__MODEL_PATH` | `models/all-MiniLM-L6-v2` | ONNX model path |
| `OPENBRAIN__AUTH__ENABLED` | `false` | Enable API key auth |
| `OPENBRAIN__AUTH__API_KEYS` | - | Comma-separated list of accepted API keys |
| `OPENBRAIN__TTL__CLEANUP_INTERVAL_SECONDS` | `300` | Interval for the TTL cleanup loop |

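The variable names follow a double-underscore nesting convention: prefix, then config section, then key. A small sketch of how such a name decomposes, assuming the usual layered-config interpretation (the helper itself is illustrative, not part of OpenBrain):

```python
def env_to_path(name: str, prefix: str = "OPENBRAIN"):
    """Split OPENBRAIN__SECTION__KEY into a lowercase config path."""
    parts = name.split("__")
    if parts[0] != prefix:
        raise ValueError(f"unexpected prefix in {name!r}")
    return [p.lower() for p in parts[1:]]

path = env_to_path("OPENBRAIN__DATABASE__HOST")
```

So `OPENBRAIN__DATABASE__HOST` addresses the `host` key of the `database` section.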
## License

MIT