mirror of
https://gitea.ingwaz.work/Ingwaz/openbrain-mcp.git
synced 2026-06-15 22:07:08 +00:00
# OpenBrain MCP Server

**High-performance vector memory for AI agents**

OpenBrain is a Model Context Protocol (MCP) server that provides AI agents with a persistent, semantic memory system. It uses local ONNX-based embeddings and PostgreSQL with pgvector for efficient similarity search.

## Features

- 🧠 **Semantic Memory**: Store and retrieve memories using vector similarity search
- 🏠 **Local Embeddings**: No external API calls - uses ONNX runtime with all-MiniLM-L6-v2
- 🐘 **PostgreSQL + pgvector**: Production-grade vector storage with HNSW indexing
- 🔌 **MCP Protocol**: Streamable HTTP plus legacy HTTP+SSE compatibility
- 🔐 **Shared Memory by Token**: Agents using the same API token share memory visibility while retaining source-agent provenance
- ♻️ **Deduplicated Ingest**: Near-duplicate facts are merged instead of stored repeatedly
- ⚡ **High Performance**: Rust implementation with async I/O
- 🔍 **Truth Engine**: Optional neuro-symbolic truth scoring using PLN deduction and ECAN attention economics (based on Bushidai Truth Simulator by TS87)

## MCP Tools

| Tool | Description |
|------|-------------|
| `store` | Store a memory with automatic embedding generation, optional TTL, and automatic deduplication |
| `batch_store` | Store 1-50 memories atomically in a single call with the same deduplication rules |
| `query` | Search memories by semantic similarity, optionally filtering by source agent |
| `purge` | Delete memories visible to the current API token, optionally filtering by source agent or time range |
| `evaluate` | On-demand truth scoring of a claim using neuro-symbolic reasoning (requires Truth Engine) |
| `truth_status` | Get Truth Engine scoring statistics and coverage metrics (requires Truth Engine) |

## Quick Start

### Prerequisites

- Rust 1.75+
- PostgreSQL 14+ with pgvector extension
- ONNX model files (all-MiniLM-L6-v2)

### Database Setup

```sql
CREATE ROLE openbrain_svc LOGIN PASSWORD 'change-me';
CREATE DATABASE openbrain OWNER openbrain_svc;
\c openbrain
CREATE EXTENSION IF NOT EXISTS vector;
```

Use the same PostgreSQL role for the app and for migrations. Do not create the
`memories` table manually as `postgres` or another owner and then run
OpenBrain as `openbrain_svc`, because later `ALTER TABLE` migrations will fail
with `must be owner of table memories`.

### Configuration

```bash
cp .env.example .env
# Edit .env with your database credentials
```

### Build & Run

```bash
cargo build --release
./target/release/openbrain-mcp migrate
./target/release/openbrain-mcp
```

### Database Migrations

This project uses `refinery` with embedded SQL migrations in `migrations/`.

Run pending migrations explicitly before starting or restarting the service:

```bash
./target/release/openbrain-mcp migrate
```

If you use the deploy script or CI workflow in `.gitea/deploy.sh` and `.gitea/workflows/ci-cd.yaml`, they already run this for you.

### E2E Test Modes

The end-to-end test suite supports two modes:

- Local mode: default. Assumes the test process can manage schema setup against a local PostgreSQL instance and, for one auth-only test, spawn a local `openbrain-mcp` child process.
- Remote mode: set `OPENBRAIN_E2E_REMOTE=true` and point `OPENBRAIN_E2E_BASE_URL` at a deployed server such as `http://your-server.example.com:3100` or `https://memory.example.com`. In this mode the suite does not try to create schema locally and skips the local process auth smoke test.

Recommended env for VPS-backed runs:

```bash
OPENBRAIN_E2E_REMOTE=true
OPENBRAIN_E2E_BASE_URL=https://memory.example.com
OPENBRAIN__AUTH__ENABLED=true
```

### TTL / Expiry

Transient facts can be stored with an optional `ttl` string on `store`, or on
either the batch itself or individual entries for `batch_store`.

Supported units:

- `s` seconds
- `m` minutes
- `h` hours
- `d` days
- `w` weeks

Examples:

- `30s`
- `15m`
- `1h`
- `7d`

Expired memories are filtered from `query` immediately, even before the
background cleanup loop deletes them physically. The cleanup interval is
configured with `OPENBRAIN__TTL__CLEANUP_INTERVAL_SECONDS` and defaults to 300.

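To make the TTL format concrete, here is a minimal Python sketch of how a client might validate a TTL string before sending it. It assumes a TTL is a single non-negative integer followed by one of the unit letters listed above; whether the server also accepts whitespace, compound values such as `1h30m`, or zero is not stated here, so treat `parse_ttl` and `is_expired` as hypothetical helpers rather than the server's parser.

```python
import re

# Unit suffixes listed in this README, mapped to seconds.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}
# Assumption: one integer followed by exactly one unit letter.
_TTL_PATTERN = re.compile(r"(\d+)([smhdw])")


def parse_ttl(ttl: str) -> int:
    """Convert a TTL string such as '15m' into a number of seconds."""
    match = _TTL_PATTERN.fullmatch(ttl.strip())
    if match is None:
        raise ValueError(f"unsupported TTL string: {ttl!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


def is_expired(stored_at: float, ttl: str, now: float) -> bool:
    """Return True once `now` reaches stored_at + ttl (all in epoch seconds)."""
    return now >= stored_at + parse_ttl(ttl)
```

Validating on the client side gives an early, readable error instead of a rejected tool call.
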
After `main` deploys, the CI workflow runs the e2e suite in remote mode (see E2E Test Modes), so e2e coverage validates the VPS deployment rather than the local runner host. It generates a random per-run e2e key, temporarily appends it to the deployed `OPENBRAIN__AUTH__API_KEYS`, runs the suite, then removes the key and restarts the service.

For live deployments, keep `OPENBRAIN__AUTH__API_KEYS` for persistent non-test access only. The server accepts a comma-separated key list, so a practical split is:

- `prod_live_key` for normal agent traffic
- `smoke_test_key` for ad hoc diagnostics

In Gitea Actions, that means:

- repo secret `OPENBRAIN__AUTH__API_KEYS=prod_live_key,smoke_test_key`

If you want prod e2e coverage without leaving a standing CI key on the server, the workflow-generated ephemeral key handles that automatically.

### Deduplication on Ingest

OpenBrain checks every `store` and `batch_store` write for an existing memory in
the same API-token scope and same source `agent_id` whose vector similarity
meets the configured dedup threshold.

Default behavior:

- deduplication is always on
- only same-agent memories are considered
- expired memories are ignored
- if a duplicate is found, the existing memory is refreshed instead of inserting a new row
- metadata is merged with new keys overriding old values
- `created_at` is updated to `now()`
- `expires_at` is preserved unless the new write supplies a fresh TTL

Configure the threshold with either:

- `OPENBRAIN__DEDUP__THRESHOLD=0.90`
- `DEDUP_THRESHOLD=0.90`

Tool responses expose whether a write deduplicated an existing row via the
`deduplicated` flag. `batch_store` also returns a `status` of either
`stored` or `deduplicated` per entry.

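The rules above can be illustrated with a small in-memory Python sketch. It is not the server's implementation (which runs in Rust against PostgreSQL): cosine similarity is assumed as the similarity measure, the record layout (`agent_id`, `embedding`, `metadata`, `expires_at`, `created_at`) is hypothetical, and the sketch leaves the stored content untouched on a duplicate because this README does not say whether it is replaced.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def store_with_dedup(memories, new, threshold=0.90, now=0.0):
    """Insert `new`, or refresh the best same-agent match at or above `threshold`."""
    best, best_score = None, threshold
    for existing in memories:
        if existing["agent_id"] != new["agent_id"]:
            continue  # only same-agent memories are considered
        expires_at = existing.get("expires_at")
        if expires_at is not None and expires_at <= now:
            continue  # expired memories are ignored
        score = cosine_similarity(existing["embedding"], new["embedding"])
        if score >= best_score:
            best, best_score = existing, score
    if best is None:
        memories.append({**new, "created_at": now})
        return {"deduplicated": False}
    # Merge metadata: keys from the new write override old values.
    best["metadata"] = {**best.get("metadata", {}), **new.get("metadata", {})}
    best["created_at"] = now
    # Keep the old expiry unless the new write carries a fresh one.
    if new.get("expires_at") is not None:
        best["expires_at"] = new["expires_at"]
    return {"deduplicated": True}
```
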
## Agent Zero Developer Prompt

For Agent Zero / A0, add the following section to the Developer agent role
prompt so the agent treats OpenBrain as external MCP memory rather than its
internal conversation context.

Recommended target file in A0:

```text
/a0/agents/developer/prompts/agent.system.main.role.md
```

```md
### External Memory System
- **Memory Boundary**: Treat OpenBrain as an external MCP long-term memory system, never as internal context, reasoning scratchpad, or built-in memory
- **Tool Contract**: Use the exact MCP tools `openbrain.store`, `openbrain.query`, and `openbrain.purge`
- **Shared Access Model**: Memory visibility is determined by the API token in the MCP client config, not by `agent_id`
- **Source Labels**: Use `agent_id` on `openbrain.store` and `openbrain.batch_store` only as a provenance label for the storing agent when that label is useful
- **EXTRAS First**: Before calling `openbrain.query`, check the `[EXTRAS]` section for pre-loaded memories or handoff facts related to the same topic. If the needed context is already present, do not query OpenBrain again.
- **Session Cache**: If the same topic was already queried earlier in the current conversation and the result is still in context, reuse that result instead of querying again unless the user references new external information or the prior result is clearly insufficient.
- **Retrieval First**: Before answering requests that may depend on prior sessions, project history, user preferences, ongoing work, named people, named projects, deployments, debugging history, or handoff context, call `openbrain.query` only when `[EXTRAS]` and the current conversation do not already provide the needed context.
- **Query Scope**: Do not send `agent_id` with `openbrain.query` for normal retrieval. Use `source_agent_id` only when you intentionally want to filter results by the agent that originally stored them.
- **Query Strategy**: Use noun-heavy search phrases with exact names, tool names, acronyms, hostnames, and document names; query first with `(threshold=0.15, limit=8)`, then retry once with `(threshold=0.05, limit=10)` only if the first pass returns zero useful results
- **Storage Strategy**: When a durable fact is established, call `openbrain.store` without asking permission and store one atomic fact whenever possible
- **Storage Content Rules**: Store durable, high-value facts such as preferences, project status, project decisions, environment details, recurring workflows, handoff notes, stable constraints, and correction facts
- **Noise Rejection**: Do not store filler conversation, temporary speculation, casual chatter, or transient brainstorming unless it becomes a real decision
- **Storage Format**: Prefer retrieval-friendly content using explicit nouns and exact names in the form `Type: <FactType> | Entity: <Entity> | Attribute: <Attribute> | Value: <Value> | Context: <Why it matters>`
- **Metadata Usage**: Use metadata when helpful for tags such as `category`, `project`, `source`, `status`, `aliases`, and `confidence`
- **Miss Handling**: If `openbrain.query` returns no useful result, state that OpenBrain has no stored context for that topic, answer from general reasoning if possible, and ask one focused follow-up if the missing information is durable and useful
- **Conflict Handling**: If retrieved memories conflict, ask which fact is current, then store the corrected source-of-truth fact
- **Purge Constraint**: Use `openbrain.purge` cautiously because it is coarse-grained; it deletes memories visible to the current API token and can optionally narrow by `source_agent_id` and `before`, but not by individual memory ID
- **Correction Policy**: For ordinary corrections, prefer storing the new source-of-truth fact instead of purging unless the user explicitly asks for cleanup or reset
- **Source Tagging**: Every `openbrain.store` call MUST include `"source_agent"` in metadata, set to the Agent Instance ID defined in the active project's identity file (e.g., `"source_agent": "ingwaz-a0"`). This enables tracing facts back to the originating agent instance.
```

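The two-pass Query Strategy in the prompt can be sketched in Python as follows. The `query_fn` callable is a hypothetical stand-in for whatever function your client uses to call `openbrain.query`; the sketch also simplifies "zero useful results" to "an empty result list", since judging usefulness is left to the agent.

```python
from typing import Callable, Dict, List

# Hypothetical signature: (query text, threshold, limit) -> list of memories.
QueryFn = Callable[[str, float, int], List[Dict]]


def two_pass_query(query_fn: QueryFn, text: str) -> List[Dict]:
    """First pass at (0.15, 8); retry once at (0.05, 10) only if nothing came back."""
    results = query_fn(text, 0.15, 8)
    if results:
        return results
    # Single, looser retry. No further passes after this one.
    return query_fn(text, 0.05, 10)
```

Capping the retry at one extra call keeps retrieval cost bounded while still catching loosely related memories.
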
## Agent Identity & Source Tagging

When multiple agent instances interact with the same OpenBrain server, every stored
fact must be traceable to the agent that created it. This is achieved through a
required `source_agent` metadata field.

### Setting Up Agent Identity

1. **Choose a unique agent name**: a short, human-readable identifier for your
   agent instance (e.g., `ingwaz-a0`, `prod-deploy-bot`, `research-agent-west`).

2. **Persist the identity**: create an identity file that is automatically
   injected into the agent's system prompt. For Agent Zero, use a
   `.promptinclude.md` file in the active project's working directory:

   ```markdown
   # Agent Identity
   - **Agent Instance ID**: `your-agent-name`
   - **Platform**: Agent Zero 1.3 on Pop!_OS Linux
   - When storing facts to OpenBrain (`openbrain.store`), always include
     `"source_agent": "your-agent-name"` in metadata
   ```

   For other frameworks, use whatever mechanism injects persistent context into
   every conversation (environment variables, system prompt includes, etc.).

3. **Add a prompt directive**: in the agent's role prompt, add:
   > Every `openbrain.store` call MUST include `"source_agent"` in metadata,
   > set to the Agent Instance ID defined in the identity file.

### Source Tagging Rules

- Every `openbrain.store` call **MUST** include `"source_agent": "<agent-id>"` in metadata
- The `source_agent` value must match the Agent Instance ID exactly
- The `agent_id` parameter (`openbrain`) is the shared namespace; `source_agent` in metadata identifies *who* stored the fact
- When querying, use `source_agent` metadata to filter or attribute facts if needed

### Example

```json
{
  "agent_id": "openbrain",
  "content": "Type: Decision | Entity: Some API | Attribute: auth-method | Value: JWT with refresh tokens | Context: Chosen over session cookies for mobile app support",
  "metadata": {
    "source_agent": "some-a0",
    "category": "project-decision",
    "project": "someproject",
    "status": "active"
  }
}
```

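One way to keep the tagging rule from being forgotten is to build `store` arguments through a small helper that refuses to run without a `source_agent`. The Python sketch below is a client-side convenience under that assumption; `format_fact` and `build_store_arguments` are hypothetical names, and the server itself is not described here as enforcing the tag.

```python
def format_fact(fact_type, entity, attribute, value, context):
    """Render a fact in the pipe-delimited storage format recommended above."""
    return (
        f"Type: {fact_type} | Entity: {entity} | Attribute: {attribute} "
        f"| Value: {value} | Context: {context}"
    )


def build_store_arguments(content, source_agent, agent_id="openbrain", **metadata):
    """Build `store` arguments, always tagging metadata with `source_agent`."""
    if not source_agent:
        raise ValueError("source_agent is required on every store call")
    return {
        "agent_id": agent_id,
        "content": content,
        # Extra keyword arguments become metadata tags (category, project, ...).
        "metadata": {**metadata, "source_agent": source_agent},
    }
```

Calling `build_store_arguments(format_fact(...), "some-a0", category="project-decision")` yields a payload shaped like the JSON example in this section.
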
### Multi-Agent Scenarios

| Scenario | Guidance |
|---|---|
| Single agent instance | Still tag `source_agent` for future-proofing |
| Multiple agents, same project | Each agent uses its own unique `source_agent` value |
| Agent replacement / migration | New agent gets a new name; old facts remain attributed to the old agent |
| Subordinate agents | Inherit the parent agent's `source_agent` value unless independently identified |

## MCP Integration

OpenBrain exposes both MCP HTTP transports:

```
Streamable HTTP Endpoint: http://localhost:3100/mcp
Legacy SSE Endpoint:      http://localhost:3100/mcp/sse
Legacy Message Endpoint:  http://localhost:3100/mcp/message
Health Check:             http://localhost:3100/mcp/health
```

Use the streamable HTTP endpoint for modern clients such as Codex. Keep the
legacy SSE endpoints for older MCP clients that still use the deprecated
2024-11-05 HTTP+SSE transport.

Header roles:

- `X-API-Key` carries the API token, which determines memory visibility: if two
  clients use the same API token, they can read and write the same OpenBrain
  memories.
- `X-Agent-ID` is an optional source-agent label for logs and store provenance.
  It does not control memory visibility.
- `X-Agent-Type` is an optional client-program label such as `agent-zero`,
  `codex`, or `claude-code`. It does not select transport server-side; the URL
  path does that.
- `agent_id` on `store` and `batch_store` is provenance. `source_agent_id` on
  `query` and `purge` is an optional provenance filter.

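As a rough illustration of how the endpoint and headers fit together, the Python sketch below builds (but does not send) a `tools/call` request for the streamable HTTP endpoint using only the standard library. `build_tool_call` is a hypothetical helper. The `Accept` header follows the MCP streamable HTTP convention, and a real client would normally complete the MCP `initialize` handshake before calling tools, which is omitted here.

```python
import json
import urllib.request


def build_tool_call(base_url, api_key, name, arguments, request_id=1,
                    agent_id=None, agent_type=None):
    """Build a POST request for the streamable HTTP endpoint without sending it."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
        "X-API-Key": api_key,  # determines which memories are visible
    }
    if agent_id:
        headers["X-Agent-ID"] = agent_id  # optional provenance label
    if agent_type:
        headers["X-Agent-Type"] = agent_type  # optional client-program label
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/mcp",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected or tested offline.
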
### Example: Codex Configuration

```toml
[mcp_servers.openbrain]
url = "https://memory.example.com/mcp"
http_headers = { "X-API-Key" = "YOUR_OPENBRAIN_API_KEY", "X-Agent-ID" = "codex-desktop", "X-Agent-Type" = "codex" }
```

### Example: Agent Zero Configuration

```json
{
  "mcpServers": {
    "openbrain": {
      "url": "https://memory.example.com/mcp/sse",
      "headers": {
        "X-API-Key": "YOUR_OPENBRAIN_API_KEY",
        "X-Agent-ID": "agent-zero",
        "X-Agent-Type": "agent-zero"
      }
    }
  }
}
```

Agent Zero should keep using the legacy HTTP+SSE transport unless and until its
client runtime supports streamable HTTP. Codex should use `/mcp`. If both
clients use the same API token, they already share memory visibility.

### Example: Store a Memory

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "store",
    "arguments": {
      "content": "The user prefers dark mode and uses vim keybindings",
      "agent_id": "agent-zero",
      "ttl": "7d",
      "metadata": {"source": "preferences"}
    }
  }
}
```

### Example: Query Memories

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "query",
    "arguments": {
      "query": "What are the user's editor preferences?",
      "limit": 5,
      "threshold": 0.6
    }
  }
}
```

### Example: Batch Store Memories

```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "batch_store",
    "arguments": {
      "agent_id": "codex",
      "entries": [
        {
          "content": "The user prefers dark mode",
          "ttl": "24h",
          "metadata": {"category": "preference"}
        },
        {
          "content": "The user uses vim keybindings",
          "metadata": {"category": "preference"}
        }
      ]
    }
  }
}
```

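Because `batch_store` accepts 1-50 entries per call, a client holding more facts has to split them. The Python sketch below shows one way to do that; `build_batch_requests` is a hypothetical helper that only builds JSON-RPC payloads. Note that atomicity applies per call, so entries split across several requests are no longer stored as one atomic unit.

```python
MAX_BATCH = 50  # batch_store accepts 1-50 entries per call


def build_batch_requests(entries, agent_id, first_id=1):
    """Split `entries` into batch_store payloads of at most MAX_BATCH entries each."""
    if not entries:
        raise ValueError("batch_store needs at least one entry")
    requests = []
    for offset in range(0, len(entries), MAX_BATCH):
        chunk = entries[offset:offset + MAX_BATCH]
        requests.append({
            "jsonrpc": "2.0",
            "id": first_id + len(requests),  # consecutive ids, one per chunk
            "method": "tools/call",
            "params": {
                "name": "batch_store",
                "arguments": {"agent_id": agent_id, "entries": chunk},
            },
        })
    return requests
```
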
## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                          AI Agent                           │
└──────────────────────────────┬──────────────────────────────┘
                               │ MCP Protocol (Streamable HTTP / Legacy SSE)
┌──────────────────────────────┴──────────────────────────────┐
│                    OpenBrain MCP Server                     │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │      store      │ │      query      │ │      purge      │ │
│ └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ │
│          │                   │                   │          │
│ ┌────────▼───────────────────▼───────────────────▼────────┐ │
│ │                 Embedding Engine (ONNX)                 │ │
│ │                 all-MiniLM-L6-v2 (384d)                 │ │
│ └────────────────────────────┬────────────────────────────┘ │
│                              │                              │
│ ┌────────────────────────────▼────────────────────────────┐ │
│ │                  PostgreSQL + pgvector                  │ │
│ │               HNSW Index for fast search                │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `OPENBRAIN__SERVER__HOST` | `0.0.0.0` | Server bind address |
| `OPENBRAIN__SERVER__PORT` | `3100` | Server port |
| `OPENBRAIN__DATABASE__HOST` | `localhost` | PostgreSQL host |
| `OPENBRAIN__DATABASE__PORT` | `5432` | PostgreSQL port |
| `OPENBRAIN__DATABASE__NAME` | `openbrain` | Database name |
| `OPENBRAIN__DATABASE__USER` | - | Database user |
| `OPENBRAIN__DATABASE__PASSWORD` | - | Database password |
| `OPENBRAIN__EMBEDDING__MODEL_PATH` | `models/all-MiniLM-L6-v2` | ONNX model path |
| `OPENBRAIN__AUTH__ENABLED` | `false` | Enable API key auth |

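The variable names follow a `PREFIX__SECTION__KEY` pattern. The Python sketch below shows how such names could be read as a nested configuration. That double underscores denote nesting is an assumption inferred from the naming, and `nested_config` is a hypothetical helper, not the server's actual loader.

```python
def nested_config(environ, prefix="OPENBRAIN", separator="__"):
    """Group PREFIX__SECTION__KEY variables into a nested dict of lowercase keys."""
    config = {}
    for name, value in environ.items():
        parts = name.split(separator)
        if parts[0] != prefix or len(parts) < 2:
            continue  # not one of our variables
        node = config
        for key in parts[1:-1]:
            node = node.setdefault(key.lower(), {})
        node[parts[-1].lower()] = value  # values stay as raw strings
    return config
```

For example, `OPENBRAIN__SERVER__PORT=3100` would land at `config["server"]["port"]`, still as the string `"3100"`.
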
## Truth Engine

### Overview

The Truth Engine is an integrated neuro-symbolic reasoning system that continuously scores stored memories for truthfulness and reliability. Based on the **Bushidai Truth Simulator** by Thijs Smits (TS87), it runs as a background worker within the OpenBrain process, evaluating memories against each other using cross-referencing, deductive logic, and attention economics.

Core components:

- **PLN (Probabilistic Logic Networks)**: Deductive reasoning engine that evaluates logical consistency between memories, computing truth values and confidence scores based on evidence overlap and inferential strength.
- **ECAN (Economic Attention Network)**: Attention economy module that manages importance and relevance of memories over time, applying decay rates and spreading activation to maintain a dynamic salience map.
- **Cross-Reference Engine**: Finds semantically related memories using vector similarity and evaluates corroboration or contradiction patterns across the memory corpus.

The Truth Engine is entirely optional and disabled by default. When enabled, it enriches every stored memory with truth metadata without affecting query latency.

### Truth Engine Configuration

All Truth Engine settings use the `OPENBRAIN__TRUTH__*` environment variable prefix:

| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| `OPENBRAIN__TRUTH__ENABLED` | bool | `false` | Enable or disable the Truth Engine background worker and tools |
| `OPENBRAIN__TRUTH__SCORING_INTERVAL_SECONDS` | u64 | `300` | How often (in seconds) the background worker runs a scoring cycle |
| `OPENBRAIN__TRUTH__BATCH_SIZE` | usize | `50` | Maximum number of unscored memories to process per cycle |
| `OPENBRAIN__TRUTH__RESCORE_AFTER_SECONDS` | u64 | `86400` | Re-score memories older than this threshold (seconds) to keep truth values current |
| `OPENBRAIN__TRUTH__PLN_BASE_CONFIDENCE` | f32 | `0.85` | Base confidence for PLN deductive reasoning when no prior evidence exists |
| `OPENBRAIN__TRUTH__ECAN_DECAY_RATE` | f32 | `0.95` | ECAN attention decay rate per scoring cycle (closer to 1.0 = slower decay) |
| `OPENBRAIN__TRUTH__ECAN_SPREAD_FACTOR` | f32 | `0.3` | How much attention spreads to semantically related memories (0.0–1.0) |
| `OPENBRAIN__TRUTH__CONTRADICTION_THRESHOLD` | f32 | `0.85` | Similarity threshold above which conflicting memories are flagged as contradictions |
| `OPENBRAIN__TRUTH__VERIFICATION_THRESHOLD` | f32 | `0.7` | Minimum corroboration score required to classify a memory as "verified" |
| `OPENBRAIN__TRUTH__CROSS_REF_LIMIT` | usize | `10` | Maximum number of related memories to cross-reference per scored memory |

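To get a feel for what the decay rate means in wall-clock terms, the Python sketch below estimates how many scoring cycles it takes for attention to halve. It assumes attention is simply multiplied by `ECAN_DECAY_RATE` once per cycle, with no spreading or re-stimulation, which is a simplification and not a description of the engine's actual update rule.

```python
import math


def cycles_to_fraction(decay_rate, fraction):
    """Cycles needed for attention to fall to `fraction` of its starting value."""
    if not (0.0 < decay_rate < 1.0 and 0.0 < fraction < 1.0):
        raise ValueError("decay_rate and fraction must both be between 0 and 1")
    # Solve decay_rate ** n == fraction for n.
    return math.log(fraction) / math.log(decay_rate)


# Defaults from the table above: decay 0.95 per cycle, one cycle every 300 s.
half_life_cycles = cycles_to_fraction(0.95, 0.5)
half_life_minutes = half_life_cycles * 300 / 60
```

Under these assumptions the default settings give a half-life of roughly 13.5 cycles, or a little over an hour.
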
### New MCP Tools

#### `evaluate`

On-demand truth scoring of a claim. Use this to evaluate a specific statement against the existing memory corpus without waiting for the background worker cycle.

**Parameters:**

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `claim` | string | Yes | The claim or statement to evaluate for truthfulness |
| `context` | string | No | Optional additional context to inform the evaluation |

**Returns:**

```json
{
  "truth_value": 0.82,
  "truth_confidence": 0.91,
  "truth_category": "verified",
  "reasoning": [
    "Found 4 corroborating memories with high semantic similarity",
    "PLN deduction yielded truth value 0.82 with confidence 0.91",
    "No contradicting memories detected"
  ]
}
```

**Example MCP call:**

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "evaluate",
    "arguments": {
      "claim": "The deployment uses JWT with refresh tokens for authentication",
      "context": "HetLife API project"
    }
  }
}
```

#### `truth_status`

Get current Truth Engine scoring statistics and coverage metrics. Takes no parameters.

**Returns:**

```json
{
  "enabled": true,
  "total_memories": 1250,
  "scored_memories": 1100,
  "unscored_memories": 150,
  "coverage": 0.88,
  "categories": {
    "verified": 420,
    "plausible": 510,
    "unverified": 150,
    "contradicted": 20
  },
  "scoring_config": {
    "interval_seconds": 300,
    "batch_size": 50,
    "rescore_after_seconds": 86400
  }
}
```

**Example MCP call:**

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "truth_status",
    "arguments": {}
  }
}
```

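The sample `truth_status` response is internally consistent: 1100 scored plus 150 unscored equals 1250 total, the category counts sum to 1100, and 1100 / 1250 = 0.88. The Python sketch below checks those relations on a response. They are inferred from the sample numbers rather than documented guarantees, so `check_truth_status` is a hypothetical monitoring aid.

```python
def check_truth_status(status, tolerance=0.005):
    """Return a list of consistency problems found in a truth_status payload."""
    total = status["total_memories"]
    scored = status["scored_memories"]
    problems = []
    if scored + status["unscored_memories"] != total:
        problems.append("scored + unscored != total")
    if sum(status["categories"].values()) != scored:
        problems.append("category counts do not sum to scored_memories")
    # Assumed relation: coverage is the scored share of all memories.
    expected = scored / total if total else 0.0
    if abs(status["coverage"] - expected) > tolerance:
        problems.append("coverage != scored / total")
    return problems
```

An empty list means the payload matches the assumed relations.
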
### Enhanced Query Results

When the Truth Engine is enabled, `query` results include additional truth metadata fields on each returned memory:

| Field | Type | Description |
|-------|------|-------------|
| `truth_value` | f32 \| null | Computed truth value (0.0–1.0), where 1.0 indicates highest truthfulness |
| `truth_confidence` | f32 \| null | Confidence in the truth value (0.0–1.0), reflecting the quality and quantity of evidence |
| `truth_category` | string \| null | Classification: `verified`, `plausible`, `unverified`, or `contradicted` |

These fields are `null` for memories that have not yet been scored by the background worker. Query performance is not affected, because truth fields are read from pre-computed columns rather than calculated at query time.

**Example query response with truth fields:**

```json
{
  "content": "The API uses PostgreSQL 16 with pgvector extension",
  "similarity": 0.89,
  "truth_value": 0.94,
  "truth_confidence": 0.87,
  "truth_category": "verified",
  "metadata": { "source_agent": "ingwaz-a0", "project": "openbrain" }
}
```

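Since the truth fields can be `null`, client code that filters on them needs an explicit policy for unscored memories. The Python sketch below shows one possible policy; `filter_by_truth` and its defaults are hypothetical choices, not behavior provided by the server.

```python
def filter_by_truth(results, allowed=("verified", "plausible"),
                    min_confidence=0.0, keep_unscored=True):
    """Keep results whose truth category is allowed; handle unscored ones explicitly."""
    kept = []
    for item in results:
        category = item.get("truth_category")
        if category is None:
            # Not yet scored by the background worker.
            if keep_unscored:
                kept.append(item)
            continue
        confidence = item.get("truth_confidence") or 0.0
        if category in allowed and confidence >= min_confidence:
            kept.append(item)
    return kept
```

Keeping unscored results by default avoids hiding fresh memories that the worker has not reached yet.
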
### How It Works

The Truth Engine operates as a non-blocking background worker loop:

```
┌──────────────────────────────────────────────────────────────┐
│                   Truth Engine Worker Loop                   │
│                                                              │
│  1. Fetch unscored/stale memories (batch_size per cycle)     │
│     │                                                        │
│  2. Cross-reference: find related memories via vector        │
│     similarity (up to cross_ref_limit per memory)            │
│     │                                                        │
│  3. PLN Score: deductive reasoning over evidence             │
│     → truth_value + truth_confidence                         │
│     │                                                        │
│  4. ECAN Update: adjust attention/importance weights         │
│     → spread activation to related memories                  │
│     │                                                        │
│  5. Categorize: assign truth_category based on thresholds    │
│     │                                                        │
│  6. Write back: batch update truth scores to database        │
│     │                                                        │
│  7. Sleep for scoring_interval_seconds                       │
└──────────────────────────────────────────────────────────────┘
```

**Truth Categories:**

| Category | Description |
|----------|-------------|
| `verified` | Corroborated by multiple related memories above the verification threshold |
| `plausible` | Some supporting evidence but below the verification threshold |
| `unverified` | Insufficient related memories to form a judgment |
| `contradicted` | Conflicts with one or more highly similar memories above the contradiction threshold |

**Key Design Properties:**

- **Non-blocking**: The worker runs in its own async task and never blocks query or store operations
- **Idempotent**: Re-scoring a memory converges to the same result given the same corpus state
- **Batch-oriented**: Processes memories in configurable batches to bound resource usage
- **Self-healing**: Stale scores are automatically refreshed after `rescore_after_seconds`
- **Graceful startup**: Waits for the embedding engine to be ready before starting the first cycle

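Step 1 of the loop, together with the batch-oriented and self-healing properties, can be pictured with the Python sketch below. It is a guess at the selection logic under stated assumptions: `scored_at` is a hypothetical per-memory timestamp (absent or `None` when unscored), and taking unscored memories before stale ones is an illustrative ordering, not documented behavior.

```python
def select_batch(memories, now, batch_size=50, rescore_after_seconds=86400):
    """Pick up to `batch_size` memories that are unscored or whose score is stale."""
    unscored = [m for m in memories if m.get("scored_at") is None]
    stale = [
        m for m in memories
        if m.get("scored_at") is not None
        and now - m["scored_at"] > rescore_after_seconds
    ]
    stale.sort(key=lambda m: m["scored_at"])  # oldest scores first
    # Assumed priority: never-scored memories before stale ones.
    return (unscored + stale)[:batch_size]
```

Capping the result at `batch_size` is what bounds the work done in each cycle, whatever the corpus size.
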
### Truth Engine Quick Start

Enable truth scoring with minimal configuration:

```bash
# Enable truth scoring
export OPENBRAIN__TRUTH__ENABLED=true
export OPENBRAIN__TRUTH__SCORING_INTERVAL_SECONDS=300
export OPENBRAIN__TRUTH__BATCH_SIZE=50
```

For production tuning, consider adjusting these based on your memory corpus size:

```bash
# Production tuning example
export OPENBRAIN__TRUTH__ENABLED=true
export OPENBRAIN__TRUTH__SCORING_INTERVAL_SECONDS=120   # Score more frequently
export OPENBRAIN__TRUTH__BATCH_SIZE=100                 # Larger batches
export OPENBRAIN__TRUTH__RESCORE_AFTER_SECONDS=43200    # Re-score every 12h
export OPENBRAIN__TRUTH__CROSS_REF_LIMIT=20             # More cross-references
export OPENBRAIN__TRUTH__VERIFICATION_THRESHOLD=0.75    # Stricter verification
```

Run the database migration to add truth columns, then start the server:

```bash
./target/release/openbrain-mcp migrate
./target/release/openbrain-mcp
```

Verify the Truth Engine is running:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "truth_status",
    "arguments": {}
  }
}
```

## License

This project is licensed under the [GNU General Public License v3.0](LICENSE).