mirror of
https://gitea.ingwaz.work/Ingwaz/openbrain-mcp.git
synced 2026-03-31 14:49:06 +00:00
Add server-side deduplication on ingest
This commit is contained in:
30
README.md
30
README.md
@@ -11,14 +11,15 @@ OpenBrain is a Model Context Protocol (MCP) server that provides AI agents with
|
||||
- 🐘 **PostgreSQL + pgvector**: Production-grade vector storage with HNSW indexing
|
||||
- 🔌 **MCP Protocol**: Streamable HTTP plus legacy HTTP+SSE compatibility
|
||||
- 🔐 **Multi-Agent Support**: Isolated memory namespaces per agent
|
||||
- ♻️ **Deduplicated Ingest**: Near-duplicate facts are merged instead of stored repeatedly
|
||||
- ⚡ **High Performance**: Rust implementation with async I/O
|
||||
|
||||
## MCP Tools
|
||||
|
||||
| Tool | Description |
|
||||
|------|-------------|
|
||||
| `store` | Store a memory with automatic embedding generation and optional TTL for transient facts |
|
||||
| `batch_store` | Store 1-50 memories atomically in a single call |
|
||||
| `store` | Store a memory with automatic embedding generation, optional TTL, and automatic deduplication |
|
||||
| `batch_store` | Store 1-50 memories atomically in a single call with the same deduplication rules |
|
||||
| `query` | Search memories by semantic similarity |
|
||||
| `purge` | Delete memories by agent ID or time range |
|
||||
|
||||
@@ -123,6 +124,31 @@ In Gitea Actions, that means:
|
||||
|
||||
If you want prod e2e coverage without leaving a standing CI key on the server, the workflow-generated ephemeral key handles that automatically.
|
||||
|
||||
### Deduplication on Ingest
|
||||
|
||||
OpenBrain checks every `store` and `batch_store` write for an existing memory in
|
||||
the same `agent_id` namespace whose vector similarity meets the configured
|
||||
dedup threshold.
|
||||
|
||||
Default behavior:
|
||||
|
||||
- deduplication is always on
|
||||
- only same-agent memories are considered
|
||||
- expired memories are ignored
|
||||
- if a duplicate is found, the existing memory is refreshed instead of inserting a new row
|
||||
- metadata is merged with new keys overriding old values
|
||||
- `created_at` is updated to `now()`
|
||||
- `expires_at` is preserved unless the new write supplies a fresh TTL
|
||||
|
||||
Configure the threshold with either:
|
||||
|
||||
- `OPENBRAIN__DEDUP__THRESHOLD=0.90`
|
||||
- `DEDUP_THRESHOLD=0.90`
|
||||
|
||||
Tool responses expose whether a write deduplicated an existing row via the
|
||||
`deduplicated` flag. `batch_store` also returns a `status` of either
|
||||
`stored` or `deduplicated` per entry.
|
||||
|
||||
## Agent Zero Developer Prompt
|
||||
|
||||
For Agent Zero / A0, add the following section to the Developer agent role
|
||||
|
||||
Reference in New Issue
Block a user