
CameoDB – An open-source, shared-nothing hybrid-search database in Rust

Posted by gorancv | 3 hours ago | 1 comment


Hi HN,

Today, I'm open-sourcing CameoDB after nine months of hacking on it week by week as a side project.

I built this because I wanted a simple, scalable platform that serves as a unified knowledge base and context source for both users and AI tools. Specifically, I was frustrated with the "dual-write" problem and the operational overhead of running separate clusters for transactional storage and full-text search while trying to build reliable, real-time context for AI applications.

Under the hood:

It’s a high-performance, distributed, shared-nothing database written in Rust (2024 edition) and built on an actor-model architecture (Kameo).

Storage Engine: It combines ACID-compliant Key-Value storage (Redb), flexible document modeling, and Full-Text Search (Tantivy) into a single atomic unit. If a document is durably stored in the WAL, it is instantly indexed for search.
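
To make the "single atomic unit" contract concrete, here is a toy in-memory sketch. It is not the CameoDB code and does not use Redb or Tantivy; `Store`, `put`, `search`, and `tokenize` are hypothetical names, and a real engine needs transactions and crash recovery to get actual atomicity. It only shows the shape of the API: one write path updates both the document store and the search index, so there is no second write that can be forgotten or fail separately.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

/// Toy model: documents and their inverted index live behind one write path.
#[derive(Default)]
struct Store {
    docs: BTreeMap<String, String>,            // id -> document body
    index: HashMap<String, BTreeSet<String>>,  // term -> ids containing it
}

impl Store {
    /// Store (or overwrite) a document and update the index in the same call.
    fn put(&mut self, id: &str, body: &str) {
        // If the id already existed, drop the postings of the old body first.
        if let Some(old) = self.docs.insert(id.to_string(), body.to_string()) {
            for term in tokenize(&old) {
                if let Some(ids) = self.index.get_mut(&term) {
                    ids.remove(id);
                }
            }
        }
        for term in tokenize(body) {
            self.index.entry(term).or_default().insert(id.to_string());
        }
    }

    /// Return the ids of all documents containing `term` (case-insensitive).
    fn search(&self, term: &str) -> Vec<&str> {
        self.index
            .get(&term.to_lowercase())
            .map(|ids| ids.iter().map(String::as_str).collect())
            .unwrap_or_default()
    }
}

/// Split on non-alphanumeric characters and lowercase each term.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(str::to_lowercase)
        .collect()
}
```

With this model, a `put("doc-1", "Hybrid search in Rust")` followed by `search("rust")` finds `doc-1` with no separate indexing step.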

Data Ingestion & Schema Inference: It includes a native data loader that can fetch from local files, remote URLs, and web targets. It detects schemas automatically, so you can go from raw data to a fully searchable index without writing a schema by hand.

Concurrency: Storage I/O runs through spawn_blocking, which keeps it off the Tokio/Axum async worker threads, so P99 latency stays low even during bulk ingestion.
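
For anyone unfamiliar with the pattern, here is a std-only sketch of the idea behind it: blocking work is handed to a separate thread and the caller picks up the result later, so the calling thread is never stuck inside a slow disk call. In the real engine Tokio does this hand-off (`tokio::task::spawn_blocking` returns a handle you `.await`); `BlockingWorker` and `run` below are hypothetical names used only for illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// A unit of blocking work, boxed so different closures fit one channel.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// One dedicated thread that executes blocking jobs in submission order.
struct BlockingWorker {
    jobs: mpsc::Sender<Job>,
}

impl BlockingWorker {
    /// Spawn the worker thread; it exits once the sender is dropped.
    fn start() -> Self {
        let (jobs, rx) = mpsc::channel::<Job>();
        thread::spawn(move || {
            for job in rx {
                job();
            }
        });
        BlockingWorker { jobs }
    }

    /// Queue `f` on the worker and return a receiver for its result.
    /// The caller stays free until it chooses to call `recv()`.
    fn run<T, F>(&self, f: F) -> mpsc::Receiver<T>
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        let (tx, rx) = mpsc::channel();
        let job: Job = Box::new(move || {
            // The caller may have dropped the receiver; that is not an error.
            let _ = tx.send(f());
        });
        self.jobs.send(job).expect("worker thread has stopped");
        rx
    }
}
```

A caller would write something like `let pending = worker.run(|| slow_disk_write());` (where `slow_disk_write` is a placeholder), carry on with other requests, and call `pending.recv()` only when it needs the result.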

AI Ready: It embeds a Model Context Protocol (MCP) server, speaking JSON-RPC 2.0 over SSE, directly in the core engine so agents such as Claude or Cursor can query it without a separate gateway.
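
As a rough picture of what "JSON-RPC 2.0 over SSE" looks like on the wire, here is a small std-only sketch that wraps a result in a JSON-RPC 2.0 response and frames it as one Server-Sent Event. The function names, the `message` event name, and the example payload are assumptions for illustration, not taken from the CameoDB source.

```rust
/// Wrap an already-serialized JSON value as a JSON-RPC 2.0 success response.
/// `result_json` must be valid JSON; this sketch does not validate it.
fn jsonrpc_result(id: u64, result_json: &str) -> String {
    format!(r#"{{"jsonrpc":"2.0","id":{},"result":{}}}"#, id, result_json)
}

/// Frame a payload as one Server-Sent Event.
/// Every payload line gets its own `data:` field, and a blank line ends the event.
fn sse_event(event: &str, payload: &str) -> String {
    let mut out = format!("event: {}\n", event);
    for line in payload.lines() {
        out.push_str("data: ");
        out.push_str(line);
        out.push('\n');
    }
    out.push('\n');
    out
}
```

Calling `sse_event("message", &jsonrpc_result(1, r#"{"hits":[]}"#))` produces an `event: message` line, one `data:` line carrying the JSON-RPC response, and a terminating blank line.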

Topology: It’s a multi-tenant, horizontally scalable mesh that uses consistent hashing (256 virtual nodes per physical node) with deterministic XXH3-based routing.
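
Here is a minimal sketch of a consistent-hash ring with 256 virtual nodes per physical node, to show how routing of this kind generally works. It is not the CameoDB implementation: `Ring`, `add_node`, `remove_node`, and `route` are hypothetical names, and std's `DefaultHasher` stands in for XXH3 so the example has no dependencies. `DefaultHasher::new()` is deterministic within one build, but its algorithm is not guaranteed stable across Rust releases, which is one reason a real cluster would use a fixed hash such as XXH3.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Virtual nodes placed on the ring for each physical node.
const VNODES_PER_NODE: u32 = 256;

/// Stand-in for XXH3 (assumption: any deterministic 64-bit hash works here).
fn hash64<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// The ring maps a position in the 64-bit hash space to the owning node.
struct Ring {
    vnodes: BTreeMap<u64, String>,
}

impl Ring {
    fn new() -> Self {
        Ring { vnodes: BTreeMap::new() }
    }

    /// Place `VNODES_PER_NODE` points for `node` on the ring.
    fn add_node(&mut self, node: &str) {
        for i in 0..VNODES_PER_NODE {
            self.vnodes.insert(hash64(&(node, i)), node.to_string());
        }
    }

    /// Remove every point owned by `node`.
    fn remove_node(&mut self, node: &str) {
        self.vnodes.retain(|_, owner| owner != node);
    }

    /// Route a key to the first vnode at or after its hash, wrapping around
    /// to the start of the ring. Returns `None` if the ring is empty.
    fn route(&self, key: &str) -> Option<&str> {
        let h = hash64(&key);
        self.vnodes
            .range(h..)
            .next()
            .or_else(|| self.vnodes.iter().next())
            .map(|(_, owner)| owner.as_str())
    }
}
```

The property that matters for rebalancing: when a node is removed, only the keys it owned move to other nodes, and every other key keeps its owner.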

Current state & future roadmap: It is still early days. The core atomic pipeline and MCP integrations are solid, but there is plenty of room for improvement and for new features. The upcoming plans and feature roadmap are outlined in the development documents in the project repository.

I’ve just moved it from my personal workspace to a public org. I’d love for you to tear apart the codebase, tell me where the architecture falls short, and let me know if this solves a problem for you, too.

Website/Docs: https://cameodb.com
Repo: https://github.com/cameodb/cameodb