CameoDB Technical Whitepaper: The Hybrid-Search Engine

01 The Sync Gap Problem

In traditional architectures, writing data creates a "dual-write" problem: you write to the database, then update the search index asynchronously. Until the index catches up, search results can be stale or inconsistent with the primary store.

Traditional Stack

Two clusters to manage
Eventual consistency lag
Complex failure recovery

CameoDB Solution

Single binary
Atomic writes (KV + Index)
Shared-nothing architecture

02 The Hybrid Storage Engine

At the core of CameoDB is a specialized storage engine that treats Redb (embedded KV) and Tantivy (search) as a single atomic unit. Once a document is durably stored, it is either already indexed or can be re-indexed from the stored copy.

The Atomic Write Pipeline

Sequence & WAL

A monotonic sequence ID is generated. The operation is serialized and written to the Write-Ahead Log (WAL) inside Redb for durability.

KV Insert

The document body is inserted into the Redb Data Table. This provides fast O(log N) retrieval by ID.

Tantivy Indexing

The document fields are parsed and added to the in-memory Tantivy IndexWriter.

Dual Commit

The Redb transaction commits (fsync). Tantivy performs a "smart commit" whose timing is based on its memory budget. The system guarantees that data committed to Redb remains recoverable even if the index needs rebuilding.
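The four steps above can be sketched with in-memory stand-ins. This is a minimal illustration under stated assumptions, not CameoDB's implementation: `HybridStore` and all of its fields and methods are hypothetical names, plain `std` collections replace Redb and Tantivy, and the "smart commit" is modeled as a simple byte budget.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

/// Hypothetical in-memory model of the write pipeline (not a CameoDB API).
#[derive(Default)]
pub struct HybridStore {
    next_seq: u64,
    wal: Vec<(u64, String)>,                  // stands in for the WAL kept in Redb
    data: BTreeMap<String, String>,           // stands in for the Redb data table
    pending: Vec<(String, String)>,           // docs buffered in the index writer
    index: HashMap<String, BTreeSet<String>>, // stands in for committed index segments
    pending_bytes: usize,
    budget_bytes: usize,
}

impl HybridStore {
    pub fn new(budget_bytes: usize) -> Self {
        Self { budget_bytes, ..Default::default() }
    }

    /// Runs the four pipeline steps for one document and returns its sequence ID.
    /// Updates and deletes of existing IDs are not handled in this sketch.
    pub fn put(&mut self, id: &str, body: &str) -> u64 {
        // 1. Sequence & WAL: allocate a monotonic ID and log the operation.
        self.next_seq += 1;
        let seq = self.next_seq;
        self.wal.push((seq, id.to_string()));
        // 2. KV insert: store the document body by ID.
        self.data.insert(id.to_string(), body.to_string());
        // 3. Indexing: buffer the document in the (in-memory) index writer.
        self.pending.push((id.to_string(), body.to_string()));
        self.pending_bytes += body.len();
        // 4. Dual commit: the KV write is treated as committed here (a real
        //    engine would fsync); the index commits only once over budget.
        if self.pending_bytes >= self.budget_bytes {
            self.commit_index();
        }
        seq
    }

    /// Makes buffered documents searchable.
    pub fn commit_index(&mut self) {
        for (id, body) in self.pending.drain(..) {
            for term in body.split_whitespace() {
                self.index.entry(term.to_lowercase()).or_default().insert(id.clone());
            }
        }
        self.pending_bytes = 0;
    }

    /// Recovery path: discard the index and rebuild it from the KV data.
    pub fn rebuild_index(&mut self) {
        self.index.clear();
        self.pending = self.data.iter().map(|(id, body)| (id.clone(), body.clone())).collect();
        self.commit_index();
    }

    pub fn get(&self, id: &str) -> Option<&str> {
        self.data.get(id).map(String::as_str)
    }

    pub fn wal_len(&self) -> usize {
        self.wal.len()
    }

    /// Returns the IDs of documents containing `term`, in sorted order.
    pub fn search(&self, term: &str) -> Vec<String> {
        self.index
            .get(&term.to_lowercase())
            .map(|ids| ids.iter().cloned().collect::<Vec<String>>())
            .unwrap_or_default()
    }
}
```

In this model a document is readable by ID immediately after `put`, becomes searchable once the index commits, and can always be made searchable again through `rebuild_index`, which mirrors the recovery guarantee described above.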

03 Leaderless Distributed Mesh

CameoDB adopts a Self-Sovereign Identity model. There are no master nodes: every node can route, write, and read.

Topology & Consistent Hashing

To ensure uniform distribution without hotspots, CameoDB utilizes a Consistent Hash Ring with 256 Virtual Nodes (VNodes) per physical node.

Deterministic Routing: Keys are hashed (XXH3) to a u64 token. The ring lookup is O(log N) in the number of tokens on the ring.
Minimal Rebalancing: When a node joins or leaves a cluster of N physical nodes, only about 1/N of the keys need to move, which avoids saturating the network with a cluster-wide reshuffle.
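A consistent hash ring of this kind can be sketched with a sorted map. The sketch below is illustrative only: `HashRing`, `token`, and the method names are hypothetical, and the standard library's `DefaultHasher` stands in for XXH3 so that the example needs no external crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Virtual nodes per physical node, as stated in the text.
pub const VNODES_PER_NODE: u32 = 256;

/// Hashes a value to a u64 ring token.
/// Assumption: DefaultHasher replaces XXH3 here; its output is stable within
/// one build but is not guaranteed to be stable across Rust releases.
fn token<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical ring: token -> name of the owning physical node.
#[derive(Default)]
pub struct HashRing {
    ring: BTreeMap<u64, String>,
}

impl HashRing {
    /// Places VNODES_PER_NODE tokens for `node` on the ring.
    pub fn add_node(&mut self, node: &str) {
        for vnode in 0..VNODES_PER_NODE {
            self.ring.insert(token(&(node, vnode)), node.to_string());
        }
    }

    /// Removes every token that belongs to `node`.
    pub fn remove_node(&mut self, node: &str) {
        self.ring.retain(|_, owner| owner.as_str() != node);
    }

    /// Finds the owner of `key`: the first token at or after the key's token,
    /// wrapping around to the smallest token. The BTreeMap range query makes
    /// this logarithmic in the number of tokens.
    pub fn owner(&self, key: &str) -> Option<&str> {
        let t = token(&key);
        self.ring
            .range(t..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, node)| node.as_str())
    }
}
```

When a node is removed from this ring, only the keys that node owned are reassigned; every other key keeps its owner because its successor token is untouched. That is the minimal-rebalancing property described above.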

Smart Routing Strategies

The `RouterActor` dynamically chooses between Unicast and Scatter-Gather based on the request context.

Unicast (Targeted)

O(1) Node

Used when a `routing_key` or `id` is provided. The request is sent directly to the owner node via libp2p.

> PUT /doc/123
> Hash("123") -> Node B
> Node A -> Node B (Direct)

Scatter-Gather

O(N) Nodes

Used for broad search queries. The router fans out the request to all nodes, aggregates results, and sorts by score.

> POST /search "query"
> Node A -> [A, B, C]
> Aggregate & Sort Hits
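The choice between the two strategies, and the aggregation step of Scatter-Gather, can be sketched as follows. All names here (`Route`, `Hit`, `plan_route`, `merge_hits`) are hypothetical, nodes are represented by plain indices, and the `owner_of` closure stands in for the hash-ring lookup; the actual `RouterActor` logic is not shown in this document.

```rust
/// One search hit as returned by a node (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub doc_id: String,
    pub score: f32,
}

/// The routing decision for a request.
#[derive(Debug, PartialEq)]
pub enum Route {
    /// Send to exactly one owner node.
    Unicast(usize),
    /// Fan out to every listed node.
    ScatterGather(Vec<usize>),
}

/// Picks Unicast when a routing key or ID is present, otherwise fans out to
/// all `node_count` nodes. `owner_of` maps a key to the index of its owner.
pub fn plan_route(
    routing_key: Option<&str>,
    node_count: usize,
    owner_of: impl Fn(&str) -> usize,
) -> Route {
    match routing_key {
        Some(key) => Route::Unicast(owner_of(key)),
        None => Route::ScatterGather((0..node_count).collect()),
    }
}

/// Merges per-node results, sorts by descending score (ties broken by
/// document ID so the order is deterministic), and keeps the top `limit`.
pub fn merge_hits(per_node: Vec<Vec<Hit>>, limit: usize) -> Vec<Hit> {
    let mut all: Vec<Hit> = per_node.into_iter().flatten().collect();
    all.sort_by(|a, b| {
        b.score
            .total_cmp(&a.score)
            .then_with(|| a.doc_id.cmp(&b.doc_id))
    });
    all.truncate(limit);
    all
}
```

Sorting merged scores this way assumes that scores from different nodes are directly comparable, which is itself an assumption of the sketch.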

04 Engineered for Speed

CameoDB is built in Rust (2024 edition) and uses the Actor Model (Kameo) for fault isolation. A critical architectural choice is the separation of async and sync workloads.

The `spawn_blocking` Firewall

Storage I/O (Redb/Tantivy) is strictly isolated in blocking thread pools. The network/actor runtime (Tokio/Axum) remains purely async. This prevents heavy disk writes from stalling HTTP request handling, keeping P99 latency low even during bulk ingestion.
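The isolation idea can be illustrated without Tokio. The sketch below is a standard-library analogue of the hand-off that `spawn_blocking` performs, not the Tokio API itself: a dedicated worker thread runs blocking jobs, and the caller receives a channel for the result instead of waiting inline. `BlockingPool`, `Job`, and `submit` are hypothetical names, and a single worker thread stands in for a pool.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// A unit of blocking work, such as a disk write.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical single-threaded "blocking pool" built on std only.
pub struct BlockingPool {
    tx: Option<Sender<Job>>,
    worker: Option<JoinHandle<()>>,
}

impl BlockingPool {
    pub fn new() -> Self {
        let (tx, rx): (Sender<Job>, Receiver<Job>) = mpsc::channel();
        // The worker runs jobs one by one until the sending side is dropped.
        let worker = thread::spawn(move || {
            for job in rx {
                job();
            }
        });
        Self { tx: Some(tx), worker: Some(worker) }
    }

    /// Queues `f` on the worker thread and returns immediately.
    /// The result arrives later on the returned receiver.
    pub fn submit<T, F>(&self, f: F) -> Receiver<T>
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        let (result_tx, result_rx) = mpsc::channel();
        let job: Job = Box::new(move || {
            // The caller may have dropped the receiver; ignore that case.
            let _ = result_tx.send(f());
        });
        self.tx
            .as_ref()
            .expect("pool is running")
            .send(job)
            .expect("worker thread is alive");
        result_rx
    }
}

impl Drop for BlockingPool {
    fn drop(&mut self) {
        // Closing the channel ends the worker loop; then wait for it to exit.
        drop(self.tx.take());
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

The caller's thread stays free between `submit` and reading the result, which is the property the text attributes to keeping storage I/O off the async runtime.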

05 Why Invest in CameoDB?

Simplification

Replaces two distinct infrastructure components (DB + Search) with one self-contained binary.

Efficiency

Written in Rust, with memory safety built in. No JVM. No garbage-collection pauses. Smaller memory footprint than JVM-based alternatives.

Adaptability

Scales out with a tunable footprint: it runs on commodity hardware and scales to large cloud clusters.

Start building the future of data.

