CameoDB Server
The server crate hosts the CameoDB node process: a high-performance, actor-based database node that combines local hybrid storage (redb + Tantivy) with distributed coordination, routing, and remote execution over libp2p + Kameo.
This document focuses on the server-side architecture and the distributed workflows currently implemented.
01 • Core Responsibilities
- HTTP API surface — Accepts client requests (search, write, bulk write, admin) and translates them into strongly-typed operations (`ClientOp`, sketched below).
- Local orchestration — Manages `MicroshardActor` instances and their hybrid storage configuration while guaranteeing every redb/Tantivy call runs inside `tokio::task::spawn_blocking`.
- Cluster coordination — Operates the libp2p swarm + Kademlia DHT, keeps peer/shard metadata fresh, and maintains the consistent-hash ring.
- Distributed execution — Leverages Kameo remote actors over libp2p to forward operations, implements scatter–gather broadcasts, and reports telemetry for all fan-out flows.
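As a rough sketch of the `ClientOp` shape: the variant names appear in §2.1, while the payload fields shown here are illustrative assumptions only.

```rust
use serde::{Deserialize, Serialize};

/// Logical operations produced by the HTTP layer. Variant names follow this
/// document; the payload fields are illustrative assumptions.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum ClientOp {
    Search { index: String, query: String, limit: usize },
    Stream { index: String, query: String },
    Write { index: String, document: serde_json::Value },
    BulkWrite { index: String, documents: Vec<serde_json::Value> },
    GetConfig { index: String },
    CreateConfig { index: String, schema: serde_json::Value },
    ListIndexes,
}
```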
02 • Main Actors & Data Flow
2.1 RouterActor
The primary ingress for logical database work.
- Input: `ClientOp` variants (`Search`, `Stream`, `Write`, `BulkWrite`, `GetConfig`, `CreateConfig`, `ListIndexes`).
- Responsibilities:
  - Request a routing decision from `ClusterCoordinator` (`Local`, `Remote { node_id }`, `Broadcast`).
  - Execute the decision by delegating locally, calling `handle_remote`/`try_remote` with retries and timeouts, or fanning out to peers.
- Telemetry: Tracks broadcast totals/failures via `AtomicU64` counters to surface fan-out health (see the sketch after this list).
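A minimal sketch of the decision type and the fan-out counters: the variants come from the list above, while the enum name, field names, and `record` helper are assumptions.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use uuid::Uuid;

/// Routing decision returned by ClusterCoordinator (variants per this
/// document; the enum name itself is an assumption).
pub enum RoutingDecision {
    Local,
    Remote { node_id: Uuid },
    Broadcast,
}

/// Fan-out health counters kept by RouterActor (field names assumed).
#[derive(Default)]
pub struct BroadcastTelemetry {
    pub broadcasts_total: AtomicU64,
    pub broadcasts_failed: AtomicU64,
}

impl BroadcastTelemetry {
    /// Record one broadcast and whether any leg of the fan-out failed.
    pub fn record(&self, had_failures: bool) {
        self.broadcasts_total.fetch_add(1, Ordering::Relaxed);
        if had_failures {
            self.broadcasts_failed.fetch_add(1, Ordering::Relaxed);
        }
    }
}
```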
2.2 NodeOrchestrator
Owns node identity plus every local shard.
- Fields: `NodeIdentity` (UUID, name, vnode tokens), `HashMap<Uuid, MicroshardActor>`, and a `ConsistentRing`.
- Startup: Hydrates shards from disk, registers ownership with the coordinator, and seeds schema caches.
- Message handling: `Message<ClientOp>` → `handle_client_op`, which maps logical ops to shard work and routes storage through `spawn_blocking`.
- Remote capability: Derives `Actor` + `RemoteActor`; registers itself as `orchestrator-{node_id}` for remote `ask` (see the sketch below).
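A sketch of the identity and the stable remote name derived from it: the field shapes are assumptions, while the `orchestrator-{node_id}` format and the `orchestrator_remote_name` helper are named in §3.2.

```rust
use uuid::Uuid;

/// Node identity owned by NodeOrchestrator (field shapes assumed).
pub struct NodeIdentity {
    pub node_id: Uuid,
    pub name: String,
    /// Virtual-node tokens this node places on the consistent-hash ring.
    pub vnode_tokens: Vec<u64>,
}

/// Stable registration name, "orchestrator-{node_id}", used for remote lookup.
pub fn orchestrator_remote_name(node_id: &Uuid) -> String {
    format!("orchestrator-{node_id}")
}
```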
2.3 MicroshardActor
Encapsulates a shard’s hybrid store.
- Storage: redb for durable KV + WAL; Tantivy for full-text/vector search.
- Remote messages: `SearchRequest`, `WriteRequest`, `BatchWriteRequest`, all yielding typed replies or `RemoteError`.
- Isolation: Every storage call executes inside `spawn_blocking` to protect the async runtime (sketched below).
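A sketch of that isolation pattern: `HybridStore` and the method names are placeholders; the `spawn_blocking` hop is the point.

```rust
use tokio::task;

/// Stand-in for the shard's blocking hybrid store (redb + Tantivy).
#[derive(Clone)]
struct HybridStore;

impl HybridStore {
    /// Synchronous, potentially slow storage work (redb txn / Tantivy query).
    fn search_blocking(&self, _query: &str) -> Result<Vec<String>, String> {
        Ok(Vec::new())
    }
}

/// Every storage call hops onto the blocking thread pool so the async
/// runtime driving actors and the libp2p swarm never stalls.
async fn search_isolated(store: HybridStore, query: String) -> Result<Vec<String>, String> {
    task::spawn_blocking(move || store.search_blocking(&query))
        .await
        .map_err(|join_err| format!("blocking task panicked: {join_err}"))?
}
```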
2.4 ClusterCoordinator & DistributedCluster
Global view of the mesh.
- Tracks: `peer_nodes`, `shard_assignments`, and the consistent-hash ring (see the ring sketch below).
- Key messages: `InitSwarm`/`ShutdownSwarm`, `DiscoverPeers`, `GetStatus`, `RegisterLocalShards`, `RouteOperation`, `GetKnownPeers`.
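The ring itself is a standard consistent-hash structure. A minimal sketch, assuming vnode tokens are hash points mapped to owning shard IDs; only the `get_owner` role comes from this document, the internal layout is assumed.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};
use uuid::Uuid;

/// Minimal consistent-hash ring: hash points (vnode tokens) mapped to the
/// shard that owns the arc ending at that point.
#[derive(Default)]
pub struct ConsistentRing {
    tokens: BTreeMap<u64, Uuid>,
}

impl ConsistentRing {
    pub fn insert_token(&mut self, token: u64, shard_id: Uuid) {
        self.tokens.insert(token, shard_id);
    }

    /// Owning shard for a routing key: first token at or after the key's
    /// hash, wrapping around to the smallest token.
    pub fn get_owner(&self, routing_key: &str) -> Option<Uuid> {
        let mut hasher = DefaultHasher::new();
        routing_key.hash(&mut hasher);
        let point = hasher.finish();
        self.tokens
            .range(point..)
            .next()
            .or_else(|| self.tokens.iter().next())
            .map(|(_, shard)| *shard)
    }
}
```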
03 • Networking & Remote Actors
3.1 Libp2p Swarm & Kademlia
The server builds a libp2p swarm with TCP (nodelay), QUIC, Noise, Yamux, Identify, and Kademlia DHT. A combined `DhtBehaviour` hosts Kademlia, Identify, and `kameo::remote::Behaviour`:

```rust
swarm.behaviour_mut().kameo.init_global();
```
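A sketch of how that combined behaviour might be declared, assuming `kameo::remote::Behaviour` composes like any other libp2p `NetworkBehaviour` and that Kademlia uses an in-memory store; the generic parameters are assumptions.

```rust
use libp2p::swarm::NetworkBehaviour;
use libp2p::{identify, kad};

/// Combined behaviour hosted by the swarm. Field names mirror the call
/// above (`behaviour_mut().kameo`); generic parameters are assumptions.
#[derive(NetworkBehaviour)]
pub struct DhtBehaviour {
    pub kademlia: kad::Behaviour<kad::store::MemoryStore>,
    pub identify: identify::Behaviour,
    pub kameo: kameo::remote::Behaviour,
}
```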
3.2 Remote Actor Naming & Registration
Stable names (`orchestrator-{node_id}`, future `shard-{shard_id}`) ensure discoverability. During startup the orchestrator computes the name via `orchestrator_remote_name` and calls `register` so other nodes can look it up globally.
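A sketch of the startup registration, assuming the orchestrator is already spawned and that kameo's `register` accepts the stable name; the exact signature may differ across kameo versions.

```rust
use kameo::actor::ActorRef;

/// Publish the orchestrator under its stable name so peers can resolve it
/// later with RemoteActorRef::lookup.
async fn register_orchestrator(
    orchestrator_ref: &ActorRef<NodeOrchestrator>,
    identity: &NodeIdentity,
) -> Result<(), Box<dyn std::error::Error>> {
    let name = orchestrator_remote_name(&identity.node_id); // "orchestrator-{node_id}"
    orchestrator_ref.register(&name).await?;
    Ok(())
}
```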
3.3 Remote Call Path (RouterActor::try_remote)
- `handle_remote` enforces retry + timeout policy.
- `try_remote` builds the orchestrator name, performs `RemoteActorRef::lookup`, and forwards the original `ClientOp` (see the sketch after this list).
- Failures bubble up as `OrchestratorError` after logging timeouts/transport faults.
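A sketch of `try_remote` under the retry/timeout policy above: `RemoteActorRef::lookup` and the config knobs are named in this document, while the `ask` reply type, error variants, and exact kameo signatures are assumptions.

```rust
use std::time::Duration;
use kameo::actor::RemoteActorRef;
use tokio::time::timeout;
use uuid::Uuid;

/// Resolve the remote orchestrator by its stable name, then forward the
/// ClientOp, bounding each attempt with `remote_timeout` and retrying up to
/// `remote_retry_attempts` times.
async fn try_remote(
    node_id: Uuid,
    op: ClientOp,
    remote_timeout: Duration,
    remote_retry_attempts: usize,
) -> Result<serde_json::Value, OrchestratorError> {
    let name = orchestrator_remote_name(&node_id);
    for attempt in 1..=remote_retry_attempts {
        // Resolve the peer's orchestrator through kameo's registry over libp2p.
        let Ok(Some(remote)) = RemoteActorRef::<NodeOrchestrator>::lookup(&name).await else {
            tracing::warn!(%node_id, attempt, "remote orchestrator not found");
            continue;
        };
        match timeout(remote_timeout, remote.ask(&op)).await {
            Ok(Ok(reply)) => return Ok(reply),
            Ok(Err(err)) => tracing::warn!(attempt, "remote ask failed: {err}"),
            Err(_) => tracing::warn!(attempt, "remote ask timed out"),
        }
    }
    Err(OrchestratorError::RemoteExhausted { node_id })
}
```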
04 • Routing & Distribution Semantics
4.1 Single-Key Read/Write Routing
The system derives a routing key from client input, document IDs, or a deterministic hash of the document bytes. `ClusterCoordinator::decide_route` maps the key to a shard via `ConsistentRing::get_owner` and resolves the owning node. Metadata operations always run locally, backed by a schema cache inside `NodeOrchestrator`.
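A sketch of the key derivation: the helper name, parameters, and hash choice are illustrative; only the precedence (explicit client key, then document ID, then a deterministic content hash) comes from the text above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derive a routing key in priority order: explicit key from the client,
/// then the document ID, then a deterministic hash of the document bytes.
fn routing_key(explicit_key: Option<&str>, doc_id: Option<&str>, doc_bytes: &[u8]) -> String {
    if let Some(key) = explicit_key {
        return key.to_owned();
    }
    if let Some(id) = doc_id {
        return id.to_owned();
    }
    // Deterministic content hash so an un-keyed document always maps to the
    // same shard; the real hash function is an implementation detail.
    let mut hasher = DefaultHasher::new();
    doc_bytes.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}
```

The resulting key is what `decide_route` feeds to `ConsistentRing::get_owner` (sketched in §2.4) before resolving which node owns that shard.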
4.2 Broadcast / Scatter–Gather
When routing data is missing or the operation is unkeyed, the router:
- Fetches peers via `GetKnownPeers` and applies `broadcast_fanout_limit`.
- Kicks off a parallel local search plus remote `try_remote` calls (each wrapped in `timeout(broadcast_timeout, …)`).
- Merges results (search hits sorted by `_score`, write statistics, failed shard counts) and updates broadcast telemetry (see the sketch after this list).
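A sketch of that fan-out, under the assumptions that `local_search` and `remote_search` stand in for the real local handler and `try_remote`, and that hits carry a `_score` float; the fan-out cap, per-leg timeout, score-ordered merge, and failed-shard accounting follow the steps above.

```rust
use std::time::Duration;
use futures::future::join_all;
use tokio::time::timeout;
use uuid::Uuid;

/// One merged search hit; only the _score matters for ordering here.
#[derive(Debug)]
pub struct Hit {
    pub score: f32,
    pub document: serde_json::Value,
}

/// Scatter–gather sketch: run the local search in parallel with a capped
/// remote fan-out, bound every remote leg, then merge by score.
pub async fn broadcast_search<L, R, LF, RF>(
    peers: Vec<Uuid>,
    fanout_limit: usize,
    broadcast_timeout: Duration,
    local_search: L,
    remote_search: R,
) -> (Vec<Hit>, usize)
where
    L: FnOnce() -> LF,
    LF: std::future::Future<Output = Vec<Hit>>,
    R: Fn(Uuid) -> RF,
    RF: std::future::Future<Output = Result<Vec<Hit>, String>>,
{
    // Remote legs: capped fan-out, each bounded by the broadcast timeout.
    let remote_futures = peers.into_iter().take(fanout_limit).map(|node_id| {
        let fut = remote_search(node_id);
        async move {
            match timeout(broadcast_timeout, fut).await {
                Ok(result) => result,
                Err(_) => Err(format!("broadcast to {node_id} timed out")),
            }
        }
    });

    // Local search runs in parallel with the remote fan-out.
    let (mut hits, remote_results) = tokio::join!(local_search(), join_all(remote_futures));

    let mut failed_shards = 0;
    for result in remote_results {
        match result {
            Ok(mut remote_hits) => hits.append(&mut remote_hits),
            Err(err) => {
                failed_shards += 1;
                tracing::warn!("broadcast leg failed: {err}");
            }
        }
    }

    // Merge: highest _score first across all shards.
    hits.sort_by(|a, b| b.score.total_cmp(&a.score));
    (hits, failed_shards)
}
```

Broadcast writes follow the same shape, merging write statistics instead of search hits.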
05 • Error Handling & Telemetry
- Error surface: `OrchestratorError` (node orchestration) and `RemoteError` (microshard calls), with conversions so remote failures propagate cleanly (sketched below).
- Retries & timeouts: `remote_retry_attempts`, `remote_timeout`, and `broadcast_timeout` bound all fan-out work.
- Telemetry: broadcast totals/failures, per-attempt logging, and counters for remote retry exhaustion.
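A sketch of the two error surfaces and the conversion that lets a shard failure bubble up: `thiserror` and the variant names are assumptions; only the two type names and the existence of a conversion come from this document.

```rust
use uuid::Uuid;

/// Errors raised by microshard calls (variant names assumed).
#[derive(Debug, thiserror::Error)]
pub enum RemoteError {
    #[error("shard storage failure: {0}")]
    Storage(String),
    #[error("remote transport failure: {0}")]
    Transport(String),
}

/// Errors surfaced by node orchestration; the #[from] conversion is what
/// lets a failed microshard call propagate cleanly to the router.
#[derive(Debug, thiserror::Error)]
pub enum OrchestratorError {
    #[error("shard call failed: {0}")]
    Shard(#[from] RemoteError),
    #[error("remote retries exhausted for node {node_id}")]
    RemoteExhausted { node_id: Uuid },
    #[error("remote call timed out")]
    Timeout,
}
```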
06 • Current Distributed Feature Coverage
- Hybrid-storage shards with blocking I/O isolation.
- Consistent-hash routing + shard assignment tracking.
- Remote execution of every `ClientOp` over libp2p + Kameo.
- Scatter–gather broadcast for unkeyed search and multi-node writes.
- Retries, timeouts, and telemetry hooks for resilience.
- Planned: remote actor caching, direct microshard addressing, configurable local-only modes, and cluster admin APIs.
07 • Distributed Flows
7.1 Local Read/Write
1. The client request reaches RouterActor over HTTP as a `ClientOp`, and `decide_route` returns `Local`.
2. RouterActor delegates to the local NodeOrchestrator, which maps the op to shard work via `handle_client_op`.
3. The owning MicroshardActor executes the redb/Tantivy call inside `spawn_blocking` and replies.
4. The result flows back as JSON through RouterActor to the HTTP layer and on to the client.
7.2 Remote Read/Write
1. The client request reaches RouterActor over HTTP, and `decide_route` returns `Remote { node_id }`.
2. RouterActor forwards the `ClientOp` to the remote NodeOrchestrator via libp2p + `kameo::remote::Behaviour`.
3. The remote orchestrator dispatches per-shard ops to its MicroshardActors, which run redb + Tantivy work via `spawn_blocking`.
4. Shard-level replies are aggregated by the remote orchestrator and returned to RouterActor as a JSON result.
5. RouterActor hands the JSON result (or an `OrchestratorError`) back to the HTTP layer, which responds to the client.
7.3 Broadcast Search (Scatter–Gather)
1. For an unkeyed search, RouterActor fetches known peers, applies `broadcast_fanout_limit`, and starts a local shard search in parallel with one `try_remote` call per peer over libp2p + `kameo::remote::Behaviour` (each bounded by `broadcast_timeout`).
2. Each remote NodeOrchestrator fans the search out to its shards; every MicroshardActor runs its Tantivy search via `spawn_blocking` and returns its hits.
3. Each remote orchestrator aggregates its shard hits and replies to RouterActor with a JSON result or an error.
4. RouterActor merges all hits (sorted by `_score`), counts failed shards, and returns an aggregated JSON payload `{ hits, total_shards, failed_shards }` to the HTTP layer, which responds to the client.