CameoDB Server
The server crate hosts the CameoDB node process: a high-performance, actor-based database node that combines local hybrid storage (redb + Tantivy) with distributed coordination, routing, and remote execution over libp2p + Kameo.
This document focuses on the server-side architecture and the distributed workflows currently implemented.
01 • Core Responsibilities
- HTTP API surface — Accepts client requests (search, write, bulk write, admin) and translates them into strongly-typed operations (`ClientOp`, sketched below).
- Local orchestration — Manages `MicroshardActor` instances and their hybrid storage configuration while guaranteeing every redb/Tantivy call runs inside `tokio::task::spawn_blocking`.
- Cluster coordination — Operates the libp2p swarm + Kademlia DHT, keeps peer/shard metadata fresh, and maintains the consistent-hash ring.
- Distributed execution — Leverages Kameo remote actors over libp2p to forward operations, implements scatter–gather broadcasts, and reports telemetry for all fan-out flows.
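As a rough sketch of the `ClientOp` shape: the variant names appear in §2.1, while the payload fields shown here are illustrative assumptions only.

```rust
use serde::{Deserialize, Serialize};

/// Logical operations produced by the HTTP layer. Variant names follow this
/// document; the payload fields are illustrative assumptions.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum ClientOp {
    Search { index: String, query: String, limit: usize },
    Stream { index: String, query: String },
    Write { index: String, document: serde_json::Value },
    BulkWrite { index: String, documents: Vec<serde_json::Value> },
    GetConfig { index: String },
    CreateConfig { index: String, schema: serde_json::Value },
    ListIndexes,
}
```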
02 • Main Actors & Data Flow
2.1 RouterActor
The primary ingress for logical database work.
- Input: `ClientOp` variants (`Search`, `Stream`, `Write`, `BulkWrite`, `GetConfig`, `CreateConfig`, `ListIndexes`).
- Responsibilities:
  - Request a routing decision from `ClusterCoordinator` (`Local`, `Remote { node_id }`, `Broadcast`).
  - Execute the decision by delegating locally, calling `handle_remote`/`try_remote` with retries and timeouts, or fanning out to peers.
- Telemetry: Tracks broadcast totals/failures via `AtomicU64` counters to surface fan-out health (see the sketch after this list).
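A minimal sketch of the decision type and the fan-out counters: the variants come from the list above, while the enum name, field names, and `record` helper are assumptions.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use uuid::Uuid;

/// Routing decision returned by ClusterCoordinator (variants per this
/// document; the enum name itself is an assumption).
pub enum RoutingDecision {
    Local,
    Remote { node_id: Uuid },
    Broadcast,
}

/// Fan-out health counters kept by RouterActor (field names assumed).
#[derive(Default)]
pub struct BroadcastTelemetry {
    pub broadcasts_total: AtomicU64,
    pub broadcasts_failed: AtomicU64,
}

impl BroadcastTelemetry {
    /// Record one broadcast and whether any leg of the fan-out failed.
    pub fn record(&self, had_failures: bool) {
        self.broadcasts_total.fetch_add(1, Ordering::Relaxed);
        if had_failures {
            self.broadcasts_failed.fetch_add(1, Ordering::Relaxed);
        }
    }
}
```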
2.2 NodeOrchestrator
Owns node identity plus every local shard.
- Fields: `NodeIdentity` (UUID, name, vnode tokens), `HashMap<Uuid, MicroshardActor>`, and a `ConsistentRing`.
- Startup: Hydrates shards from disk, registers ownership with the coordinator, and seeds schema caches.
- Message handling: `Message<ClientOp>` → `handle_client_op`, which maps logical ops to shard work and routes storage through `spawn_blocking`.
- Remote capability: Derives `Actor` + `RemoteActor`; registers itself as `orchestrator-{node_id}` for remote `ask` (see the sketch below).
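A sketch of the identity and the stable remote name derived from it: the field shapes are assumptions, while the `orchestrator-{node_id}` format and the `orchestrator_remote_name` helper are named in §3.2.

```rust
use uuid::Uuid;

/// Node identity owned by NodeOrchestrator (field shapes assumed).
pub struct NodeIdentity {
    pub node_id: Uuid,
    pub name: String,
    /// Virtual-node tokens this node places on the consistent-hash ring.
    pub vnode_tokens: Vec<u64>,
}

/// Stable registration name, "orchestrator-{node_id}", used for remote lookup.
pub fn orchestrator_remote_name(node_id: &Uuid) -> String {
    format!("orchestrator-{node_id}")
}
```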
2.3 MicroshardActor
Encapsulates a shard’s hybrid store.
- Storage: redb for durable KV + WAL; Tantivy for full-text/vector search.
- Remote messages: `SearchRequest`, `WriteRequest`, `BatchWriteRequest`, all yielding typed replies or `RemoteError`.
- Isolation: Every storage call executes inside `spawn_blocking` to protect the async runtime (sketched below).
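A sketch of that isolation pattern: `HybridStore` and the method names are placeholders; the `spawn_blocking` hop is the point.

```rust
use tokio::task;

/// Stand-in for the shard's blocking hybrid store (redb + Tantivy).
#[derive(Clone)]
struct HybridStore;

impl HybridStore {
    /// Synchronous, potentially slow storage work (redb txn / Tantivy query).
    fn search_blocking(&self, _query: &str) -> Result<Vec<String>, String> {
        Ok(Vec::new())
    }
}

/// Every storage call hops onto the blocking thread pool so the async
/// runtime driving actors and the libp2p swarm never stalls.
async fn search_isolated(store: HybridStore, query: String) -> Result<Vec<String>, String> {
    task::spawn_blocking(move || store.search_blocking(&query))
        .await
        .map_err(|join_err| format!("blocking task panicked: {join_err}"))?
}
```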
2.4 ClusterCoordinator & DistributedCluster
Global view of the mesh.
- Tracks: `peer_nodes`, `shard_assignments`, and the consistent-hash ring (see the ring sketch below).
- Key messages: `InitSwarm`/`ShutdownSwarm`, `DiscoverPeers`, `GetStatus`, `RegisterLocalShards`, `RouteOperation`, `GetKnownPeers`.
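The ring itself is a standard consistent-hash structure. A minimal sketch, assuming vnode tokens are hash points mapped to owning shard IDs; only the `get_owner` role comes from this document, the internal layout is assumed.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};
use uuid::Uuid;

/// Minimal consistent-hash ring: hash points (vnode tokens) mapped to the
/// shard that owns the arc ending at that point.
#[derive(Default)]
pub struct ConsistentRing {
    tokens: BTreeMap<u64, Uuid>,
}

impl ConsistentRing {
    pub fn insert_token(&mut self, token: u64, shard_id: Uuid) {
        self.tokens.insert(token, shard_id);
    }

    /// Owning shard for a routing key: first token at or after the key's
    /// hash, wrapping around to the smallest token.
    pub fn get_owner(&self, routing_key: &str) -> Option<Uuid> {
        let mut hasher = DefaultHasher::new();
        routing_key.hash(&mut hasher);
        let point = hasher.finish();
        self.tokens
            .range(point..)
            .next()
            .or_else(|| self.tokens.iter().next())
            .map(|(_, shard)| *shard)
    }
}
```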
03 • Networking & Remote Actors
3.1 Libp2p Swarm & Kademlia
The server builds a libp2p swarm with TCP (nodelay), QUIC, Noise, Yamux, Identify, and Kademlia DHT. A combined `DhtBehaviour` hosts Kademlia, Identify, and `kameo::remote::Behaviour`:

```rust
swarm.behaviour_mut().kameo.init_global();
```
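A sketch of how that combined behaviour might be declared, assuming `kameo::remote::Behaviour` composes like any other libp2p `NetworkBehaviour` and that Kademlia uses an in-memory store; the generic parameters are assumptions.

```rust
use libp2p::swarm::NetworkBehaviour;
use libp2p::{identify, kad};

/// Combined behaviour hosted by the swarm. Field names mirror the call
/// above (`behaviour_mut().kameo`); generic parameters are assumptions.
#[derive(NetworkBehaviour)]
pub struct DhtBehaviour {
    pub kademlia: kad::Behaviour<kad::store::MemoryStore>,
    pub identify: identify::Behaviour,
    pub kameo: kameo::remote::Behaviour,
}
```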
3.2 Remote Actor Naming & Registration
Stable names (`orchestrator-{node_id}`, future `shard-{shard_id}`) ensure discoverability. During startup the orchestrator computes the name via `orchestrator_remote_name` and calls `register` so other nodes can look it up globally.
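A sketch of the startup registration, assuming the orchestrator is already spawned and that kameo's `register` accepts the stable name; the exact signature may differ across kameo versions.

```rust
use kameo::actor::ActorRef;

/// Publish the orchestrator under its stable name so peers can resolve it
/// later with RemoteActorRef::lookup.
async fn register_orchestrator(
    orchestrator_ref: &ActorRef<NodeOrchestrator>,
    identity: &NodeIdentity,
) -> Result<(), Box<dyn std::error::Error>> {
    let name = orchestrator_remote_name(&identity.node_id); // "orchestrator-{node_id}"
    orchestrator_ref.register(&name).await?;
    Ok(())
}
```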
3.3 Remote Call Path (RouterActor::try_remote)
- `handle_remote` enforces retry + timeout policy.
- `try_remote` builds the orchestrator name, performs `RemoteActorRef::lookup`, and forwards the original `ClientOp` (see the sketch after this list).
- Failures bubble up as `OrchestratorError` after logging timeouts/transport faults.
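A sketch of `try_remote` under the retry/timeout policy above: `RemoteActorRef::lookup` and the config knobs are named in this document, while the `ask` reply type, error variants, and exact kameo signatures are assumptions.

```rust
use std::time::Duration;
use kameo::actor::RemoteActorRef;
use tokio::time::timeout;
use uuid::Uuid;

/// Resolve the remote orchestrator by its stable name, then forward the
/// ClientOp, bounding each attempt with `remote_timeout` and retrying up to
/// `remote_retry_attempts` times.
async fn try_remote(
    node_id: Uuid,
    op: ClientOp,
    remote_timeout: Duration,
    remote_retry_attempts: usize,
) -> Result<serde_json::Value, OrchestratorError> {
    let name = orchestrator_remote_name(&node_id);
    for attempt in 1..=remote_retry_attempts {
        // Resolve the peer's orchestrator through kameo's registry over libp2p.
        let Ok(Some(remote)) = RemoteActorRef::<NodeOrchestrator>::lookup(&name).await else {
            tracing::warn!(%node_id, attempt, "remote orchestrator not found");
            continue;
        };
        match timeout(remote_timeout, remote.ask(&op)).await {
            Ok(Ok(reply)) => return Ok(reply),
            Ok(Err(err)) => tracing::warn!(attempt, "remote ask failed: {err}"),
            Err(_) => tracing::warn!(attempt, "remote ask timed out"),
        }
    }
    Err(OrchestratorError::RemoteExhausted { node_id })
}
```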
04 • Routing & Distribution Semantics
4.1 Single-Key Read/Write Routing
The system derives a routing key from client input, document IDs, or a deterministic hash of the document bytes. `ClusterCoordinator::decide_route` maps the key to a shard via `ConsistentRing::get_owner` and resolves the owning node. Metadata operations always run locally, backed by a schema cache inside `NodeOrchestrator`.
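A sketch of the key derivation: the helper name, parameters, and hash choice are illustrative; only the precedence (explicit client key, then document ID, then a deterministic content hash) comes from the text above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derive a routing key in priority order: explicit key from the client,
/// then the document ID, then a deterministic hash of the document bytes.
fn routing_key(explicit_key: Option<&str>, doc_id: Option<&str>, doc_bytes: &[u8]) -> String {
    if let Some(key) = explicit_key {
        return key.to_owned();
    }
    if let Some(id) = doc_id {
        return id.to_owned();
    }
    // Deterministic content hash so an un-keyed document always maps to the
    // same shard; the real hash function is an implementation detail.
    let mut hasher = DefaultHasher::new();
    doc_bytes.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}
```

The resulting key is what `decide_route` feeds to `ConsistentRing::get_owner` (sketched in §2.4) before resolving which node owns that shard.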
4.2 Broadcast / Scatter–Gather
When routing data is missing or the operation is unkeyed, the router:
- Fetches peers via `GetKnownPeers` and applies `broadcast_fanout_limit`.
- Kicks off a parallel local search plus remote `try_remote` calls (each wrapped in `timeout(broadcast_timeout, …)`).
- Merges results (search hits sorted by `_score`, write statistics, failed shard counts) and updates broadcast telemetry (see the sketch after this list).
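A sketch of that fan-out, under the assumptions that `local_search` and `remote_search` stand in for the real local handler and `try_remote`, and that hits carry a `_score` float; the fan-out cap, per-leg timeout, score-ordered merge, and failed-shard accounting follow the steps above.

```rust
use std::time::Duration;
use futures::future::join_all;
use tokio::time::timeout;
use uuid::Uuid;

/// One merged search hit; only the _score matters for ordering here.
#[derive(Debug)]
pub struct Hit {
    pub score: f32,
    pub document: serde_json::Value,
}

/// Scatter–gather sketch: run the local search in parallel with a capped
/// remote fan-out, bound every remote leg, then merge by score.
pub async fn broadcast_search<L, R, LF, RF>(
    peers: Vec<Uuid>,
    fanout_limit: usize,
    broadcast_timeout: Duration,
    local_search: L,
    remote_search: R,
) -> (Vec<Hit>, usize)
where
    L: FnOnce() -> LF,
    LF: std::future::Future<Output = Vec<Hit>>,
    R: Fn(Uuid) -> RF,
    RF: std::future::Future<Output = Result<Vec<Hit>, String>>,
{
    // Remote legs: capped fan-out, each bounded by the broadcast timeout.
    let remote_futures = peers.into_iter().take(fanout_limit).map(|node_id| {
        let fut = remote_search(node_id);
        async move {
            match timeout(broadcast_timeout, fut).await {
                Ok(result) => result,
                Err(_) => Err(format!("broadcast to {node_id} timed out")),
            }
        }
    });

    // Local search runs in parallel with the remote fan-out.
    let (mut hits, remote_results) = tokio::join!(local_search(), join_all(remote_futures));

    let mut failed_shards = 0;
    for result in remote_results {
        match result {
            Ok(mut remote_hits) => hits.append(&mut remote_hits),
            Err(err) => {
                failed_shards += 1;
                tracing::warn!("broadcast leg failed: {err}");
            }
        }
    }

    // Merge: highest _score first across all shards.
    hits.sort_by(|a, b| b.score.total_cmp(&a.score));
    (hits, failed_shards)
}
```

Broadcast writes follow the same shape, merging write statistics instead of search hits.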
05 • Error Handling & Telemetry
- Error surface: `OrchestratorError` (node orchestration) and `RemoteError` (microshard calls), with conversions so remote failures propagate cleanly (sketched below).
- Retries & timeouts: `remote_retry_attempts`, `remote_timeout`, and `broadcast_timeout` bound all fan-out work.
- Telemetry: broadcast totals/failures, per-attempt logging, and counters for remote retry exhaustion.
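A sketch of the two error surfaces and the conversion that lets a shard failure bubble up: `thiserror` and the variant names are assumptions; only the two type names and the existence of a conversion come from this document.

```rust
use uuid::Uuid;

/// Errors raised by microshard calls (variant names assumed).
#[derive(Debug, thiserror::Error)]
pub enum RemoteError {
    #[error("shard storage failure: {0}")]
    Storage(String),
    #[error("remote transport failure: {0}")]
    Transport(String),
}

/// Errors surfaced by node orchestration; the #[from] conversion is what
/// lets a failed microshard call propagate cleanly to the router.
#[derive(Debug, thiserror::Error)]
pub enum OrchestratorError {
    #[error("shard call failed: {0}")]
    Shard(#[from] RemoteError),
    #[error("remote retries exhausted for node {node_id}")]
    RemoteExhausted { node_id: Uuid },
    #[error("remote call timed out")]
    Timeout,
}
```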
06 • Current Distributed Feature Coverage
- Hybrid-storage shards with blocking I/O isolation.
- Consistent-hash routing + shard assignment tracking.
- Remote execution of every `ClientOp` over libp2p + Kameo.
- Scatter–gather broadcast for unkeyed search and multi-node writes.
- Retries, timeouts, and telemetry hooks for resilience.
- Planned: remote actor caching, direct microshard addressing, configurable local-only modes, and cluster admin APIs.
07 • Distributed Flows
7.1 Local Read/Write
1. The client request reaches RouterActor over HTTP as a `ClientOp`, and `decide_route` returns `Local`.
2. RouterActor delegates to the local NodeOrchestrator, which maps the op to shard work via `handle_client_op`.
3. The owning MicroshardActor executes the redb/Tantivy call inside `spawn_blocking` and replies.
4. The result flows back as JSON through RouterActor to the HTTP layer and on to the client.
7.2 Remote Read/Write
1. The client request reaches RouterActor over HTTP, and `decide_route` returns `Remote { node_id }`.
2. RouterActor forwards the `ClientOp` to the remote NodeOrchestrator via libp2p + `kameo::remote::Behaviour`.
3. The remote orchestrator dispatches per-shard ops to its MicroshardActors, which run redb + Tantivy work via `spawn_blocking`.
4. Shard-level replies are aggregated by the remote orchestrator and returned to RouterActor as a JSON result.
5. RouterActor hands the JSON result (or an `OrchestratorError`) back to the HTTP layer, which responds to the client.
7.3 Broadcast Search (Scatter–Gather)
1. For an unkeyed search, RouterActor fetches known peers, applies `broadcast_fanout_limit`, and starts a local shard search in parallel with one `try_remote` call per peer over libp2p + `kameo::remote::Behaviour` (each bounded by `broadcast_timeout`).
2. Each remote NodeOrchestrator fans the search out to its shards; every MicroshardActor runs its Tantivy search via `spawn_blocking` and returns its hits.
3. Each remote orchestrator aggregates its shard hits and replies to RouterActor with a JSON result or an error.
4. RouterActor merges all hits (sorted by `_score`), counts failed shards, and returns an aggregated JSON payload `{ hits, total_shards, failed_shards }` to the HTTP layer, which responds to the client.