SHELF (Shared, Harmonized, Eventually, Ledger, Fault-tolerant) is a peer-to-peer distributed shopping list application, featuring conflict-free replication (CRDT), quorum-based consistency, and automatic bootstrap failover.
Overview #
SHELF is designed as a resilient shopping list system that keeps working even when some peers are offline or unstable. Data is distributed through a consistent hashing ring with virtual nodes, while replication is controlled through configurable quorum parameters (N/W/R) so deployments can tune the balance between consistency and availability.
To keep state synchronized in real-world network conditions, the system uses hinted handoff, read repair, and active sync. If a write cannot be delivered, it is stored and retried periodically; when peers reconnect, pending updates are pushed and fresh state is pulled automatically. Merkle tree comparison avoids unnecessary full-state transfers, and delta-based writes reduce traffic by sending only changes instead of full documents.
Conflict handling is built around CRDT-based merging. OR-Set manages item membership, RGA preserves ordering behavior, and LWW-Register resolves property updates. Together, these structures allow replicas to converge safely without central coordination.
Bootstrap Cluster and Failover #
SHELF supports multiple bootstrap servers for high availability. At any moment, one bootstrap acts as leader and handles peer registration and heartbeat coordination, while other bootstraps remain in standby and forward requests to preserve consistency.
Leader election runs every five seconds. When the active leader fails, the next available bootstrap takes over automatically, typically within the same five-second window. This failover design keeps peer discovery and cluster coordination available without manual intervention.
Security and Time Integrity #
The project also includes safeguards against timestamp-related consistency issues. A local TimeService tracks server offset, enforces monotonic timestamps, and allows controlled backward correction after synchronization so nodes do not become stuck with invalid future time.
Incoming deltas are validated by a DeltaSanitizer before merge. Remote timestamps outside a ±1 hour tolerance are reset to Unix Epoch (0), which prevents suspicious future-dated updates from incorrectly winning Last-Writer-Wins decisions.
You can explore the code and more details in the repository below.