Based on the Dynamo: Amazon's Highly Available Key-value Store paper (SOSP 2007)
A working simulator of Amazon's Dynamo — the distributed key-value store that powers services like the shopping cart — with a real-time visualization dashboard that lets you interact with all 6 core distributed systems algorithms from the paper.
Kill nodes. Create conflicts. Watch self-healing. Understand distributed systems by breaking them.
- Interactive Hash Ring — see how consistent hashing distributes keys across nodes with virtual nodes
- Chaos Engineering — kill/revive nodes from the dashboard and watch sloppy quorum + hinted handoff in action
- Vector Clock Timeline — visualize version history, causal ordering, and conflict detection
- Merkle Tree Sync — see which replicas are in sync vs divergent
- Gossip Protocol — watch how membership info propagates through the cluster
- Tunable Quorum (N, R, W) — change parameters live and observe consistency vs availability tradeoffs
- Operation Tracing — every GET/PUT shows coordinator selection, preference list, replication, and quorum status
Dynamo trades consistency for availability — it never rejects a write, even during failures. Here's how each algorithm contributes:
| # | Algorithm | Problem It Solves | How |
|---|---|---|---|
| 1 | Consistent Hashing | Where does data live? | Keys and nodes are placed on a ring. Walking clockwise from a key's hash position finds its owner. Adding/removing a node only moves ~1/N of data. |
| 2 | Replication | What if a server dies? | Every key is stored on N nodes (the "preference list") — the N distinct physical nodes found walking clockwise from the key. |
| 3 | Vector Clocks | Which version is newer? | Each write carries a [(node, counter)] list. If two versions have counters that don't dominate each other → conflict detected. |
| 4 | Quorum (N, R, W) | How consistent should reads be? | Writes need W acks, reads need R responses. When R+W > N → at least one node overlaps → reads see latest write. |
| 5 | Sloppy Quorum + Hinted Handoff | What if the designated node is down? | Skip failed nodes; a healthy node temporarily holds the data with a "hint". When the original recovers → data is delivered back. |
| 6 | Merkle Trees | Are replicas in sync? | A hash tree lets two nodes find differing keys in O(log N) comparisons instead of comparing every value. |
| 7 | Gossip Protocol | How do nodes discover each other? | Every second, each node tells a random peer what it knows. Info spreads exponentially — all nodes converge in O(log N) rounds. |
Client → PUT("cart_42", "laptop,mouse")
1. Request hits any node (via load balancer)
2. Node checks preference list for "cart_42" → [B, C, D]
3. Forwards to Node B (coordinator — first in list)
4. B generates vector clock: [(B, 1)]
5. B writes locally ✓
6. B sends to C → C writes ✓ ACK
7. B sends to D → D writes ✓ ACK
8. W=2 acks received → SUCCESS returned to client
If Node C was DOWN:
→ B sends to E instead (sloppy quorum)
→ E stores with hint: "this belongs to C"
→ When C recovers: E delivers data → C catches up
┌─────────────────────────────────────────────────────────┐
│ Next.js Frontend │
│ │
│ ┌──────────┐ ┌──────────────┐ ┌───────────────────┐ │
│ │ Hash Ring │ │ Cluster │ │ Operations │ │
│ │ Canvas │ │ Status Panel │ │ Console (GET/PUT) │ │
│ └──────────┘ └──────────────┘ └───────────────────┘ │
│ ┌────────────┐ ┌──────────┐ ┌────────┐ ┌───────────┐ │
│ │Vector Clock│ │ Gossip │ │ Merkle │ │ Event Log │ │
│ │ Timeline │ │ Protocol │ │ Trees │ │ │ │
│ └────────────┘ └──────────┘ └────────┘ └───────────┘ │
└──────────────────────┬──────────────────────────────────┘
│ HTTP REST API
┌──────────────────────▼──────────────────────────────────┐
│ Python Backend │
│ │
│ ┌─────────────────────────────────────────────────────┐│
│ │ DynamoCluster ││
│ │ ││
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ││
│ │ │ Node A │ │ Node B │ │ Node C │ ... ││
│ │ │┌───────┐│ │┌───────┐│ │┌───────┐│ ││
│ │ ││Storage││ ││Storage││ ││Storage││ ││
│ │ │├───────┤│ │├───────┤│ │├───────┤│ ││
│ │ ││VClocks││ ││VClocks││ ││VClocks││ ││
│ │ │├───────┤│ │├───────┤│ │├───────┤│ ││
│ │ ││Gossip ││ ││Gossip ││ ││Gossip ││ ││
│ │ │├───────┤│ │├───────┤│ │├───────┤│ ││
│ │ ││Hints ││ ││Hints ││ ││Hints ││ ││
│ │ │└───────┘│ │└───────┘│ │└───────┘│ ││
│ │ └─────────┘ └─────────┘ └─────────┘ ││
│ │ ││
│ │ ┌──────────────────────────────────────────┐ ││
│ │ │ Shared: HashRing │ FailureDetector │ ││
│ │ └──────────────────────────────────────────┘ ││
│ └─────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────┘
| Layer | Technology | Purpose |
|---|---|---|
| Frontend | Next.js 16, TypeScript, CSS Modules | Dashboard UI with real-time visualizations |
| Ring Viz | HTML5 Canvas | Animated consistent hash ring with node/key placement |
| Backend | Python 3.12+, aiohttp | Simulated multi-node Dynamo cluster |
| Storage | In-memory (pluggable) | Per-node storage engine with version management |
| Protocol | REST over HTTP | Frontend ↔ Backend communication |
- Python 3.12+
- Node.js 18+
- npm
git clone https://github.com/Debrupbanik/dynamo-implementation.git
cd dynamo-implementationcd backend
pip install -r requirements.txt
PYTHONPATH=. python3 server.pyThe backend starts with a 5-node cluster (Node A through E) on http://localhost:8001.
cd frontend
npm install
npm run devOpen http://localhost:3000 — the dashboard connects to the backend automatically.
- PUT data — type a key like
cart_42and value like{"item": "laptop"}, click PUT - See replication — the response shows coordinator, preference list, and which nodes received replicas
- Kill a node — click the Kill button on any node in the left sidebar
- PUT again — watch sloppy quorum route to a different node and create a hinted handoff
- Revive the node — click Revive and see hints get delivered back
- Tune N, R, W — change quorum parameters in the top bar and observe the tradeoffs
| Method | Endpoint | Description |
|---|---|---|
GET |
/api/cluster |
Full cluster state |
POST |
/api/cluster/config |
Update N, R, W parameters |
| Method | Endpoint | Description |
|---|---|---|
POST |
/api/nodes |
Add a node |
DELETE |
/api/nodes/:id |
Remove a node permanently |
POST |
/api/nodes/:id/kill |
Simulate node failure |
POST |
/api/nodes/:id/revive |
Simulate recovery + deliver hints |
| Method | Endpoint | Description |
|---|---|---|
GET |
/api/data |
List all stored keys |
GET |
/api/data/:key |
Read a key (with quorum) |
PUT |
/api/data/:key |
Write a key (with replication) |
| Method | Endpoint | Description |
|---|---|---|
GET |
/api/ring |
Hash ring state |
GET |
/api/ring/key/:key |
Key placement on ring |
GET |
/api/merkle |
Merkle tree comparisons |
GET |
/api/versions/:key |
Vector clock history |
GET |
/api/gossip |
Gossip protocol state |
GET |
/api/events |
Cluster event log |
dynamo-implementation/
├── backend/
│ ├── server.py # HTTP API server + DynamoCluster orchestrator
│ ├── requirements.txt
│ ├── Dockerfile
│ └── dynamo/
│ ├── config.py # N, R, W and all tunable parameters
│ ├── node.py # DynamoNode — main per-node orchestrator
│ ├── partitioning/
│ │ └── hash_ring.py # Consistent hashing with virtual nodes
│ ├── versioning/
│ │ ├── vector_clock.py # Vector clocks for causal versioning
│ │ └── versioned_object.py # Object wrapper with version metadata
│ ├── storage/
│ │ ├── base.py # Abstract storage engine interface
│ │ └── memory.py # In-memory storage with reconciliation
│ ├── membership/
│ │ ├── gossip.py # Gossip-based membership protocol
│ │ └── failure_detector.py # Local failure detection
│ ├── failure_handling/
│ │ └── hinted_handoff.py # Hinted handoff for temp failures
│ ├── anti_entropy/
│ │ └── merkle_tree.py # Merkle trees for replica sync
│ └── reconciliation/
│ └── strategies.py # Syntactic, LWW, and merge reconciliation
│
└── frontend/
└── src/
├── app/
│ ├── page.tsx # Main dashboard page
│ ├── page.module.css
│ ├── layout.tsx
│ └── globals.css # Design system
├── components/
│ ├── HashRingCanvas.tsx # Interactive ring visualization
│ ├── ClusterStatus.tsx # Node status sidebar
│ ├── OperationsConsole.tsx # GET/PUT interface
│ └── InfoPanels.tsx # Vector clocks, gossip, Merkle, events
└── lib/
└── api.ts # Backend API client
Dynamo: Amazon's Highly Available Key-value Store Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels SOSP '07: Proceedings of the 21st ACM Symposium on Operating Systems Principles PDF
MIT