Distributed Transactions
The Challenge of Distributed Transactions
Section titled “The Challenge of Distributed Transactions”In a single database, transactions are straightforward: all operations succeed or all fail (atomicity). But what happens when your transaction spans multiple services or databases?
What is a Distributed Transaction?
Section titled “What is a Distributed Transaction?”A distributed transaction is a transaction that spans multiple services or databases. It requires coordination to ensure atomicity: all operations succeed or all fail.
The Challenge: How do you ensure all succeed or all fail when services are distributed?
Solution 1: Two-Phase Commit (2PC)
Section titled “Solution 1: Two-Phase Commit (2PC)”Two-Phase Commit (2PC) is a protocol that coordinates distributed transactions through two phases.
How 2PC Works
Section titled “How 2PC Works”Phase 1: Prepare (Voting)
Section titled “Phase 1: Prepare (Voting)”- Coordinator sends “prepare” message to all participants
- Each participant:
- Locks resources
- Performs validation
- Votes YES (ready) or NO (abort)
- Participants cannot abort after voting YES (they’re locked)
Phase 2: Commit or Abort
Section titled “Phase 2: Commit or Abort”If all vote YES:
- Coordinator sends “commit” to all
- Participants commit and release locks
If any votes NO:
- Coordinator sends “abort” to all
- Participants rollback and release locks
2PC Flow Diagram
Section titled “2PC Flow Diagram”Problems with 2PC
Section titled “Problems with 2PC”| Problem | Description | Impact |
|---|---|---|
| Blocking | Nodes wait if coordinator fails | Resources locked indefinitely |
| SPOF | Coordinator is critical | System fails if coordinator dies |
| Performance | Synchronous, blocking | Slow, doesn’t scale |
| Partitions | Doesn’t handle network splits | Can’t make progress during partitions |
Solution 2: Saga Pattern
Section titled “Solution 2: Saga Pattern”The Saga pattern breaks a distributed transaction into a sequence of local transactions. Each step has a compensating action that undoes it.
How Saga Works
Section titled “How Saga Works”Key Difference from 2PC: Saga uses compensation (undo operations) instead of rollback (database transaction rollback).
What is Compensation?
Section titled “What is Compensation?”Compensation is an operation that undoes the effects of a previous operation. Unlike database rollback, compensation may not perfectly reverse the operation, but makes the system consistent.
Example: You can’t “unsend” an email, but you can send an apology email. That’s compensation!
Saga Orchestration vs Choreography
Section titled “Saga Orchestration vs Choreography”Orchestration: Central Coordinator
Section titled “Orchestration: Central Coordinator”Characteristics:
- ✅ Centralized control (easier to understand)
- ✅ Easier to monitor and debug
- ❌ Single point of failure (orchestrator)
- ❌ Tighter coupling
Choreography: Event-Driven
Section titled “Choreography: Event-Driven”Characteristics:
- ✅ No single point of failure
- ✅ Loosely coupled services
- ❌ Harder to understand flow
- ❌ Harder to monitor
Saga Example: E-Commerce Order
Section titled “Saga Example: E-Commerce Order”If Step 2 (Charge Payment) fails:
- Compensate Step 1: Release reserved inventory
- Transaction ends (no further steps)
If Step 4 (Send Notification) fails:
- Steps 1-3 already succeeded (can’t undo shipping!)
- Send apology notification (compensation)
LLD ↔ HLD Connection
Section titled “LLD ↔ HLD Connection”How distributed transactions affect your class design:
Saga Orchestrator Implementation
Section titled “Saga Orchestrator Implementation”from enum import Enumfrom typing import List, Callable, Optional
class SagaStep: def __init__(self, name: str, execute: Callable, compensate: Callable): self.name = name self.execute = execute self.compensate = compensate self.completed = False
class SagaOrchestrator: def __init__(self): self.steps: List[SagaStep] = []
def add_step(self, step: SagaStep): self.steps.append(step)
def execute(self) -> bool: completed_steps = []
try: for step in self.steps: # Execute step if not step.execute(): raise Exception(f"Step {step.name} failed")
step.completed = True completed_steps.append(step)
return True # All succeeded
except Exception as e: # Compensate in reverse order for step in reversed(completed_steps): step.compensate()
return False # Transaction failedimport java.util.*;
public class SagaStep { private String name; private Runnable execute; private Runnable compensate; private boolean completed = false;
public SagaStep(String name, Runnable execute, Runnable compensate) { this.name = name; this.execute = execute; this.compensate = compensate; }
public boolean execute() { try { execute.run(); completed = true; return true; } catch (Exception e) { return false; } }
public void compensate() { compensate.run(); }}
public class SagaOrchestrator { private List<SagaStep> steps = new ArrayList<>();
public void addStep(SagaStep step) { steps.add(step); }
public boolean execute() { List<SagaStep> completedSteps = new ArrayList<>();
try { for (SagaStep step : steps) { if (!step.execute()) { throw new RuntimeException("Step " + step.name + " failed"); } completedSteps.add(step); } return true; // All succeeded } catch (Exception e) { // Compensate in reverse order Collections.reverse(completedSteps); for (SagaStep step : completedSteps) { step.compensate(); } return false; // Transaction failed } }}Compensation Pattern
Section titled “Compensation Pattern”class InventoryService: def reserve(self, product_id: str, quantity: int) -> bool: # Reserve inventory # Returns True if successful pass
def release(self, product_id: str, quantity: int): # Compensate: Release reserved inventory pass
class PaymentService: def charge(self, user_id: str, amount: float) -> bool: # Charge payment # Returns True if successful pass
def refund(self, user_id: str, amount: float): # Compensate: Refund payment pass
# Usage in Sagadef create_order_saga(): orchestrator = SagaOrchestrator()
inventory = InventoryService() payment = PaymentService()
orchestrator.add_step(SagaStep( "reserve_inventory", lambda: inventory.reserve("product_123", 1), lambda: inventory.release("product_123", 1) ))
orchestrator.add_step(SagaStep( "charge_payment", lambda: payment.charge("user_456", 99.99), lambda: payment.refund("user_456", 99.99) ))
return orchestrator.execute()public class InventoryService { public boolean reserve(String productId, int quantity) { // Reserve inventory // Returns true if successful return true; }
public void release(String productId, int quantity) { // Compensate: Release reserved inventory }}
public class PaymentService { public boolean charge(String userId, double amount) { // Charge payment // Returns true if successful return true; }
public void refund(String userId, double amount) { // Compensate: Refund payment }}
// Usage in Sagapublic class OrderSaga { public boolean createOrder() { SagaOrchestrator orchestrator = new SagaOrchestrator(); InventoryService inventory = new InventoryService(); PaymentService payment = new PaymentService();
orchestrator.addStep(new SagaStep( "reserve_inventory", () -> inventory.reserve("product_123", 1), () -> inventory.release("product_123", 1) ));
orchestrator.addStep(new SagaStep( "charge_payment", () -> payment.charge("user_456", 99.99), () -> payment.refund("user_456", 99.99) ));
return orchestrator.execute(); }}2PC vs Saga: When to Use What?
Section titled “2PC vs Saga: When to Use What?”| Aspect | 2PC | Saga |
|---|---|---|
| Consistency | Strong (ACID) | Eventual |
| Performance | Slow (blocking) | Fast (non-blocking) |
| Availability | Low (blocking) | High (non-blocking) |
| Complexity | Medium | High (need compensation) |
| Use Cases | Critical, short transactions | Long-running, distributed |
Deep Dive: Production Considerations for Distributed Transactions
Section titled “Deep Dive: Production Considerations for Distributed Transactions”2PC vs Saga: Production Trade-offs
Section titled “2PC vs Saga: Production Trade-offs”Performance Comparison
Section titled “Performance Comparison”2PC Performance:
- Latency: 100-500ms per transaction (synchronous coordination)
- Throughput: 100-1K transactions/sec (limited by coordinator)
- Blocking: All participants blocked during prepare phase
- Failure Recovery: 10-60 seconds (must query all participants)
Saga Performance:
- Latency: 50-200ms per step (asynchronous execution)
- Throughput: 1K-10K transactions/sec (no blocking)
- Non-blocking: Steps execute independently
- Failure Recovery: 1-5 seconds (compensate only completed steps)
Real-World Benchmark:
- E-commerce checkout (2PC): ~300ms, 500 TPS
- E-commerce checkout (Saga): ~150ms, 3K TPS
- Improvement: Saga is 2x faster, 6x more throughput
Saga Pattern: Advanced Scenarios
Section titled “Saga Pattern: Advanced Scenarios”Scenario 1: Partial Failure Recovery
Section titled “Scenario 1: Partial Failure Recovery”Problem: What if compensation itself fails?
Example:
- Step 1: Reserve inventory ✅
- Step 2: Charge payment ✅
- Step 3: Ship order ❌ (fails)
- Compensate Step 2: Refund ❌ (fails!)
Solutions:
Solution 1: Retry Compensation
class SagaOrchestrator: def compensate_with_retry(self, step, max_retries=3): for attempt in range(max_retries): try: step.compensate() return True except Exception as e: if attempt == max_retries - 1: # Log for manual intervention self.alert_manual_intervention(step, e) return False time.sleep(2 ** attempt) # Exponential backoffSolution 2: Compensating Transaction Pattern
- Store compensation intent in database
- Background job retries failed compensations
- Benefit: Eventually consistent compensation
Solution 3: Manual Intervention Queue
- Failed compensations go to queue
- Operations team handles manually
- Use case: Critical financial transactions
Scenario 2: Long-Running Sagas
Section titled “Scenario 2: Long-Running Sagas”Problem: Sagas that take hours/days (e.g., order fulfillment).
Challenges:
- State persistence: Must survive server restarts
- Timeout handling: Steps may timeout
- Idempotency: Steps may be retried
Solution: Persistent Saga State
class PersistentSagaOrchestrator: def __init__(self, db_connection): self.db = db_connection
def execute_saga(self, saga_id, steps): # Save saga state self.db.save_saga_state(saga_id, { 'status': 'running', 'current_step': 0, 'completed_steps': [], 'compensated_steps': [] })
try: for i, step in enumerate(steps): # Load state (survives restarts) state = self.db.load_saga_state(saga_id)
if i in state['completed_steps']: continue # Already completed (idempotent)
# Execute step if step.execute(): state['completed_steps'].append(i) self.db.save_saga_state(saga_id, state) else: raise Exception(f"Step {i} failed") except Exception as e: # Compensate completed steps self.compensate_saga(saga_id, state['completed_steps'])Scenario 3: Saga Orchestration vs Choreography: Production Comparison
Section titled “Scenario 3: Saga Orchestration vs Choreography: Production Comparison”Orchestration (Central Coordinator):
Pros:
- ✅ Easier to understand: Centralized control flow
- ✅ Better monitoring: Single point to observe
- ✅ Easier debugging: All logic in one place
- ✅ Transaction visibility: Can see entire saga state
Cons:
- ❌ Single point of failure: Coordinator is critical
- ❌ Scalability bottleneck: Coordinator can become overloaded
- ❌ Tighter coupling: Services depend on orchestrator
Production Example: Netflix Conductor
- Centralized orchestration engine
- Handles millions of workflows
- Scaling: Multiple coordinator instances with shared state
Choreography (Event-Driven):
Pros:
- ✅ No single point of failure: Distributed control
- ✅ Better scalability: No coordinator bottleneck
- ✅ Looser coupling: Services communicate via events
Cons:
- ❌ Harder to understand: Distributed control flow
- ❌ Harder to monitor: Must trace events across services
- ❌ Harder to debug: No single place to see state
- ❌ Transaction visibility: Hard to see saga state
Production Example: Event Sourcing + CQRS
- Events stored in event log
- Services react to events
- Monitoring: Event log provides audit trail
Hybrid Approach:
- Use orchestration for critical, complex workflows
- Use choreography for simple, high-volume workflows
- Best of both worlds: Flexibility + visibility
2PC: When It’s Still Used
Section titled “2PC: When It’s Still Used”Despite its problems, 2PC is still used in production:
Use Case 1: Database Clusters
- PostgreSQL: Uses 2PC for distributed transactions
- MySQL Cluster: Uses 2PC for multi-master replication
- Why: Database-level coordination, not application-level
Use Case 2: XA Transactions
- Java EE: JTA (Java Transaction API) uses 2PC
- Spring: @Transactional with XA datasources
- Why: Standard protocol, well-understood
Use Case 3: Short-Lived Transactions
- Microsecond transactions: 2PC overhead acceptable
- Low concurrency: Blocking not a problem
- Why: Simpler than Saga for simple cases
Production Pattern:
// Use 2PC for simple, fast transactions@Transactional(rollbackFor = Exception.class)public void transferMoney(Account from, Account to, double amount) { from.debit(amount); to.credit(amount); // 2PC handles coordination}Compensation Design Patterns
Section titled “Compensation Design Patterns”Pattern 1: Forward Recovery (Retry)
Section titled “Pattern 1: Forward Recovery (Retry)”Pattern: Retry failed step instead of compensating.
When to use:
- Transient failures: Network hiccups, temporary unavailability
- Idempotent operations: Safe to retry
Example:
class RetrySagaStep: def execute(self, max_retries=3): for attempt in range(max_retries): try: return self.do_work() except TransientError as e: if attempt == max_retries - 1: raise # Give up after retries time.sleep(2 ** attempt) # Exponential backoffPattern 2: Backward Recovery (Compensation)
Section titled “Pattern 2: Backward Recovery (Compensation)”Pattern: Compensate completed steps (standard Saga).
When to use:
- Permanent failures: Business logic errors
- Non-idempotent operations: Can’t safely retry
Example: Already covered in main content
Pattern 3: Compensation with TTL
Section titled “Pattern 3: Compensation with TTL”Pattern: Compensations expire after time limit.
Use case: Prevent stale compensations from running.
Example:
class TimedCompensation: def __init__(self, ttl_seconds=3600): self.ttl = ttl_seconds self.created_at = time.time()
def compensate(self): if time.time() - self.created_at > self.ttl: raise CompensationExpired("Too much time has passed") # Proceed with compensationProduction Best Practices
Section titled “Production Best Practices”Practice 1: Idempotency Keys
Section titled “Practice 1: Idempotency Keys”Pattern: Use idempotency keys to prevent duplicate execution.
Example:
class IdempotentSagaStep: def execute(self, idempotency_key): # Check if already executed if self.db.exists(idempotency_key): return self.db.get_result(idempotency_key)
# Execute and store result result = self.do_work() self.db.store(idempotency_key, result) return resultBenefits:
- ✅ Safe to retry
- ✅ Prevents duplicate execution
- ✅ Handles network retries
Practice 2: Saga Timeout Management
Section titled “Practice 2: Saga Timeout Management”Pattern: Set timeouts for saga execution.
Example:
class TimedSaga: def execute(self, timeout_seconds=300): start_time = time.time()
for step in self.steps: if time.time() - start_time > timeout_seconds: raise SagaTimeout("Saga exceeded timeout") step.execute()Production Settings:
- Short sagas: 30-60 seconds (API calls)
- Medium sagas: 5-10 minutes (order processing)
- Long sagas: Hours/days (fulfillment workflows)
Practice 3: Saga Monitoring and Alerting
Section titled “Practice 3: Saga Monitoring and Alerting”What to Monitor:
- Saga completion rate: Should be >99%
- Average saga duration: Track P50, P95, P99
- Compensation rate: High rate indicates problems
- Failed sagas: Alert on failures
Production Metrics:
class SagaMetrics: def record_saga(self, saga_id, duration, success): self.metrics.increment('saga.total') self.metrics.histogram('saga.duration', duration)
if success: self.metrics.increment('saga.success') else: self.metrics.increment('saga.failure') self.alert_on_failure(saga_id)Performance Benchmarks: Real-World Numbers
Section titled “Performance Benchmarks: Real-World Numbers”| Pattern | Latency | Throughput | Failure Recovery | Use Case |
|---|---|---|---|---|
| 2PC | 100-500ms | 100-1K TPS | 10-60s | Database clusters |
| Saga (Orchestration) | 50-200ms | 1K-10K TPS | 1-5s | Microservices |
| Saga (Choreography) | 30-150ms | 5K-50K TPS | 1-3s | Event-driven |
| Local Transaction | 5-20ms | 10K-100K TPS | N/A | Single service |
Key Insights:
- Saga is 2-5x faster than 2PC
- Choreography is faster than orchestration (no coordinator overhead)
- Local transactions are 10x faster (no coordination needed)
Key Takeaways
Section titled “Key Takeaways”What’s Next?
Section titled “What’s Next?”Now that you understand distributed transactions, let’s explore how to handle conflicts when multiple operations happen simultaneously:
Next up: Conflict Resolution Strategies — Learn last-write-wins, vector clocks, and CRDTs for handling concurrent updates.