Skip to content
Low Level Design Mastery Logo
LowLevelDesign Mastery

Distributed Transactions

Coordinating operations across distributed systems

In a single database, transactions are straightforward: all operations succeed or all fail (atomicity). But what happens when your transaction spans multiple services or databases?

Diagram

A distributed transaction is a transaction that spans multiple services or databases. It requires coordination to ensure atomicity: all operations succeed or all fail.

Diagram

The Challenge: How do you ensure all succeed or all fail when services are distributed?


Two-Phase Commit (2PC) is a protocol that coordinates distributed transactions through two phases.

Diagram
  1. Coordinator sends “prepare” message to all participants
  2. Each participant:
    • Locks resources
    • Performs validation
    • Votes YES (ready) or NO (abort)
  3. Participants cannot abort after voting YES (they’re locked)

If all vote YES:

  • Coordinator sends “commit” to all
  • Participants commit and release locks

If any votes NO:

  • Coordinator sends “abort” to all
  • Participants rollback and release locks
Diagram Diagram
ProblemDescriptionImpact
BlockingNodes wait if coordinator failsResources locked indefinitely
SPOFCoordinator is criticalSystem fails if coordinator dies
PerformanceSynchronous, blockingSlow, doesn’t scale
PartitionsDoesn’t handle network splitsCan’t make progress during partitions

The Saga pattern breaks a distributed transaction into a sequence of local transactions. Each step has a compensating action that undoes it.

Diagram

Key Difference from 2PC: Saga uses compensation (undo operations) instead of rollback (database transaction rollback).


Compensation is an operation that undoes the effects of a previous operation. Unlike database rollback, compensation may not perfectly reverse the operation, but makes the system consistent.

Diagram

Example: You can’t “unsend” an email, but you can send an apology email. That’s compensation!


Diagram

Characteristics:

  • ✅ Centralized control (easier to understand)
  • ✅ Easier to monitor and debug
  • ❌ Single point of failure (orchestrator)
  • ❌ Tighter coupling

Diagram

Characteristics:

  • ✅ No single point of failure
  • ✅ Loosely coupled services
  • ❌ Harder to understand flow
  • ❌ Harder to monitor

Diagram

If Step 2 (Charge Payment) fails:

  1. Compensate Step 1: Release reserved inventory
  2. Transaction ends (no further steps)

If Step 4 (Send Notification) fails:

  1. Steps 1-3 already succeeded (can’t undo shipping!)
  2. Send apology notification (compensation)

How distributed transactions affect your class design:

Saga Orchestrator
from enum import Enum
from typing import List, Callable, Optional
class SagaStep:
def __init__(self, name: str, execute: Callable, compensate: Callable):
self.name = name
self.execute = execute
self.compensate = compensate
self.completed = False
class SagaOrchestrator:
def __init__(self):
self.steps: List[SagaStep] = []
def add_step(self, step: SagaStep):
self.steps.append(step)
def execute(self) -> bool:
completed_steps = []
try:
for step in self.steps:
# Execute step
if not step.execute():
raise Exception(f"Step {step.name} failed")
step.completed = True
completed_steps.append(step)
return True # All succeeded
except Exception as e:
# Compensate in reverse order
for step in reversed(completed_steps):
step.compensate()
return False # Transaction failed
Compensation Pattern
class InventoryService:
def reserve(self, product_id: str, quantity: int) -> bool:
# Reserve inventory
# Returns True if successful
pass
def release(self, product_id: str, quantity: int):
# Compensate: Release reserved inventory
pass
class PaymentService:
def charge(self, user_id: str, amount: float) -> bool:
# Charge payment
# Returns True if successful
pass
def refund(self, user_id: str, amount: float):
# Compensate: Refund payment
pass
# Usage in Saga
def create_order_saga():
orchestrator = SagaOrchestrator()
inventory = InventoryService()
payment = PaymentService()
orchestrator.add_step(SagaStep(
"reserve_inventory",
lambda: inventory.reserve("product_123", 1),
lambda: inventory.release("product_123", 1)
))
orchestrator.add_step(SagaStep(
"charge_payment",
lambda: payment.charge("user_456", 99.99),
lambda: payment.refund("user_456", 99.99)
))
return orchestrator.execute()

Diagram
Aspect2PCSaga
ConsistencyStrong (ACID)Eventual
PerformanceSlow (blocking)Fast (non-blocking)
AvailabilityLow (blocking)High (non-blocking)
ComplexityMediumHigh (need compensation)
Use CasesCritical, short transactionsLong-running, distributed

Deep Dive: Production Considerations for Distributed Transactions

Section titled “Deep Dive: Production Considerations for Distributed Transactions”

2PC Performance:

  • Latency: 100-500ms per transaction (synchronous coordination)
  • Throughput: 100-1K transactions/sec (limited by coordinator)
  • Blocking: All participants blocked during prepare phase
  • Failure Recovery: 10-60 seconds (must query all participants)

Saga Performance:

  • Latency: 50-200ms per step (asynchronous execution)
  • Throughput: 1K-10K transactions/sec (no blocking)
  • Non-blocking: Steps execute independently
  • Failure Recovery: 1-5 seconds (compensate only completed steps)

Real-World Benchmark:

  • E-commerce checkout (2PC): ~300ms, 500 TPS
  • E-commerce checkout (Saga): ~150ms, 3K TPS
  • Improvement: Saga is 2x faster, 6x more throughput

Problem: What if compensation itself fails?

Example:

  1. Step 1: Reserve inventory ✅
  2. Step 2: Charge payment ✅
  3. Step 3: Ship order ❌ (fails)
  4. Compensate Step 2: Refund ❌ (fails!)

Solutions:

Solution 1: Retry Compensation

class SagaOrchestrator:
def compensate_with_retry(self, step, max_retries=3):
for attempt in range(max_retries):
try:
step.compensate()
return True
except Exception as e:
if attempt == max_retries - 1:
# Log for manual intervention
self.alert_manual_intervention(step, e)
return False
time.sleep(2 ** attempt) # Exponential backoff

Solution 2: Compensating Transaction Pattern

  • Store compensation intent in database
  • Background job retries failed compensations
  • Benefit: Eventually consistent compensation

Solution 3: Manual Intervention Queue

  • Failed compensations go to queue
  • Operations team handles manually
  • Use case: Critical financial transactions

Problem: Sagas that take hours/days (e.g., order fulfillment).

Challenges:

  • State persistence: Must survive server restarts
  • Timeout handling: Steps may timeout
  • Idempotency: Steps may be retried

Solution: Persistent Saga State

class PersistentSagaOrchestrator:
def __init__(self, db_connection):
self.db = db_connection
def execute_saga(self, saga_id, steps):
# Save saga state
self.db.save_saga_state(saga_id, {
'status': 'running',
'current_step': 0,
'completed_steps': [],
'compensated_steps': []
})
try:
for i, step in enumerate(steps):
# Load state (survives restarts)
state = self.db.load_saga_state(saga_id)
if i in state['completed_steps']:
continue # Already completed (idempotent)
# Execute step
if step.execute():
state['completed_steps'].append(i)
self.db.save_saga_state(saga_id, state)
else:
raise Exception(f"Step {i} failed")
except Exception as e:
# Compensate completed steps
self.compensate_saga(saga_id, state['completed_steps'])

Scenario 3: Saga Orchestration vs Choreography: Production Comparison

Section titled “Scenario 3: Saga Orchestration vs Choreography: Production Comparison”

Orchestration (Central Coordinator):

Pros:

  • Easier to understand: Centralized control flow
  • Better monitoring: Single point to observe
  • Easier debugging: All logic in one place
  • Transaction visibility: Can see entire saga state

Cons:

  • Single point of failure: Coordinator is critical
  • Scalability bottleneck: Coordinator can become overloaded
  • Tighter coupling: Services depend on orchestrator

Production Example: Netflix Conductor

  • Centralized orchestration engine
  • Handles millions of workflows
  • Scaling: Multiple coordinator instances with shared state

Choreography (Event-Driven):

Pros:

  • No single point of failure: Distributed control
  • Better scalability: No coordinator bottleneck
  • Looser coupling: Services communicate via events

Cons:

  • Harder to understand: Distributed control flow
  • Harder to monitor: Must trace events across services
  • Harder to debug: No single place to see state
  • Transaction visibility: Hard to see saga state

Production Example: Event Sourcing + CQRS

  • Events stored in event log
  • Services react to events
  • Monitoring: Event log provides audit trail

Hybrid Approach:

  • Use orchestration for critical, complex workflows
  • Use choreography for simple, high-volume workflows
  • Best of both worlds: Flexibility + visibility

Despite its problems, 2PC is still used in production:

Use Case 1: Database Clusters

  • PostgreSQL: Uses 2PC for distributed transactions
  • MySQL Cluster: Uses 2PC for multi-master replication
  • Why: Database-level coordination, not application-level

Use Case 2: XA Transactions

  • Java EE: JTA (Java Transaction API) uses 2PC
  • Spring: @Transactional with XA datasources
  • Why: Standard protocol, well-understood

Use Case 3: Short-Lived Transactions

  • Microsecond transactions: 2PC overhead acceptable
  • Low concurrency: Blocking not a problem
  • Why: Simpler than Saga for simple cases

Production Pattern:

// Use 2PC for simple, fast transactions
@Transactional(rollbackFor = Exception.class)
public void transferMoney(Account from, Account to, double amount) {
from.debit(amount);
to.credit(amount);
// 2PC handles coordination
}

Pattern: Retry failed step instead of compensating.

When to use:

  • Transient failures: Network hiccups, temporary unavailability
  • Idempotent operations: Safe to retry

Example:

class RetrySagaStep:
def execute(self, max_retries=3):
for attempt in range(max_retries):
try:
return self.do_work()
except TransientError as e:
if attempt == max_retries - 1:
raise # Give up after retries
time.sleep(2 ** attempt) # Exponential backoff

Pattern 2: Backward Recovery (Compensation)

Section titled “Pattern 2: Backward Recovery (Compensation)”

Pattern: Compensate completed steps (standard Saga).

When to use:

  • Permanent failures: Business logic errors
  • Non-idempotent operations: Can’t safely retry

Example: Already covered in main content


Pattern: Compensations expire after time limit.

Use case: Prevent stale compensations from running.

Example:

class TimedCompensation:
def __init__(self, ttl_seconds=3600):
self.ttl = ttl_seconds
self.created_at = time.time()
def compensate(self):
if time.time() - self.created_at > self.ttl:
raise CompensationExpired("Too much time has passed")
# Proceed with compensation

Pattern: Use idempotency keys to prevent duplicate execution.

Example:

class IdempotentSagaStep:
def execute(self, idempotency_key):
# Check if already executed
if self.db.exists(idempotency_key):
return self.db.get_result(idempotency_key)
# Execute and store result
result = self.do_work()
self.db.store(idempotency_key, result)
return result

Benefits:

  • ✅ Safe to retry
  • ✅ Prevents duplicate execution
  • ✅ Handles network retries

Pattern: Set timeouts for saga execution.

Example:

class TimedSaga:
def execute(self, timeout_seconds=300):
start_time = time.time()
for step in self.steps:
if time.time() - start_time > timeout_seconds:
raise SagaTimeout("Saga exceeded timeout")
step.execute()

Production Settings:

  • Short sagas: 30-60 seconds (API calls)
  • Medium sagas: 5-10 minutes (order processing)
  • Long sagas: Hours/days (fulfillment workflows)

What to Monitor:

  • Saga completion rate: Should be >99%
  • Average saga duration: Track P50, P95, P99
  • Compensation rate: High rate indicates problems
  • Failed sagas: Alert on failures

Production Metrics:

class SagaMetrics:
def record_saga(self, saga_id, duration, success):
self.metrics.increment('saga.total')
self.metrics.histogram('saga.duration', duration)
if success:
self.metrics.increment('saga.success')
else:
self.metrics.increment('saga.failure')
self.alert_on_failure(saga_id)

Performance Benchmarks: Real-World Numbers

Section titled “Performance Benchmarks: Real-World Numbers”
PatternLatencyThroughputFailure RecoveryUse Case
2PC100-500ms100-1K TPS10-60sDatabase clusters
Saga (Orchestration)50-200ms1K-10K TPS1-5sMicroservices
Saga (Choreography)30-150ms5K-50K TPS1-3sEvent-driven
Local Transaction5-20ms10K-100K TPSN/ASingle service

Key Insights:

  • Saga is 2-5x faster than 2PC
  • Choreography is faster than orchestration (no coordinator overhead)
  • Local transactions are 10x faster (no coordination needed)


Now that you understand distributed transactions, let’s explore how to handle conflicts when multiple operations happen simultaneously:

Next up: Conflict Resolution Strategies — Learn last-write-wins, vector clocks, and CRDTs for handling concurrent updates.