Scalability Fundamentals

Building systems that grow with your success

Scalability is a system’s ability to handle increased load by adding resources. A scalable system can grow to accommodate more users, more data, or more transactions without degrading performance.

Vertical scaling (scaling up): add more power to existing machines, such as a bigger CPU, more RAM, or faster disks.

Pros:

  • ✅ Simple - no code changes needed
  • ✅ No distributed system complexity
  • ✅ Strong consistency is easy

Cons:

  • ❌ Hardware limits (can’t add infinite CPU)
  • ❌ Expensive at high end
  • ❌ Single point of failure
  • ❌ Downtime during upgrades

Horizontal scaling (scaling out): add more machines and distribute the load across multiple servers.

Pros:

  • ✅ No single-machine hardware ceiling (keep adding machines as load grows)
  • ✅ Cost-effective (use commodity hardware)
  • ✅ Built-in redundancy
  • ✅ Gradual scaling

Cons:

  • ❌ Distributed system complexity
  • ❌ Data consistency challenges
  • ❌ Code must be designed for it
  • ❌ Network overhead
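
To make "distribute the load across multiple servers" concrete, here is a minimal sketch of round-robin request distribution. The `RoundRobinBalancer` class and the server names are hypothetical and purely illustrative; real load balancers also handle health checks, weights, and connection draining.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in rotation so each gets an equal share of requests."""

    def __init__(self, servers: list[str]):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)  # endless iterator over the server list

    def next_server(self) -> str:
        return next(self._servers)


# Hypothetical server names; scaling out means adding entries to this list.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.next_server() for _ in range(6)]
print(assignments)
# ['app-1', 'app-2', 'app-3', 'app-1', 'app-2', 'app-3']
```

Because consecutive requests from the same user can land on different servers, the services behind the balancer cannot rely on anything held in local memory.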

| Aspect | Vertical Scaling | Horizontal Scaling |
| --- | --- | --- |
| Approach | Bigger machine | More machines |
| Limit | Hardware ceiling | Theoretically unlimited |
| Cost | Expensive at scale | Cost-effective |
| Complexity | Simple | Complex |
| Downtime | Required for upgrades | Zero-downtime possible |
| Failure | Single point of failure | Redundancy built-in |
| Code changes | Usually none | May require redesign |

This is where low-level design (LLD) meets high-level design (HLD). Your class design determines whether your system can scale horizontally.

stateful_service.py

```python
class ShoppingCartService:
    """❌ NOT horizontally scalable - stores state in memory"""

    def __init__(self):
        self.carts = {}  # user_id -> cart items

    def add_item(self, user_id: str, item: str):
        if user_id not in self.carts:
            self.carts[user_id] = []
        self.carts[user_id].append(item)

    def get_cart(self, user_id: str) -> list:
        return self.carts.get(user_id, [])

# Problem: If user's next request goes to a different server,
# their cart is empty!
```
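
The failure mode can be reproduced in a few lines. This is a sketch only: two instances of the stateful class above stand in for two servers behind a load balancer, and the `server_a` / `server_b` names and the user ID are made up for illustration.

```python
class ShoppingCartService:
    """Same stateful design as above: carts live in this instance's memory."""

    def __init__(self):
        self.carts: dict[str, list[str]] = {}

    def add_item(self, user_id: str, item: str) -> None:
        self.carts.setdefault(user_id, []).append(item)

    def get_cart(self, user_id: str) -> list[str]:
        return self.carts.get(user_id, [])


# Two instances simulate two servers (hypothetical setup).
server_a = ShoppingCartService()
server_b = ShoppingCartService()

server_a.add_item("user-42", "book")            # first request lands on server A
cart_seen_by_b = server_b.get_cart("user-42")   # next request lands on server B

print(server_a.get_cart("user-42"))  # ['book']
print(cart_seen_by_b)                # []
```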

The solution is to externalize all state to a shared store (Redis, database, etc.):

  • No instance variables holding user/session data
  • All state lives externally in Redis, database, or similar
  • Any server can handle any request because they all access the same shared state

stateless_service.py

```python
class ShoppingCartService:
    """✅ Horizontally scalable - no local state"""

    def __init__(self, redis_client):
        self.redis = redis_client  # External storage

    def add_item(self, user_id: str, item: str) -> None:
        self.redis.rpush(f"cart:{user_id}", item)

    def get_cart(self, user_id: str) -> list:
        # Note: redis-py returns bytes unless the client was created
        # with decode_responses=True.
        return self.redis.lrange(f"cart:{user_id}", 0, -1)

# Any server can handle any request!
```
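
To see the stateless version working without a running Redis server, the sketch below swaps in a small in-memory stand-in. `InMemoryListStore` is a hypothetical class written for this example; it mimics only the `rpush` and `lrange` calls used by the service and is not part of any Redis library.

```python
from collections import defaultdict


class InMemoryListStore:
    """Hypothetical stand-in for a shared Redis server (rpush/lrange only)."""

    def __init__(self):
        self._lists: dict[str, list[str]] = defaultdict(list)

    def rpush(self, key: str, value: str) -> int:
        self._lists[key].append(value)
        return len(self._lists[key])

    def lrange(self, key: str, start: int, end: int) -> list[str]:
        items = self._lists.get(key, [])
        # Redis treats `end` as inclusive, and -1 means "through the last element".
        stop = len(items) if end == -1 else end + 1
        return items[start:stop]


class ShoppingCartService:
    """Stateless service: every read and write goes to the shared store."""

    def __init__(self, redis_client):
        self.redis = redis_client

    def add_item(self, user_id: str, item: str) -> None:
        self.redis.rpush(f"cart:{user_id}", item)

    def get_cart(self, user_id: str) -> list:
        return self.redis.lrange(f"cart:{user_id}", 0, -1)


shared_store = InMemoryListStore()              # one store shared by all servers
server_a = ShoppingCartService(shared_store)
server_b = ShoppingCartService(shared_store)

server_a.add_item("user-42", "book")            # request handled by server A
cart_seen_by_b = server_b.get_cart("user-42")   # next request handled by server B
print(cart_seen_by_b)  # ['book']
```

Passing the client into the constructor also makes the service easy to test, since a fake store can replace the real one.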

Systems can be scaled along different dimensions:

  • Handle more requests per second
  • Handle more data
  • Serve users globally with low latency

Key Principle (stateless services): All state lives externally (database, cache, message queue). The service itself stores nothing between requests.

This allows you to run any number of service instances and route requests to any of them.

Key Principle (idempotency): Design operations so they can be safely retried without causing additional side effects, such as charging a customer twice. This is critical in distributed systems, where network failures cause retries.

Implementation: Store results keyed by a unique idempotency key. Before processing, check if the key exists and return the cached result.
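
A minimal sketch of that check-then-store flow follows. The `IdempotentProcessor` class, the `charge_customer` function, and the key format are hypothetical. Results are kept in a plain dictionary for brevity; in a multi-server deployment they would need to live in a shared store, and the check and the store would need to be atomic.

```python
from typing import Callable


class IdempotentProcessor:
    """Runs an operation at most once per idempotency key."""

    def __init__(self):
        # Assumption: in-memory dict for illustration; use a shared store in practice.
        self._results: dict[str, dict] = {}

    def process(self, idempotency_key: str, operation: Callable[[], dict]) -> dict:
        if idempotency_key in self._results:
            # Retry of a request we already handled: return the cached result.
            return self._results[idempotency_key]
        result = operation()
        self._results[idempotency_key] = result
        return result


charges: list[int] = []  # records every real charge so duplicates are visible


def charge_customer() -> dict:
    charges.append(500)
    return {"status": "charged", "amount": 500}


processor = IdempotentProcessor()
first = processor.process("order-123", charge_customer)
retry = processor.process("order-123", charge_customer)  # simulated network retry

print(first == retry, len(charges))  # True 1
```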

Key Principle (asynchronous processing): Move slow, non-critical operations out of the request path using message queues.

Result: Users get fast responses. Slow operations (email, analytics, notifications) happen in the background without blocking.
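
The pattern can be sketched with Python's standard library, using `queue.Queue` and a worker thread as a stand-in for a real message queue and consumer. The `place_order` function and the `sent_emails` list are hypothetical names used only for this illustration.

```python
import queue
import threading

task_queue = queue.Queue()     # stands in for a message queue
sent_emails: list[str] = []    # records what the background worker processed


def worker() -> None:
    """Background consumer: handles slow work outside the request path."""
    while True:
        email = task_queue.get()
        if email is None:              # sentinel value: shut down the worker
            task_queue.task_done()
            break
        sent_emails.append(email)      # placeholder for a slow email send
        task_queue.task_done()


def place_order(user_email: str) -> dict:
    """Request handler: enqueue the slow step and respond immediately."""
    task_queue.put(user_email)
    return {"status": "order accepted"}


worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

response = place_order("user@example.com")

# For the demo only: wait for the background work, then stop the worker.
task_queue.join()
task_queue.put(None)
worker_thread.join()

print(response)     # {'status': 'order accepted'}
print(sent_emails)  # ['user@example.com']
```

An in-process queue loses its contents if the process dies, which is one reason production systems typically use a durable, external message queue instead.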


When designing classes, ask yourself:

| Question | Why It Matters |
| --- | --- |
| Does this class store state in instance variables? | Prevents horizontal scaling |
| Can multiple instances run simultaneously? | Required for scaling out |
| Are operations idempotent? | Enables safe retries |
| What happens if this operation is slow? | May need async processing |
| Does this depend on local resources (files, memory)? | Won’t work across servers |
| How does this handle concurrent requests? | Thread safety concerns |


Understanding scalability is just the beginning. Next, we’ll dive into measuring system performance:

Next up: Latency and Throughput - Learn the key metrics that define system performance.