LowLevelDesign Mastery

Why System Design Matters

From single class to global scale

You’ve written a beautiful class. It’s well-designed, follows SOLID principles, and has great test coverage. But software doesn’t run in isolation—it runs on servers, handles thousands of users, and must work 24/7.


System design is the process of defining the architecture, components, and data flow of a system to meet specific requirements. It’s about making decisions that affect:

  • How your code runs - On one server or thousands?
  • How data flows - Synchronous or asynchronous?
  • How failures are handled - What happens when things break?
  • How the system scales - Can it handle 10x more users?

| Aspect | High-Level Design (HLD) | Low-Level Design (LLD) |
| --- | --- | --- |
| Focus | System architecture | Class structure |
| Scope | Multiple services | Single service/module |
| Artifacts | Architecture diagrams | Class diagrams |
| Decisions | Which database? How many servers? | Which pattern? What interface? |
| Scale | Millions of users | Thousands of objects |

Every class you write will eventually run inside a larger system, alongside databases, external APIs, and other services:

order_service.py
class OrderService:
    """Looks simple, but consider the system context..."""

    def __init__(self, db: Database, payment: PaymentGateway, inventory: InventoryService):
        self.db = db                # Which database? Replicated? Sharded?
        self.payment = payment      # External API - what if it's slow?
        self.inventory = inventory  # Another service - what if it's down?

    def place_order(self, order: Order) -> OrderResult:
        # What if this takes 30 seconds?
        # What if 1000 users call this simultaneously?
        # What if the database is in another data center?
        self.inventory.reserve(order.items)                # Network call #1
        payment_result = self.payment.charge(order.total)  # Network call #2
        self.db.save(order)                                # Network call #3
        return OrderResult(success=True, order_id=order.id)

2. Design Decisions Have System Implications


Every LLD decision affects the system:

| LLD Decision | System Implication |
| --- | --- |
| Using Singleton pattern | Won’t work across multiple servers |
| Storing state in instance variables | Can’t scale horizontally |
| Synchronous method calls | Creates coupling, blocks resources |
| In-memory caching | Each server has different cache |
| Auto-increment IDs | Conflicts in distributed databases |
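
To make the last row concrete, here is a minimal sketch of one common workaround: letting each server generate random UUIDs instead of asking a single database for the next number. The function name `new_order_id` is hypothetical, and whether UUIDs suit a given system (they are larger than integers and not ordered by creation time) is an assumption to check against your own requirements.

```python
import uuid


def new_order_id() -> str:
    """Return an ID that any server can generate without coordination."""
    # uuid4 is built from 122 random bits, so independent servers are
    # overwhelmingly unlikely to produce the same value.
    return str(uuid.uuid4())


# Two "servers" generating IDs independently of each other.
ids_from_server_a = {new_order_id() for _ in range(1000)}
ids_from_server_b = {new_order_id() for _ in range(1000)}
assert ids_from_server_a.isdisjoint(ids_from_server_b)
```

With auto-increment keys, two database shards would each hand out 1, 2, 3, and so on, and the rows would collide when merged.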

In senior engineering interviews, expect questions like:

Diagram

Every system design discussion involves these key concerns:

Can the system handle growth?


LLD Impact: Design classes that can work in a distributed environment. Avoid global state, use dependency injection, make components stateless where possible.

Does the system work correctly, even when things fail?

  • Hardware fails (servers crash, disks die)
  • Software has bugs
  • Networks are unreliable
  • Users make mistakes

LLD Impact: Implement proper error handling, use retry patterns, design for idempotency.
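
As an illustration of how retries and idempotency fit together, here is a small sketch. Everything in it is hypothetical: `retry` is a hand-written helper, and `FakePaymentGateway` stands in for a real payment API that accepts an idempotency key. It is meant to show the idea, not a production-ready client.

```python
import time
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], attempts: int = 3, base_delay: float = 0.01) -> T:
    """Call operation, retrying with exponential backoff on ConnectionError."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


class FakePaymentGateway:
    """Hypothetical gateway that deduplicates charges by idempotency key."""

    def __init__(self) -> None:
        self.charges: Dict[str, int] = {}  # idempotency key -> amount charged
        self.calls = 0

    def charge(self, key: str, amount: int) -> str:
        self.calls += 1
        # Record the charge only once per key, so a retried request
        # cannot bill the customer twice.
        self.charges.setdefault(key, amount)
        if self.calls == 1:
            # Simulate a response that is lost after the charge was applied.
            raise ConnectionError("response lost")
        return f"charged:{key}"


gateway = FakePaymentGateway()
result = retry(lambda: gateway.charge("order-42", 500))
```

The first call fails after the charge was recorded, the retry succeeds, and the customer is still charged only once because both calls carry the same key.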

Is the system accessible when users need it?

  • 99.9% uptime = 8.76 hours downtime/year
  • 99.99% uptime = 52.6 minutes downtime/year
  • 99.999% uptime = 5.26 minutes downtime/year
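
The downtime figures above follow from simple arithmetic. As a quick check, assuming a 365-day year (the function name is hypothetical):

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year a system may be down at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(target)
    print(f"{target}% -> {hours:.2f} h = {hours * 60:.1f} min per year")
```

Each extra "nine" cuts the allowed downtime by a factor of ten.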

LLD Impact: Design classes with fallback behaviors, implement circuit breakers, handle graceful degradation.
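
A circuit breaker with a fallback can be sketched in a few lines. This is a deliberately simplified version: the class, its parameters, and the `flaky_recommendations` function are all hypothetical, and real libraries add features such as per-exception rules and metrics.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """After max_failures consecutive errors, skip calls for reset_after seconds."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, operation: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback()  # open: fail fast without calling the dependency
            self.opened_at = None  # cool-down elapsed: allow one trial call
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            return fallback()
        self.failures = 0  # a success closes the circuit again
        return result


attempts = {"count": 0}


def flaky_recommendations() -> list:
    attempts["count"] += 1
    raise TimeoutError("recommendation service is down")


breaker = CircuitBreaker(max_failures=2, reset_after=60.0)
results = [breaker.call(flaky_recommendations, lambda: []) for _ in range(5)]
```

Only the first two requests reach the failing service; the remaining three get the empty fallback immediately, which is the graceful degradation the text describes.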

Can the system be easily modified and operated?

  • New features can be added
  • Bugs can be fixed quickly
  • Operations are simple
  • System is observable

LLD Impact: Follow SOLID principles, write clean code, use design patterns appropriately.

Does the system respond quickly and efficiently?

  • Low latency (fast responses)
  • High throughput (many requests)
  • Efficient resource usage

LLD Impact: Choose appropriate data structures, optimize algorithms, minimize unnecessary operations.


Let’s see how system thinking changes a simple class design:

naive_counter.py
class PageViewCounter:
    """Simple counter - works perfectly on one server"""

    def __init__(self):
        self.counts = {}  # page_id -> count

    def increment(self, page_id: str) -> int:
        if page_id not in self.counts:
            self.counts[page_id] = 0
        self.counts[page_id] += 1
        return self.counts[page_id]

    def get_count(self, page_id: str) -> int:
        return self.counts.get(page_id, 0)


# Usage
counter = PageViewCounter()
counter.increment("homepage")  # 1
counter.increment("homepage")  # 2

Problems with this design:

  • ❌ Data lost if server restarts
  • ❌ Different counts on each server
  • ❌ No persistence
  • ❌ Memory grows unbounded
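
The "different counts on each server" problem is easy to reproduce. In this sketch two objects stand in for two server processes behind a load balancer; the class is a trimmed copy of the counter above, and the variable names are hypothetical.

```python
class PageViewCounter:
    """Same in-memory counter as above, trimmed to the essentials."""

    def __init__(self) -> None:
        self.counts = {}

    def increment(self, page_id: str) -> int:
        self.counts[page_id] = self.counts.get(page_id, 0) + 1
        return self.counts[page_id]


# Each server process constructs its own counter object.
server_a = PageViewCounter()
server_b = PageViewCounter()

# A load balancer sends three homepage views to alternating servers.
server_a.increment("homepage")
server_b.increment("homepage")
server_a.increment("homepage")

print(server_a.counts["homepage"], server_b.counts["homepage"])  # 2 1
```

Three views happened, but neither server knows that: one reports 2 and the other reports 1.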

The key insight is to externalize state to a shared store that all servers can access. This requires:

  1. Abstraction - Define an interface for storage (Dependency Inversion Principle)
  2. Shared State - Use Redis, a database, or similar shared storage
  3. Atomic Operations - Use the Redis INCR command, which is atomic

distributed_counter.py
import redis


class PageViewCounter:
    """Counter that works in distributed systems"""

    def __init__(self, redis_client):
        self.redis = redis_client  # External shared state

    def increment(self, page_id: str) -> int:
        return self.redis.incr(f"pageview:{page_id}")  # Atomic operation

    def get_count(self, page_id: str) -> int:
        # GET returns None for a missing key, so fall back to 0
        return int(self.redis.get(f"pageview:{page_id}") or 0)


# Now works across all servers!
counter = PageViewCounter(redis.Redis(host='redis-cluster'))
counter.increment("homepage")

What changed and why:

| Change | System Design Reason |
| --- | --- |
| Added CounterStorage interface | Decouples from specific storage (DIP) |
| Used Redis instead of in-memory | Shared state across servers |
| Dependency injection | Testable, flexible, swappable |
| Atomic operations (INCR) | Handles concurrent requests |
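
The table mentions a CounterStorage interface that the listing above does not spell out. One way it might look is sketched below. The method names on `CounterStorage` and both implementation classes are assumptions made for illustration; only `incr` and `get` on the injected Redis client come from the listing above.

```python
import threading
from typing import Dict, Protocol


class CounterStorage(Protocol):
    """What PageViewCounter needs from any backing store."""

    def increment(self, key: str) -> int: ...
    def get(self, key: str) -> int: ...


class InMemoryCounterStorage:
    """Single-process store, meant for unit tests only."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}
        self._lock = threading.Lock()  # keeps read-modify-write atomic across threads

    def increment(self, key: str) -> int:
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + 1
            return self._counts[key]

    def get(self, key: str) -> int:
        with self._lock:
            return self._counts.get(key, 0)


class RedisCounterStorage:
    """Shared store backed by a Redis client that is passed in, not created here."""

    def __init__(self, client) -> None:
        self._client = client

    def increment(self, key: str) -> int:
        return self._client.incr(key)  # INCR is atomic on the Redis server

    def get(self, key: str) -> int:
        return int(self._client.get(key) or 0)


class PageViewCounter:
    """Depends only on the CounterStorage interface, not on Redis."""

    def __init__(self, storage: CounterStorage) -> None:
        self._storage = storage

    def increment(self, page_id: str) -> int:
        return self._storage.increment(f"pageview:{page_id}")

    def get_count(self, page_id: str) -> int:
        return self._storage.get(f"pageview:{page_id}")


# Tests can inject the in-memory store; production would inject the Redis one.
counter = PageViewCounter(InMemoryCounterStorage())
counter.increment("homepage")
counter.increment("homepage")
```

Because the counter only sees the interface, swapping Redis for another shared store means writing one new storage class and leaving `PageViewCounter` untouched.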


Now that you understand why system design matters, let’s dive into the first fundamental concept:

Next up: Scalability Fundamentals - Learn how systems grow and the strategies to handle that growth.