Skip to content
Low Level Design Mastery Logo
LowLevelDesign Mastery

Scaling Databases

Making databases handle more load

As your application grows, your database becomes a bottleneck. More users mean more queries, and a single database server can only handle so much.

Diagram

Vertical scaling means making your existing server bigger—more CPU, more RAM, faster disks.

Diagram

Pros:

  • ✅ Simple (just upgrade hardware)
  • ✅ No code changes needed
  • ✅ Works immediately

Cons:

  • ❌ Expensive (bigger servers cost more)
  • ❌ Hardware limits (can’t scale infinitely)
  • ❌ Single point of failure

Horizontal scaling means adding more servers to share the load.

Diagram

Pros:

  • ✅ Can scale infinitely (add more servers)
  • ✅ Cost-effective (cheaper servers)
  • ✅ High availability (if one fails, others continue)

Cons:

  • ❌ More complex (need replication, load balancing)
  • ❌ Requires code changes
  • ❌ Data consistency challenges

Read replicas are copies of the primary database that handle read operations. They replicate data from the primary and distribute read load.

Diagram

Key Characteristics:

  • Primary handles all writes
  • Replicas handle reads
  • Replication is asynchronous (eventual consistency)
  • Load distribution across multiple servers

Read-write splitting routes writes to the primary and reads to replicas.

Diagram

Benefits:

  • ✅ Distributes read load
  • ✅ Primary focuses on writes
  • ✅ Better overall performance

Replication lag is the delay between a write on the primary and when it appears on replicas.

Diagram

Problem: Users might read stale data immediately after writing.

Solution: Read-after-write consistency—read from primary for a short time after writing.


Connection pooling maintains a cache of database connections that can be reused, reducing connection overhead.

Diagram

Connection overhead:

  • Creating connection: ~20ms
  • Authentication: ~10ms
  • Closing connection: ~5ms
  • Total overhead: ~35ms per query

With pooling: Reuse existing connections, overhead: ~1ms


Diagram

Pool Management:

  • Min pool size: Keep minimum connections ready
  • Max pool size: Maximum concurrent connections
  • Idle timeout: Close idle connections after timeout
  • Connection timeout: Wait time if pool is full

How database scaling affects your class design:

Connection Pool
from queue import Queue
from threading import Lock
import psycopg2
from contextlib import contextmanager
class ConnectionPool:
def __init__(self, connection_string, min_size=5, max_size=20):
self.connection_string = connection_string
self.min_size = min_size
self.max_size = max_size
self.pool = Queue(maxsize=max_size)
self.lock = Lock()
self.active_connections = 0
# Initialize pool with min connections
for _ in range(min_size):
conn = psycopg2.connect(connection_string)
self.pool.put(conn)
@contextmanager
def get_connection(self):
conn = None
try:
# Get connection from pool or create new one
if not self.pool.empty():
conn = self.pool.get_nowait()
elif self.active_connections < self.max_size:
conn = psycopg2.connect(self.connection_string)
self.active_connections += 1
else:
# Wait for available connection
conn = self.pool.get(timeout=5)
yield conn
finally:
# Return connection to pool
if conn:
self.pool.put(conn)
def close_all(self):
while not self.pool.empty():
conn = self.pool.get()
conn.close()
Read-Write Splitting
class ReadWriteRouter:
def __init__(self, primary_pool, replica_pools):
self.primary_pool = primary_pool
self.replica_pools = replica_pools
self.replica_index = 0
def execute(self, query, params=None, is_write=False):
# Route writes to primary, reads to replicas
if is_write or self._is_write_query(query):
pool = self.primary_pool
else:
# Round-robin across replicas
pool = self.replica_pools[self.replica_index]
self.replica_index = (self.replica_index + 1) % len(self.replica_pools)
with pool.get_connection() as conn:
cursor = conn.cursor()
cursor.execute(query, params)
if is_write or self._is_write_query(query):
conn.commit()
return cursor.rowcount
else:
return cursor.fetchall()
def _is_write_query(self, query):
query_upper = query.strip().upper()
return query_upper.startswith(('INSERT', 'UPDATE', 'DELETE', 'CREATE', 'DROP', 'ALTER'))


Now that you understand scaling, let’s explore sharding and partitioning to distribute data across multiple databases:

Next up: Sharding & Partitioning — Learn horizontal vs vertical partitioning and how to implement shard-aware repositories.