Scaling Databases

Making databases handle more load

Why Scale Databases?

As your application grows, your database becomes a bottleneck. More users mean more queries, and a single database server can only handle so much.

Scaling Strategies

Vertical Scaling (Scale Up)

Vertical scaling means making your existing server bigger—more CPU, more RAM, faster disks.

Pros:

Simple (just upgrade hardware)
No code changes needed
Works immediately

Cons:

Expensive (bigger servers cost more)
Hardware limits (can’t scale infinitely)
Single point of failure

Horizontal Scaling (Scale Out)

Horizontal scaling means adding more servers to share the load.

Pros:

Can scale infinitely (add more servers)
Cost-effective (cheaper servers)
High availability (if one fails, others continue)

Cons:

More complex (need replication, load balancing)
Requires code changes
Data consistency challenges

Real-World Examples

Major companies use different scaling strategies based on their needs:

Scale Up: Small to Medium Applications

The Challenge: A growing SaaS application needs more database capacity but wants to keep architecture simple.

The Solution: Many startups use scale-up initially:

Start: Single database server (4 CPU, 16GB RAM)
Growth: Upgrade to larger server (16 CPU, 64GB RAM)
Benefit: Simple, no code changes, immediate improvement

Example: A B2B SaaS application:

Year 1: 1,000 users → Single server handles load
Year 2: 10,000 users → Upgrade to 4x larger server
Year 3: 50,000 users → Upgrade to 16x larger server

Impact: Handled 50x growth with scale-up. Simple operations. Cost-effective for moderate scale.

Limitation: Eventually hits hardware limits. At 100,000+ users, need scale-out.

Scale Out: Large-Scale Applications

The Challenge: Large applications need to scale beyond single server limits. Need high availability and fault tolerance.

The Solution: Major platforms use scale-out:

Facebook: Thousands of MySQL servers, read replicas, sharding
Amazon: Distributed databases, read replicas across regions
Netflix: Multiple database clusters, read replicas for global access

Example: E-commerce platform scaling:

Start: Single database (handles 1,000 orders/day)
Growth: Add read replicas (handles 10,000 orders/day)
Scale: Shard database (handles 1,000,000 orders/day)

Impact: Scales to millions of users. High availability (survives server failures). Global distribution.

Read Replicas: Instagram’s Architecture

The Challenge: Instagram serves billions of photo views daily. Most operations are reads (viewing photos, profiles, feeds).

The Solution: Instagram uses extensive read replicas:

Primary: Handles writes (photo uploads, comments, likes)
Replicas: Handle reads (photo views, profile views, feed generation)
Scale: 100+ read replicas worldwide

Why Read Replicas?

99% of operations are reads
Read replicas distribute load
Primary focuses on writes

Impact: Handles billions of reads daily. Primary not overloaded. Fast read performance globally.

Connection Pooling: High-Traffic APIs

The Challenge: APIs handling thousands of requests per second. Creating database connections is expensive (50ms per connection).

The Solution: Major APIs use connection pooling:

GitHub: Connection pool of 100 connections, reused across requests
Stripe: Connection pool per API server, handles millions of requests
Twitter: Connection pooling essential for high throughput

Example: API handling 10,000 requests/second:

Without pool: 10,000 connections × 50ms = 500 seconds overhead
With pool: 100 connections reused, 5ms per request = 50 seconds overhead
10x improvement

Impact: Reduced connection overhead by 90%. Handles 10x more requests. Lower latency.

Read-Write Splitting: E-commerce Platforms

The Challenge: E-commerce platforms have read-heavy workloads (product browsing) and write operations (orders, inventory updates).

The Solution: E-commerce platforms use read-write splitting:

Writes: Go to primary (orders, inventory updates)
Reads: Go to replicas (product catalog, search, recommendations)

Example: Product page load:

Read: Product data from replica (fast, doesn’t block writes)
Write: Order creation goes to primary (ensures consistency)

Impact: Product pages load faster. Order processing not blocked by reads. Better user experience.

Read Replicas

Read replicas are copies of the primary database that handle read operations. They replicate data from the primary and distribute read load.

How Read Replicas Work

Key Characteristics:

Primary handles all writes
Replicas handle reads
Replication is asynchronous (eventual consistency)
Load distribution across multiple servers

Read-Write Splitting

Read-write splitting routes writes to the primary and reads to replicas.

Benefits:

Distributes read load
Primary focuses on writes
Better overall performance

Replication Lag

Replication lag is the delay between a write on the primary and when it appears on replicas.

Problem: Users might read stale data immediately after writing.

Solution: Read-after-write consistency—read from primary for a short time after writing.

Connection Pooling

Connection pooling maintains a cache of database connections that can be reused, reducing connection overhead.

The Problem: Connection Overhead

Connection overhead:

Creating connection: ~20ms
Authentication: ~10ms
Closing connection: ~5ms
Total overhead: ~35ms per query

With pooling: Reuse existing connections, overhead: ~1ms

How Connection Pooling Works

Pool Management:

Min pool size: Keep minimum connections ready
Max pool size: Maximum concurrent connections
Idle timeout: Close idle connections after timeout
Connection timeout: Wait time if pool is full

LLD ↔ HLD Connection

How database scaling affects your class design:

Connection Pool Implementation

Read-Write Splitting Implementation

Key Takeaways

What’s Next?

Now that you understand scaling, let’s explore sharding and partitioning to distribute data across multiple databases:

Next up: Sharding & Partitioning — Learn horizontal vs vertical partitioning and how to implement shard-aware repositories.

Request a feature or report an issue