---
name: system-design
description: Structured approach to designing scalable distributed systems using Alex Xu's methodology
version: 1.0.0
author: wondelai (adapted)
platforms: [claude-code, cursor]
license: MIT
source: https://github.com/wondelai/skills
based_on: "System Design Interview by Alex Xu"
---

# System Design

Apply a systematic framework for designing large-scale distributed systems.

## Scoring System
Rate the design 0-10 on scalability, reliability, and completeness.
Show: current score → gaps → improved design → trade-offs.

## The 4-Step Framework

### Step 1: Understand the Problem (5-10 min)
Ask clarifying questions before designing anything:
- **Scale**: DAU, QPS, data size, read/write ratio
- **Scope**: which features to include
- **Constraints**: latency requirements, consistency model, budget

**Back-of-envelope estimates:**
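- Average QPS ≈ DAU × requests per user per day ÷ 86,400; peak QPS is commonly assumed to be 2-10× average (1M DAU at 10 requests/day ≈ 116 QPS average)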
- 1 server handles ~10K-100K QPS for lightweight requests; often far less for CPU- or I/O-heavy work
- SQL DB: ~1K-10K QPS; Redis: ~100K-1M QPS
- 1 compressed image ≈ 300 KB; 1 min of compressed video ≈ 50 MB (varies with resolution and bitrate)
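
The estimates above can be turned into a small calculator. This is a minimal sketch: the `estimate` function, its parameters, the peak factor of 2, and the sample workload are illustrative assumptions, not figures from the source.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Estimate:
    avg_qps: float
    peak_qps: float
    storage_gb_per_day: float


def estimate(dau: int, requests_per_user: float, write_fraction: float,
             bytes_per_write: int, peak_factor: float = 2.0) -> Estimate:
    """Rough capacity estimate; every input is an assumption to state out loud."""
    avg_qps = dau * requests_per_user / SECONDS_PER_DAY
    writes_per_day = dau * requests_per_user * write_fraction
    storage_gb = writes_per_day * bytes_per_write / 1e9  # decimal GB
    return Estimate(avg_qps, avg_qps * peak_factor, storage_gb)


# Hypothetical workload: 10M DAU, 20 requests each, 10% are 300 KB image uploads.
e = estimate(dau=10_000_000, requests_per_user=20, write_fraction=0.1,
             bytes_per_write=300_000)
print(f"avg {e.avg_qps:,.0f} QPS, peak {e.peak_qps:,.0f} QPS, "
      f"{e.storage_gb_per_day:,.0f} GB/day")
```

Stating the inputs explicitly makes it easy to redo the numbers when an assumption such as the peak factor or write fraction changes.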

### Step 2: High-Level Design
Draw the major components:
```
Client → CDN → Load Balancer → API Gateway → Services → Database
                                    ↓
                              Message Queue → Workers → Object Storage
```

### Step 3: Deep Dive
Focus on the hard parts:
- **Database**: SQL vs NoSQL, sharding strategy, replication
- **Caching**: where (CDN, Redis, in-memory), TTL, eviction policy
- **API Design**: REST vs GraphQL vs gRPC
- **Consistency**: strong vs eventual, CAP theorem implications
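
The caching bullet (where, TTL, eviction policy) can be made concrete with a cache-aside read path. This is a minimal sketch, assuming an in-process LRU cache with a fixed TTL as a stand-in for Redis; `TTLCache`, `read_through`, and `load_from_db` are hypothetical names.

```python
import time
from collections import OrderedDict
from typing import Callable


class TTLCache:
    """In-memory LRU cache with a per-entry TTL (a stand-in for Redis or similar)."""

    def __init__(self, capacity: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock
        self._data: OrderedDict = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._data[key]          # expired: treat as a miss
            return None
        self._data.move_to_end(key)      # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self.clock() + self.ttl, value)
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry


def read_through(cache: TTLCache, key, load_from_db):
    """Cache-aside read: try the cache, fall back to the database, then populate.

    A stored value of None is indistinguishable from a miss in this sketch.
    """
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.put(key, value)
    return value
```

A short TTL bounds staleness under eventual consistency, while LRU eviction bounds memory; the two knobs are independent and worth discussing separately.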

### Step 4: Wrap Up — Handle Edge Cases
- Single points of failure
- Hotspots (celebrity problem)
- Geographic distribution
- Monitoring and alerting
- Graceful degradation
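
Graceful degradation is often implemented with a circuit breaker that serves a fallback while a dependency is failing. The sketch below is one simple variant under assumed thresholds; `CircuitBreaker`, `primary`, and `fallback` are hypothetical names, not part of the source.

```python
import time


class CircuitBreaker:
    """Stop calling a failing dependency for a cool-down period; serve a fallback."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, primary, fallback):
        # While open, skip the primary until the cool-down has elapsed.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0  # a success closes the circuit fully
        return result
```

A typical fallback returns cached or default data (for example, a non-personalized feed), so the user sees a degraded page rather than an error.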

## Key Patterns

| Problem | Pattern |
|---------|---------|
| High read load | Read replicas + cache |
| Large data set | Horizontal sharding |
| Multiple services | API Gateway |
| Async processing | Message queue (Kafka/RabbitMQ) |
| Static content | CDN |
| Real-time | WebSockets or Server-Sent Events |
| Global users | Multi-region + GeoDNS |
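
For the horizontal sharding row, one common way to map keys to shards is consistent hashing, which limits how many keys move when a shard is added. The table does not name a scheme, so treat this as an illustrative sketch; `HashRing`, the `db-N` node names, and the virtual-node count are assumptions.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # First 8 bytes of SHA-256 as a 64-bit position on the ring.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes to smooth out the key distribution."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [h for h, _ in self._ring]

    def node_for(self, key: str) -> str:
        # The first ring point clockwise from the key's hash owns the key.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["db-0", "db-1", "db-2"])
print(ring.node_for("user:42"))
```

When a fourth node joins, only the keys that land on its ring segments move, roughly a quarter of the data, whereas `hash(key) % N` would remap most keys.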

## Numbers to Know
- Latency: L1 cache 0.5 ns; L2 cache 7 ns; RAM 100 ns; SSD random read 150 μs; same-datacenter round trip 0.5 ms; cross-continent round trip 150 ms
- Throughput: SSD ~500 MB/s sequential (SATA class); 1 Gbps network ≈ 125 MB/s
- Storage cost (approximate; varies by region and tier): ~$0.023/GB/month (S3 Standard), ~$0.10/GB/month (SSD block storage)
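
These numbers are most useful when plugged into quick arithmetic. The sketch below uses the throughput and cost figures listed above; the function names and the 1 GB and 10 TB sample sizes are illustrative assumptions.

```python
SSD_MB_PER_S = 500             # sequential SSD throughput, MB/s
NETWORK_GBIT_PER_S = 1         # network link speed, Gbit/s
S3_USD_PER_GB_MONTH = 0.023
SSD_USD_PER_GB_MONTH = 0.10


def transfer_seconds(size_gb: float) -> dict:
    """Time to move size_gb (decimal GB) at the listed throughputs."""
    size_mb = size_gb * 1000
    network_mb_per_s = NETWORK_GBIT_PER_S * 1000 / 8  # 1 Gbit/s = 125 MB/s
    return {"ssd": size_mb / SSD_MB_PER_S, "network": size_mb / network_mb_per_s}


def monthly_cost_usd(size_gb: float) -> dict:
    """Monthly storage cost at the listed per-GB prices."""
    return {"s3": size_gb * S3_USD_PER_GB_MONTH,
            "ssd": size_gb * SSD_USD_PER_GB_MONTH}


print(transfer_seconds(1))        # 1 GB: 2 s from SSD, 8 s over a 1 Gbps link
print(monthly_cost_usd(10_000))   # 10 TB: roughly $230 on S3 vs $1,000 on SSD
```

The roughly 4× gaps in both transfer time and storage cost are the kind of ratio worth quoting when justifying a storage tier or a data-locality decision.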