CDN
Designing Content Delivery Networks for optimal performance
Overview
A Content Delivery Network (CDN) is a geographically distributed system of cache servers that delivers web content to users with high availability and low latency. CDNs reduce load on origin servers, decrease latency by serving content from edge locations close to users, and improve reliability through replication and routing.
Why use a CDN?
- Reduced latency: Content is served from edge servers geographically closer to users
- Lower origin load: Static assets are offloaded from your origin servers
- High availability: Distributed architecture provides redundancy and fault tolerance
- DDoS protection: CDNs can absorb and mitigate large-scale attacks
- Bandwidth savings: Caching reduces repeated data transfer from origin
How CDNs work
User Request → DNS Resolution → Nearest Edge Server → Cache Hit/Miss → Response
                                                            ↓
                                                        Cache Miss
                                                            ↓
                                                      Origin Server
- DNS routing: The user's DNS query resolves to a nearby edge server via GeoDNS, or Anycast routes the request to one
- Edge caching: CDN caches static assets at edge servers in multiple regions
- Cache miss flow: On a cache miss, the edge fetches the object from origin, caches it, then returns it to the user
- Request routing: CDNs use DNS-based routing, Anycast, or application-layer routing
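The hit/miss flow above can be sketched as a small pull-through cache. This is a minimal illustration, not how any particular CDN implements its edge: `EdgeCache`, `fetch_from_origin`, and the injectable `clock` are hypothetical names introduced here, and a single fixed TTL is assumed for every object.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy pull-through cache: serve hits locally, fetch misses from origin."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin  # hypothetical origin client
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (body, expiry timestamp)
        self._store: Dict[str, Tuple[bytes, float]] = {}

    def get(self, key: str) -> Tuple[bytes, str]:
        """Return (body, "HIT" or "MISS") for the requested key."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            return entry[0], "HIT"
        # Cache miss (or expired entry): go to origin, store, then respond.
        body = self._fetch(key)
        self._store[key] = (body, now + self._ttl)
        return body, "MISS"
```

The first request for a key reports `MISS` and populates the cache; later requests within the TTL report `HIT` without touching origin.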
Types of content served
| Content Type | Examples | Cacheability |
|---|---|---|
| Static assets | Images, CSS, JS, fonts | Highly cacheable |
| Media | Video/audio (HLS/DASH segments) | Highly cacheable |
| API responses | GET requests, public data | Conditionally cacheable |
| Dynamic content | Personalized pages | Edge compute or no-cache |
CDN architectures
Push CDN
Content is explicitly uploaded to the CDN before users request it.
Best for: Video platforms, software distribution, large media files
Origin Server ---(upload)---> CDN Storage ---> Edge Servers ---> Users
Pull CDN (Origin-Pull)
CDN fetches content from origin on first request and caches it.
Best for: Websites, APIs, frequently changing content
User ---> Edge Server ---(cache miss)---> Origin Server
              ↓
       (cache & serve)
              ↓
            User
Multi-CDN
Using multiple CDN providers for redundancy, performance, or cost optimization.
         ┌─── CDN A (Primary)
Traffic ─┼─── CDN B (Failover)
         └─── CDN C (Regional)
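One way to picture the failover part of a multi-CDN setup is a priority-ordered selection over health-checked providers. The sketch below assumes health status is produced elsewhere (for example by synthetic probes); `CdnStatus` and `pick_cdn` are hypothetical names, and real traffic steering usually also weighs latency, region, and cost.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CdnStatus:
    name: str
    healthy: bool
    priority: int  # lower value = preferred


def pick_cdn(statuses: Iterable[CdnStatus]) -> Optional[str]:
    """Return the name of the healthy CDN with the lowest priority value."""
    healthy = [s for s in statuses if s.healthy]
    if not healthy:
        return None  # nothing healthy: caller decides how to degrade
    return min(healthy, key=lambda s: s.priority).name
```

With CDN A marked unhealthy, the function falls through to the next healthy provider in priority order.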
Caching strategies
Cache-Control headers
Cache-Control: public, max-age=31536000, immutable
Cache-Control: private, no-cache
Cache-Control: public, max-age=3600, stale-while-revalidate=86400
Key caching concepts
- TTL (Time-to-Live): How long content stays cached before revalidation
- Cache invalidation/Purge: Force removal of cached content when origin changes
- Stale-while-revalidate: Serve stale content while fetching fresh copy in background
- Stale-if-error: Serve stale content if origin is unavailable
- Cache hierarchy: Edge → Regional → Origin (reduces origin load)
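To make the interaction between TTL, stale-while-revalidate, and stale-if-error concrete, here is a rough decision function. It is a simplified model under stated assumptions: all values are in seconds, `origin_available` stands in for the outcome of an origin fetch attempt, and the `Action` names are invented for this illustration.

```python
from enum import Enum


class Action(Enum):
    SERVE_FRESH = "serve_fresh"
    SERVE_STALE_AND_REVALIDATE = "serve_stale_and_revalidate"
    FETCH_FROM_ORIGIN = "fetch_from_origin"
    SERVE_STALE_ON_ERROR = "serve_stale_on_error"
    RETURN_ERROR = "return_error"


def decide(
    age: float,
    max_age: float,
    stale_while_revalidate: float,
    stale_if_error: float,
    origin_available: bool,
) -> Action:
    """Pick what a cache does with a stored response of the given age."""
    if age < max_age:
        return Action.SERVE_FRESH
    staleness = age - max_age  # how long the entry has been stale
    if staleness < stale_while_revalidate:
        # Answer immediately from cache; refresh in the background.
        return Action.SERVE_STALE_AND_REVALIDATE
    if origin_available:
        return Action.FETCH_FROM_ORIGIN
    if staleness < stale_if_error:
        return Action.SERVE_STALE_ON_ERROR
    return Action.RETURN_ERROR
```

For the third header shown earlier (`max-age=3600, stale-while-revalidate=86400`), an entry that is 4000 seconds old would be served stale while a background refresh runs.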
Cache key design
Cache Key = URL + Vary Headers + Query Params (selective)
Example:
/api/products?category=shoes
Vary: Accept-Language, Accept-Encoding
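The formula above can be sketched as a function that keeps only allow-listed query parameters and the request headers named in `Vary`. This is an illustrative assumption about key construction, not any provider's actual scheme: `build_cache_key` and `allowed_params` are hypothetical, and real CDNs typically also normalize header values such as `Accept-Encoding`.

```python
import hashlib
from typing import Iterable, Mapping
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_cache_key(
    url: str,
    request_headers: Mapping[str, str],
    vary_headers: Iterable[str],
    allowed_params: Iterable[str],
) -> str:
    """Build a stable key from path, selected query params, and Vary headers."""
    parts = urlsplit(url)
    allowed = set(allowed_params)
    # Keep only allow-listed params and sort them so ordering does not matter.
    kept = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k in allowed
    )
    # Header names are case-insensitive, so compare them in lowercase.
    headers = {name.lower(): value.strip() for name, value in request_headers.items()}
    vary = [
        f"{name}={headers.get(name, '')}"
        for name in sorted(h.lower() for h in vary_headers)
    ]
    raw = "|".join([parts.path, urlencode(kept), *vary])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Dropping tracking parameters such as `utm_source` from the key keeps equivalent requests from fragmenting the cache, while a different `Accept-Language` value yields a separate entry.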
Performance optimizations
- Edge proximity: Serve from servers closest to users (reduces RTT)
- Connection reuse: Keep-alive connections, HTTP/2 multiplexing
- Protocol optimizations: HTTP/2, HTTP/3 (QUIC), TLS 1.3
- Compression: Gzip, Brotli at the edge
- Image optimization: WebP/AVIF conversion, resizing at edge
- Prefetching: Predictive content loading
Consistency considerations
CDNs trade immediate consistency for availability and performance.
| Strategy | Consistency | Performance |
|---|---|---|
| Short TTL (seconds) | Higher | Lower cache hits |
| Long TTL + Purge | Lower | Higher cache hits |
| Versioned URLs | Immediate | Best cache hits |
| Cache tags | Selective | Good balance |
Best practice: Use versioned/fingerprinted URLs for assets:
/static/app.a1b2c3d4.js ← Cache forever, new version = new URL
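A fingerprinted name like the one above is usually derived from a hash of the file contents at build time. The helper below is a minimal sketch of that idea; `fingerprinted_path` is a hypothetical name, and the choice of SHA-256 truncated to 8 hex characters is an assumption rather than a standard.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_path(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    # "/static/app.js" -> "/static/app.<digest>.js"
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Because the URL changes whenever the content changes, the asset can be served with a long `max-age` and `immutable` without needing a purge.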
Security features
- DDoS mitigation: Absorb volumetric attacks across distributed edge
- TLS termination: Terminate TLS at edge, reduce origin CPU
- WAF (Web Application Firewall): Filter malicious requests at edge
- Bot management: Detect and block automated threats
- Origin shielding: Route cache misses through a shield tier and keep the origin IP hidden, reducing direct attacks
- Token authentication: Signed URLs for protected content
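Signed URLs are commonly built by attaching an expiry time and an HMAC over the path, which the edge recomputes before serving. The sketch below shows the general idea only: the `expires` and `sig` parameter names, the message layout, and the shared `secret` are assumptions for illustration, and each CDN provider defines its own format.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit


def _signature(path: str, expires_at: int, secret: bytes) -> str:
    message = f"{path}?expires={expires_at}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(path: str, secret: bytes, expires_at: int) -> str:
    """Return the path with expiry and signature query parameters appended."""
    sig = _signature(path, expires_at, secret)
    return f"{path}?{urlencode({'expires': expires_at, 'sig': sig})}"


def verify_url(url: str, secret: bytes, now: int) -> bool:
    """Check that the URL is unexpired and its signature matches."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires_at = int(params["expires"][0])
        provided = params["sig"][0]
    except (KeyError, IndexError, ValueError):
        return False
    if now >= expires_at:
        return False
    expected = _signature(parts.path, expires_at, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), provided.encode("utf-8"))
```

A URL signed for one path fails verification if the path is altered, the secret differs, or the expiry time has passed.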
Operational considerations
Key metrics to monitor
- Cache hit ratio: Target > 90% for static assets
- Origin traffic: Should be minimal with good caching
- Latency (p50, p95, p99): Aim for edge latency < 50 ms on cache hits
- Error rates: 4xx/5xx from edge and origin
- Bandwidth: Egress costs and capacity
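As a rough illustration of how the first and third metrics might be computed from raw counters and latency samples, the helpers below use the nearest-rank percentile method. Both function names are hypothetical, and production monitoring systems often use histograms or sketches instead of sorting raw samples.

```python
import math
from typing import Sequence


def cache_hit_ratio(hits: int, misses: int) -> float:
    """Fraction of requests served from cache (0.0 when there is no traffic)."""
    total = hits + misses
    return hits / total if total else 0.0


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile of the samples, for 0 < p <= 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest-rank: smallest value with at least p% of samples at or below it.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]
```

Tail percentiles (p95, p99) tend to expose slow cache misses that the median hides.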
Cost optimization
- Maximize cache hit ratio (reduces origin egress)
- Use appropriate TTLs (longer = cheaper)
- Compress assets (reduces bandwidth)
- Consider reserved capacity for predictable traffic
Example: CDN for a web application
┌─────────────────────────────────────────────────────────────┐
│                          Internet                           │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                       CDN Edge Layer                        │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐         │
│  │ Edge US │  │ Edge EU │  │Edge Asia│  │ Edge SA │         │
│  └─────────┘  └─────────┘  └─────────┘  └─────────┘         │
│  Static assets, API cache, WAF, DDoS protection             │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼ (cache miss only)
┌─────────────────────────────────────────────────────────────┐
│                    Origin Infrastructure                    │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │ Web Servers  │  │ API Servers  │  │ Object Store │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└─────────────────────────────────────────────────────────────┘
Popular CDN providers
| Provider | Strengths |
|---|---|
| Cloudflare | DDoS, WAF, Workers (edge compute), free tier |
| AWS CloudFront | AWS integration, Lambda@Edge |
| Fastly | Real-time purging, VCL customization |
| Akamai | Enterprise scale, global reach |
| Google Cloud CDN | GCP integration, global load balancing |
Further reading
- High Performance Browser Networking — Ilya Grigorik
- CDN provider documentation (Cloudflare, Fastly, AWS CloudFront)
- HTTP Caching (MDN Web Docs)