Distributed Cache

Scale application performance with distributed caching. Learn about caching topologies, Cache-Aside, Write-Through, Write-Behind, and mitigating Cache Penetration, Avalanche, and Breakdown.

Distributed Caching

Caching is the process of storing frequently accessed data in a temporary, high-speed memory layer (typically RAM) to serve reads faster and reduce load on backend databases. While local caching stores data within a single server instance, a Distributed Cache provides a shared, centralized cache layer accessible by all application instances.


1. Caching Topologies: Local vs. Distributed

How you position your cache layer determines how easily you can scale:

  • Local Cache: Session or app state stored in the application's RAM (e.g., node-cache, Java Guava).
    • Downside: Inconsistent data. If App Server A updates a user record, Server B's local copy is not updated, so the two servers can return different values for the same key.
  • Distributed Cache: A dedicated cache cluster (e.g., Redis, Memcached) accessible via network calls.
    • Upside: A single shared view of cached data. Every application instance queries the same cache, which makes horizontal scaling simple, at the cost of a network round trip per lookup.
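The difference between the two topologies can be shown with a small in-memory model. This is only an illustrative sketch: `AppServer`, `sharedCache`, and the method names are hypothetical, and a plain `Map` stands in for a real Redis or Memcached cluster.

```typescript
// Hypothetical model: each AppServer owns a private local Map and holds a
// reference to one shared Map that stands in for a distributed cache.
class AppServer {
    private localCache = new Map<string, string>();
    private sharedCache: Map<string, string>;

    constructor(sharedCache: Map<string, string>) {
        this.sharedCache = sharedCache;
    }

    writeLocal(key: string, value: string): void { this.localCache.set(key, value); }
    readLocal(key: string): string | undefined { return this.localCache.get(key); }

    writeShared(key: string, value: string): void { this.sharedCache.set(key, value); }
    readShared(key: string): string | undefined { return this.sharedCache.get(key); }
}

const sharedCache = new Map<string, string>(); // stand-in for a Redis or Memcached cluster
const serverA = new AppServer(sharedCache);
const serverB = new AppServer(sharedCache);

serverA.writeLocal("user:1", "Ada");
serverA.writeShared("user:1", "Ada");

// Server B never sees A's local write, but it does see the shared one.
const seenLocallyByB = serverB.readLocal("user:1");    // undefined
const seenViaSharedByB = serverB.readShared("user:1"); // "Ada"
```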

2. Caching Strategies (Data Flow Patterns)

How data is synchronized between the client, cache, and database depends on your read-to-write ratio and consistency requirements:

1. Cache-Aside (Lazy Loading)

The application handles all data operations. It queries the cache first. If a cache miss occurs, the application fetches data from the database, writes it to the cache, and returns it.

  • Best For: Read-heavy workloads.
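  • Pros: Only requested data is cached; a cache failure does not take the application down, because reads can fall back to the database.
  • Cons: Cache miss penalty (a miss costs extra round trips: the cache lookup, the database read, and the cache write); data can become stale if database updates bypass the cache.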
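A minimal sketch of the Cache-Aside flow described above. The `User` type, the two `Map` stand-ins, and the function names are assumptions made for illustration; in a real service they would be replaced by a cache client and a database driver. The write path shown here (update the database, then invalidate the cached entry) is one common way to limit staleness, not the only one.

```typescript
// Hypothetical stand-ins: one Map plays the cache, another plays the database.
interface User { id: string; name: string }

const cache = new Map<string, string>(); // stand-in for Redis/Memcached
const database = new Map<string, User>([["42", { id: "42", name: "Ada" }]]);
let dbReads = 0; // counts simulated database round trips

async function loadUserFromDb(id: string): Promise<User | undefined> {
    dbReads += 1;
    return database.get(id);
}

// Read path: check the cache, fall back to the database, then populate the cache.
async function getUser(id: string): Promise<User | undefined> {
    const key = `user:${id}`;
    const hit = cache.get(key);
    if (hit !== undefined) return JSON.parse(hit) as User;

    const user = await loadUserFromDb(id);
    if (user !== undefined) cache.set(key, JSON.stringify(user));
    return user;
}

// Write path: update the database, then invalidate the cached copy so the
// next read reloads fresh data.
async function updateUser(user: User): Promise<void> {
    database.set(user.id, user);
    cache.delete(`user:${user.id}`);
}
```

Calling `getUser("42")` twice reaches the simulated database only once; after `updateUser`, the next read misses and reloads the new value.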

2. Write-Through

The application writes data to the cache and the database in the same synchronous operation (not a true atomic transaction, since the two stores are separate systems). The write is only considered successful when both operations complete.

  • Best For: Applications requiring strong consistency between cache and database.
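  • Pros: Cached data stays in sync with the database, as long as every write goes through this path.
  • Cons: Higher write latency, since every write waits on two stores; data that is written once and rarely read still takes up cache space.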
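A minimal Write-Through sketch under stated assumptions: `wtDatabase`, `wtCache`, `saveToDatabase`, and the `failNextDbWrite` switch are hypothetical in-memory stand-ins. Writing to the database first is a design choice made here so that a failed database write never leaves an unpersisted value in the cache; other implementations let the cache layer forward the write.

```typescript
// Hypothetical in-memory stand-ins; a real setup would call a database driver
// and a cache client instead.
const wtDatabase = new Map<string, string>();
const wtCache = new Map<string, string>();
let failNextDbWrite = false; // switch used to simulate a database outage

async function saveToDatabase(key: string, value: string): Promise<void> {
    if (failNextDbWrite) {
        failNextDbWrite = false;
        throw new Error("database unavailable");
    }
    wtDatabase.set(key, value);
}

// Write-through: report success only after both stores hold the value.
async function writeThrough(key: string, value: string): Promise<void> {
    await saveToDatabase(key, value); // a failure here propagates and leaves the cache untouched
    wtCache.set(key, value);
}
```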

3. Write-Behind (Write-Back)

The application writes data directly to the cache. The cache layer immediately acknowledges the write. A background process asynchronously pushes these updates to the database.

  • Best For: Write-heavy workloads (e.g., streaming logs, real-time counters).
  • Pros: Low write latency; buffers database writes.
  • Cons: Risk of data loss if the cache server crashes before the background write completes.
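The Write-Behind flow can be sketched as a buffer of "dirty" keys that a background job flushes in batches. Everything here is illustrative: `WriteBehindCounter`, `increment`, `flush`, and the `persist` callback are hypothetical names, and the cache is a plain in-process `Map`. In production `flush` would typically run on a timer or queue consumer.

```typescript
type CounterBatch = Array<[string, number]>;

class WriteBehindCounter {
    private values = new Map<string, number>(); // the cache contents
    private dirty = new Set<string>();          // keys not yet persisted
    private persist: (batch: CounterBatch) => Promise<void>;

    constructor(persist: (batch: CounterBatch) => Promise<void>) {
        this.persist = persist;
    }

    // Acknowledged immediately: only the in-memory cache is touched.
    increment(key: string, by: number = 1): number {
        const next = (this.values.get(key) ?? 0) + by;
        this.values.set(key, next);
        this.dirty.add(key);
        return next;
    }

    // Background step: push every dirty key to the database in one batch.
    async flush(): Promise<number> {
        const keys = Array.from(this.dirty);
        if (keys.length === 0) return 0;
        const batch: CounterBatch = keys.map((k): [string, number] => [k, this.values.get(k) ?? 0]);
        this.dirty.clear();
        try {
            await this.persist(batch);
        } catch (err) {
            keys.forEach((k) => this.dirty.add(k)); // keep the keys dirty so the next flush retries
            throw err;
        }
        return batch.length;
    }
}
```

A thousand `increment` calls on the same key result in a single database write at flush time, which is the buffering benefit listed above. Anything still marked dirty when the process dies is lost, which is the data-loss risk.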

3. High-Scale Caching Disasters & Mitigations

Under heavy load, simple caching setups can fail in ways that cascade and take down the backend databases.

1. Cache Penetration

An attacker queries keys that do not exist in either the cache or the database (e.g., GET user:invalid-id-9999). Every query results in a cache miss, forwarding all requests to the database, eventually overloading it.

  • Mitigation 1 (Cache Nulls): If a key does not exist in the database, write a placeholder value (e.g., null or "{}") with a short TTL (e.g., 5 minutes) to the cache.
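  • Mitigation 2 (Bloom Filters): Position an in-memory Bloom Filter before the cache. The filter quickly rejects requests for keys that are definitely not in the dataset. It can produce occasional false positives, which simply fall through to the cache and database, but never false negatives.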
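Both mitigations can be combined in one lookup path. The sketch below is illustrative only: `BloomFilter`, `NULL_MARKER`, `lookup`, and the `Map` stand-ins are hypothetical, the hash is a seeded FNV-1a variant chosen for brevity, and the lookup is synchronous to keep the example short (real cache and database calls would be asynchronous). The `Map` has no expiry, so the short TTL on the placeholder is only noted in a comment.

```typescript
class BloomFilter {
    private bits: Uint8Array;
    private size: number;
    private hashCount: number;

    constructor(size: number, hashCount: number) {
        this.size = size;
        this.hashCount = hashCount;
        this.bits = new Uint8Array(Math.ceil(size / 8));
    }

    // Seeded 32-bit FNV-1a style hash, reduced to a bit position.
    private position(value: string, seed: number): number {
        let h = (2166136261 ^ seed) >>> 0;
        for (let i = 0; i < value.length; i++) {
            h ^= value.charCodeAt(i);
            h = Math.imul(h, 16777619) >>> 0;
        }
        return h % this.size;
    }

    add(value: string): void {
        for (let seed = 0; seed < this.hashCount; seed++) {
            const bit = this.position(value, seed);
            this.bits[bit >> 3] |= 1 << (bit & 7);
        }
    }

    // false means "definitely absent"; true means "possibly present".
    mightContain(value: string): boolean {
        for (let seed = 0; seed < this.hashCount; seed++) {
            const bit = this.position(value, seed);
            if ((this.bits[bit >> 3] & (1 << (bit & 7))) === 0) return false;
        }
        return true;
    }
}

const NULL_MARKER = "__null__";                  // placeholder cached for missing rows
const penCache = new Map<string, string>();      // stand-in cache (no TTL support)
const penDatabase = new Map<string, string>([["user:1", "Ada"]]);
const knownKeys = new BloomFilter(1024, 3);
knownKeys.add("user:1");
knownKeys.add("user:2"); // simulates a deleted row: still in the filter, gone from the database
let penDbReads = 0;

function lookup(key: string): string | null {
    if (!knownKeys.mightContain(key)) return null; // rejected before touching cache or database
    const cached = penCache.get(key);
    if (cached !== undefined) return cached === NULL_MARKER ? null : cached;

    penDbReads += 1;
    const row = penDatabase.get(key) ?? null;
    penCache.set(key, row ?? NULL_MARKER); // a real cache would give the placeholder a short TTL
    return row;
}
```

Repeated lookups of `user:2` reach the simulated database once; later calls are answered by the cached placeholder.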

2. Cache Avalanche

A large number of cached keys expire at the same moment, or the cache cluster itself crashes. All subsequent queries then hit the database simultaneously and can overwhelm it.

  • Mitigation 1 (Randomized TTL): Add random jitter to the TTL of every key when writing (e.g., instead of exactly 1 hour, set it to $1\text{ hour} \pm \text{random}(1\text{ to }10\text{ minutes})$).
  • Mitigation 2 (High-Availability Cache): Run Redis in Sentinel or Cluster mode with primary-replica replication to survive node failures.
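TTL jitter is a one-line calculation. The helper below is a sketch: `ttlWithJitter` and its parameters are hypothetical names, and the one-hour base with up to ten minutes of jitter is an assumed example configuration. The `random` parameter exists only so the function can be exercised deterministically.

```typescript
// Returns a TTL in seconds, spread uniformly around baseSeconds by at most
// maxJitterSeconds in either direction, and never less than 1 second.
function ttlWithJitter(
    baseSeconds: number,
    maxJitterSeconds: number,
    random: () => number = Math.random,
): number {
    const offset = Math.round((random() * 2 - 1) * maxJitterSeconds);
    return Math.max(1, baseSeconds + offset);
}

// Assumed example: one hour, plus or minus up to ten minutes.
const exampleTtl = ttlWithJitter(3600, 600);
```

The result would then be passed as the expiry when writing the key, so keys written in the same burst expire at different times instead of all at once.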

3. Cache Breakdown (Cache Stampede)

A highly popular "hot key" (e.g., a breaking news article) expires. Because the key is queried thousands of times per second, the cache miss triggers duplicate queries to the database before the first query can write the result back to the cache.

  • Mitigation (Mutex Locks): When a cache miss occurs, the first thread acquires a distributed mutex lock (e.g., Redis SET with the NX option and an expiry) to query the database. Other threads wait for the lock to release or read the cached value once populated, preventing duplicate database hits.
Code
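// Assumes the ioredis client; the retry limit and lock token are illustrative additions.
import Redis from "ioredis";
import { randomUUID } from "node:crypto";

const redis = new Redis();
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));
// Deletes the lock only if it still holds our token, so we never release someone else's lock.
const RELEASE_LOCK = `if redis.call("get", KEYS[1]) == ARGV[1] then return redis.call("del", KEYS[1]) else return 0 end`;

async function getHotKeyWithLock<T>(key: string, fetchDbFn: () => Promise<T>, ttl: number, retries = 50): Promise<T> {
    const cached = await redis.get(key);
    if (cached !== null) return JSON.parse(cached) as T;

    const lockKey = `lock:${key}`;
    const token = randomUUID();
    // SET ... EX 10 NX: succeeds only if no lock exists; the lock auto-expires after 10 seconds.
    const acquiredLock = await redis.set(lockKey, token, "EX", 10, "NX");

    if (acquiredLock === "OK") {
        try {
            // Only one process queries the database
            const data = await fetchDbFn();
            await redis.setex(key, ttl, JSON.stringify(data));
            return data;
        } finally {
            // Release the lock even if the database query or cache write throws
            await redis.eval(RELEASE_LOCK, 1, lockKey, token);
        }
    }
    if (retries <= 0) throw new Error(`Timed out waiting for cache key ${key}`);
    // Wait and retry
    await sleep(100);
    return getHotKeyWithLock(key, fetchDbFn, ttl, retries - 1);
}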

4. Redis vs. Memcached

When selecting a distributed caching engine, these are the core differences to consider:

| Feature | Redis | Memcached |
| --- | --- | --- |
| Data Structures | Rich (Strings, Lists, Hashes, Sets, Sorted Sets, HyperLogLogs, Geospatial). | Simple (Strings only: key/value blobs). |
| Threading Model | Commands execute on a single-threaded event loop (Redis 6 and later can optionally use extra threads for network I/O). | Multi-threaded (scales easily with more CPU cores). |
| Persistence | Supported (RDB snapshots, AOF append logs). Can act as a database. | No persistence (in-memory only). |
| Eviction Algorithms | LRU, LFU, Random, TTL-based. | LRU only. |
| Replication | Built-in (primary-replica replication, plus sharding in Cluster mode). | None built in; clients distribute keys across nodes with client-side routing logic. |