
Redis vs Memcached vs Local Cache: Choosing the Right Caching Solution

Tags: redis · caching · backend · performance · infrastructure

Introduction

Your application is slow. Database queries take 50ms each. You're making the same queries thousands of times per second. The answer is obvious: caching.

But which caching solution should you use?

The three most common options are:

Solution      Type                          Example Tools
──────────────────────────────────────────────────────────────────────
Local Cache   In-process memory             Guava Cache, Caffeine, lru-cache, Map
Redis         Distributed in-memory store   Redis, Valkey, KeyDB
Memcached     Distributed memory cache      Memcached

Each has strengths. Each has trade-offs. And the best systems often use more than one. This guide will help you understand when to use each — and when to combine them.

What You'll Learn

✅ How local cache, Redis, and Memcached work under the hood
✅ Performance characteristics and latency comparisons
✅ When to choose each solution (with real-world scenarios)
✅ Multi-tier caching architecture patterns
✅ Practical implementation examples in Node.js, Python, and Java
✅ Cache invalidation strategies for each layer
✅ Common pitfalls and how to avoid them

Prerequisites

  • Basic familiarity with a backend language (the examples use Node.js/TypeScript, Python, and Java)
  • A general sense of how your application talks to its database

Part 1: Understanding the Three Options

Local Cache (In-Process)

Local cache lives inside your application process. It stores data in the application's own memory (heap), making it the fastest possible cache — no network calls, no serialization.

How it works:

// Node.js — simple in-memory cache with TTL
const cache = new Map<string, { value: unknown; expiresAt: number }>();
 
function localGet(key: string): unknown | null {
  const entry = cache.get(key);
  if (!entry) return null;
  if (Date.now() > entry.expiresAt) {
    cache.delete(key);
    return null;
  }
  return entry.value;
}
 
function localSet(key: string, value: unknown, ttlMs: number): void {
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
}
 
// Usage
localSet("user:123", { name: "Alice", role: "admin" }, 60_000); // 60s TTL
const user = localGet("user:123"); // ~microseconds, no network hop

Production-grade alternatives:

// Java — Caffeine (successor to Guava Cache)
LoadingCache<String, User> cache = Caffeine.newBuilder()
    .maximumSize(10_000)
    .expireAfterWrite(Duration.ofMinutes(5))
    .recordStats() // Enable hit/miss metrics
    .build(key -> userRepository.findById(key));
 
User user = cache.get("user:123"); // Auto-loads from DB on miss

# Python — cachetools
from cachetools import TTLCache
 
cache = TTLCache(maxsize=10_000, ttl=300)  # 300s TTL
 
def get_user(user_id: str) -> dict:
    if user_id in cache:
        return cache[user_id]
    user = db.find_user(user_id)
    cache[user_id] = user
    return user

Strengths:

  • Fastest possible: No network latency, no serialization (~1μs vs ~1ms for Redis)
  • Simplest to set up: No external infrastructure required
  • Zero operational overhead: No servers to manage, no connection pools

Weaknesses:

  • Not shared: Each app instance has its own cache — duplicated data, inconsistent state
  • Memory-limited: Competes with your application for heap space
  • Lost on restart: Cache is gone when the process dies
  • No built-in eviction policies (unless using a library like Caffeine)
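
That last weakness is worth acting on immediately: a bare Map never evicts anything. Here's a minimal sketch of a bounded local cache using the lru-cache package (the same library used in the multi-tier examples in Part 4); the limits are illustrative:

// Node.js — bounded local cache with lru-cache
import { LRUCache } from "lru-cache";

// Once `max` entries exist, the least-recently-used entry is evicted;
// entries also expire after `ttl` milliseconds
const boundedCache = new LRUCache<string, unknown>({
  max: 1_000,
  ttl: 60_000,
});

boundedCache.set("user:123", { name: "Alice" });
const hit = boundedCache.get("user:123"); // undefined once evicted or expired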

Redis

Redis is a distributed, in-memory data structure store. It runs as a separate process (usually on a dedicated server) and all app instances connect to it over the network.

How it works:

// Node.js — ioredis
import Redis from "ioredis";
 
const redis = new Redis({ host: "redis.internal", port: 6379 });
 
// Simple string cache
await redis.set("user:123", JSON.stringify(user), "EX", 300); // 5 min TTL
const cached = await redis.get("user:123");
const user = cached ? JSON.parse(cached) : null;
 
// Hash — store structured data without serialization overhead
await redis.hset("user:123", { name: "Alice", role: "admin", loginCount: "42" });
const name = await redis.hget("user:123", "name"); // Get single field

# Python — redis-py
import redis
import json
 
r = redis.Redis(host="redis.internal", port=6379, decode_responses=True)
 
# Cache with TTL
r.setex("user:123", 300, json.dumps({"name": "Alice", "role": "admin"}))
user = json.loads(r.get("user:123"))
 
# Atomic counter — no race conditions
r.incr("api:rate:192.168.1.1")  # Thread-safe increment
r.expire("api:rate:192.168.1.1", 60)  # Reset every 60s

Strengths:

  • Rich data structures: Strings, Hashes, Lists, Sets, Sorted Sets, Streams, HyperLogLog
  • Shared across instances: All app servers see the same cache
  • Persistence options: RDB snapshots and AOF logging survive restarts
  • Atomic operations: INCR, LPUSH, SADD — no race conditions
  • Pub/Sub: Built-in messaging for cache invalidation events
  • Lua scripting: Complex operations in a single round trip
  • Replication & Clustering: Built-in high availability

Weaknesses:

  • Network latency: ~1ms round trip vs ~1μs for local cache
  • Serialization cost: Must convert objects to strings/bytes and back
  • Operational complexity: Another service to deploy, monitor, and maintain
  • Single-threaded (mostly): One CPU core handles all commands (I/O threads added in Redis 6+)
  • Memory cost: RAM is expensive at scale

Memcached

Memcached is a distributed memory caching system designed for one thing: caching. It's simpler than Redis by design — a pure key-value store with no data structures, no persistence, no scripting.

// Node.js — memjs
import memjs from "memjs";
 
const mc = memjs.Client.create("memcached1:11211,memcached2:11211");
 
await mc.set("user:123", JSON.stringify(user), { expires: 300 });
const { value } = await mc.get("user:123");
const user = value ? JSON.parse(value.toString()) : null;

# Python — pymemcache
from pymemcache.client.hash import HashClient
 
client = HashClient([
    ("memcached1", 11211),
    ("memcached2", 11211),
])
 
client.set("user:123", json.dumps(user), expire=300)
cached = client.get("user:123")
user = json.loads(cached) if cached else None

Strengths:

  • Multi-threaded: Uses all CPU cores (unlike Redis's single-threaded model)
  • Predictable memory usage: Slab allocator prevents fragmentation
  • Simple and battle-tested: Fewer moving parts = fewer surprises
  • Horizontal scaling: Consistent hashing across multiple nodes (sketched at the end of this section)
  • Lower memory overhead per key: No data structure metadata

Weaknesses:

  • Only strings: No hashes, lists, sets, or sorted sets
  • No persistence: Data is gone when the process restarts
  • No replication: No built-in master-replica failover
  • No Pub/Sub: No event notifications for cache changes
  • 1MB value limit: Large objects must be chunked manually
  • No Lua scripting: No complex atomic operations
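
The horizontal-scaling strength deserves a closer look: memcached servers are unaware of each other, and the client alone decides which node owns a key. Below is a deliberately simplified sketch of client-side key distribution. Real clients (pymemcache's HashClient, most Node clients) use a consistent hash ring with virtual nodes so that adding or removing a server remaps only a small share of keys; the plain modulo shown here would remap most of them.

// Node.js — illustrative client-side sharding (simplified; not a real hash ring)
import { createHash } from "node:crypto";

const nodes = ["memcached1:11211", "memcached2:11211", "memcached3:11211"];

function nodeForKey(key: string): string {
  // Hash the key, then map the first 4 bytes onto the node list
  const digest = createHash("md5").update(key).digest();
  return nodes[digest.readUInt32BE(0) % nodes.length];
}

nodeForKey("user:123"); // e.g. "memcached2:11211", stable for a fixed node list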

Part 2: Head-to-Head Comparison

Feature Comparison

Feature                   Local Cache               Redis                                          Memcached
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
Latency                   ~1μs                      ~0.5-1ms                                       ~0.5-1ms
Data Structures           Language-native           Rich (Strings, Hash, List, Set, ZSet, Stream)  Strings only
Shared Across Instances   ❌ No                     ✅ Yes                                         ✅ Yes
Persistence               ❌ No                     ✅ RDB + AOF                                   ❌ No
Max Value Size            Heap limit                512MB                                          1MB (default)
Threading                 App threads               Single-threaded*                               Multi-threaded
Replication               ❌ No                     ✅ Built-in                                    ❌ No
Pub/Sub                   ❌ No                     ✅ Built-in                                    ❌ No
Eviction Policies         Library-dependent         8 policies (LRU, LFU, etc.)                    LRU only
Cluster Mode              ❌ No                     ✅ Redis Cluster                               Client-side hashing
Scripting                 ❌ No                     ✅ Lua / Functions                             ❌ No
Memory Efficiency         Best (no serialization)   Good                                           Better than Redis**

* Redis 6+ has I/O threads for network handling, but command execution is still single-threaded.
** Memcached's slab allocator is more memory-efficient for simple key-value pairs.
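
A note on the eviction-policy row: Redis's policy is a server-level setting (maxmemory-policy). Here's a hedged sketch of changing it at runtime with ioredis; in production you would normally set these directives in redis.conf instead, and CONFIG may be disabled on managed services:

// Node.js — switching the Redis eviction policy at runtime
import Redis from "ioredis";

const redis = new Redis("redis://redis.internal:6379");

// Cap memory at 2GB and evict least-frequently-used keys when full
await redis.config("SET", "maxmemory", "2gb");
await redis.config("SET", "maxmemory-policy", "allkeys-lfu");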

Performance Benchmarks

Real-world latency comparison (same hardware, same network):

Operation           Local Cache    Redis         Memcached
─────────────────────────────────────────────────────────
Simple GET          0.001ms        0.3-0.8ms     0.3-0.7ms
Simple SET          0.001ms        0.3-0.8ms     0.3-0.7ms
Batch GET (100)     0.01ms         1-3ms         1-2ms
1KB value GET       0.001ms        0.4-1ms       0.3-0.8ms
100KB value GET     0.001ms        1-3ms         0.8-2ms
Counter INCREMENT   0.001ms        0.3-0.8ms     0.3-0.7ms

Key takeaway: Local cache is 100-1000x faster than any network-based solution. Redis and Memcached are similar for simple operations, but Memcached has a slight edge for pure key-value workloads due to its multi-threaded architecture.

Throughput Comparison

On a single server with 32 cores:

                    Redis           Memcached
────────────────────────────────────────────
GET ops/sec         100,000-200,000   200,000-600,000
SET ops/sec         100,000-200,000   200,000-500,000
Concurrent clients  10,000+           50,000+

Memcached's multi-threaded design gives it a 2-3x throughput advantage for simple get/set operations. However, Redis makes up for this with pipelining, Lua scripts, and data structures that reduce round trips.
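
To make the pipelining point concrete, here's a minimal ioredis sketch; the win is one network round trip for the whole batch instead of one per command:

// Node.js — pipelining with ioredis
import Redis from "ioredis";

const redis = new Redis("redis://redis.internal:6379");

// ❌ 100 sequential round trips: roughly 100 × 0.5ms of pure network wait
for (let i = 0; i < 100; i++) {
  await redis.get(`user:${i}`);
}

// ✅ One round trip for all 100 commands
const pipeline = redis.pipeline();
for (let i = 0; i < 100; i++) {
  pipeline.get(`user:${i}`);
}
const results = await pipeline.exec(); // Array of [error, value] pairs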


Part 3: When to Use What

Use Local Cache When...

1. You need the absolute fastest reads:

// Hot configuration that every request needs
const configCache = new Map<string, AppConfig>();
 
function getConfig(): AppConfig {
  const cached = configCache.get("app-config");
  if (cached) return cached; // ~1μs — no network hop
 
  const config = loadFromDatabase();
  configCache.set("app-config", config);
  return config;
}

2. Data is read-heavy and rarely changes:

  • Feature flags
  • Application configuration
  • Static reference data (country codes, currency rates updated daily)
  • Compiled templates or regex patterns

3. Inconsistency between instances is acceptable:

  • Each server can have a slightly different cache state for a short period
  • Example: blog post view counts don't need to be perfectly synchronized

4. You want zero infrastructure dependencies:

  • Local development without Docker
  • Serverless functions (Lambda, Cloud Functions) where external connections are expensive
  • Edge computing with limited network access
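
The serverless case works because module-level state survives across warm invocations of the same container. A hedged sketch for an AWS Lambda-style handler follows; the event shape and the fetchFromSource loader are illustrative assumptions:

// Node.js — per-container cache in a serverless function
// Module scope: lives exactly as long as the warm container does
const warmCache = new Map<string, { value: unknown; expiresAt: number }>();

// Hypothetical loader standing in for a database or API call
declare function fetchFromSource(key: string): Promise<unknown>;

export async function handler(event: { key: string }) {
  const entry = warmCache.get(event.key);
  if (entry && Date.now() < entry.expiresAt) return entry.value; // Warm hit

  const value = await fetchFromSource(event.key);
  warmCache.set(event.key, { value, expiresAt: Date.now() + 60_000 }); // 60s TTL
  return value;
}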

Use Redis When...

1. Multiple app instances need shared state:

// Session management — user can hit any server
await redis.hset(`session:${sessionId}`, {
  userId: "123",
  role: "admin",
  lastAccess: Date.now().toString(),
});
await redis.expire(`session:${sessionId}`, 3600); // 1 hour
 
// Any server can read this session
const session = await redis.hgetall(`session:${sessionId}`);

2. You need data structures beyond simple key-value:

# Leaderboard with sorted sets
r.zadd("leaderboard:weekly", {"alice": 2500, "bob": 1800, "charlie": 3200})
top_10 = r.zrevrange("leaderboard:weekly", 0, 9, withscores=True)
 
# Rate limiting with atomic operations
pipe = r.pipeline()
pipe.incr(f"rate:{user_id}")
pipe.expire(f"rate:{user_id}", 60)
current, _ = pipe.execute()
if current > 100:
    raise RateLimitExceeded()
 
# Job queue with lists
r.lpush("jobs:email", json.dumps({"to": "alice@example.com", "template": "welcome"}))
job = r.brpop("jobs:email", timeout=30)  # Blocking pop — worker waits for jobs

3. You need persistence and high availability:

  • Cache warmup after restart (RDB/AOF recovery)
  • Master-replica failover with Redis Sentinel (connection sketch below)
  • Cross-datacenter replication
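
For the failover point, ioredis can discover the current master through Sentinel and reconnect transparently after a failover. The host names and master group name below are assumptions:

// Node.js — connecting through Redis Sentinel with ioredis
import Redis from "ioredis";

const redis = new Redis({
  sentinels: [
    { host: "sentinel1.internal", port: 26379 },
    { host: "sentinel2.internal", port: 26379 },
  ],
  name: "mymaster", // Sentinel master group to follow
});

// Reads and writes go to whichever node Sentinel reports as master
await redis.set("health:check", "ok", "EX", 10);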

4. You need Pub/Sub for cache invalidation:

// Publisher — when data changes
await redis.publish("cache:invalidate", JSON.stringify({
  key: "user:123",
  action: "update",
}));
 
// Subscriber — all app instances listen
const sub = redis.duplicate();
sub.subscribe("cache:invalidate");
sub.on("message", (channel, message) => {
  const { key } = JSON.parse(message);
  localCache.delete(key); // Invalidate local cache
});

Use Memcached When...

1. You only need simple key-value caching at massive scale:

# Facebook's use case: billions of simple key-value lookups
# Profile data, feed items, social graph edges
client.set(f"profile:{user_id}", serialize(profile), expire=3600)
profile = deserialize(client.get(f"profile:{user_id}"))

2. You need maximum throughput on multi-core servers:

  • Memcached's multi-threaded architecture utilizes all cores
  • Ideal when you have 32+ core machines dedicated to caching

3. You want the simplest possible caching layer:

  • No clustering to configure
  • No persistence to manage
  • No replication to monitor
  • Just raw speed for simple get/set

4. Memory efficiency is critical for large datasets:

  • Memcached's slab allocator is optimized for uniform-sized objects
  • Less memory overhead per key than Redis

Decision Flowchart

Need shared state across app instances?
├─ No  → Local Cache (add Redis later if you scale out)
└─ Yes → Need data structures, persistence, or Pub/Sub?
         ├─ Yes → Redis
         └─ No  → Pure key-value at extreme throughput?
                  ├─ Yes → Memcached
                  └─ No  → Redis (one less system to operate)

Part 4: Multi-Tier Caching Architecture

The best systems don't choose one — they combine all three in a layered architecture.

The Cache Hierarchy

The typical hierarchy: users hit a CDN (static cache), then a load balancer distributes requests to app servers (each with a local cache), which share a Redis cache layer, backed by a database with read replicas.

Implementation: Two-Tier Cache

import Redis from "ioredis";
import { LRUCache } from "lru-cache";
 
// L1: Local cache — small, fast, per-instance
const localCache = new LRUCache<string, unknown>({
  max: 5_000,           // Max 5000 entries
  ttl: 30_000,          // 30 second TTL (short — reduces stale data)
});
 
// L2: Redis — shared across all instances
const redis = new Redis("redis://redis.internal:6379");
 
async function tieredGet<T>(key: string): Promise<T | null> {
  // L1: Check local cache first (~1μs)
  const local = localCache.get(key) as T | undefined;
  if (local !== undefined) {
    return local;
  }
 
  // L2: Check Redis (~1ms)
  const remote = await redis.get(key);
  if (remote) {
    const parsed = JSON.parse(remote) as T;
    localCache.set(key, parsed); // Promote to L1
    return parsed;
  }
 
  // L3: Cache miss — caller should fetch from database
  return null;
}
 
async function tieredSet<T>(key: string, value: T, ttlSec: number): Promise<void> {
  // Write to both layers
  localCache.set(key, value);
  await redis.set(key, JSON.stringify(value), "EX", ttlSec);
}
 
async function tieredInvalidate(key: string): Promise<void> {
  localCache.delete(key);
  await redis.del(key);
  // Notify other instances to clear their local cache
  await redis.publish("cache:invalidate", key);
}

Implementation: Full Cache-Aside Pattern

import json
import redis
from cachetools import TTLCache
 
# L1: Local cache
local_cache = TTLCache(maxsize=5000, ttl=30)
 
# L2: Redis
redis_client = redis.Redis(host="redis.internal", decode_responses=True)
 
def get_user(user_id: str) -> dict | None:
    cache_key = f"user:{user_id}"
 
    # L1: Local cache (~1μs)
    if cache_key in local_cache:
        return local_cache[cache_key]
 
    # L2: Redis (~1ms)
    cached = redis_client.get(cache_key)
    if cached:
        user = json.loads(cached)
        local_cache[cache_key] = user  # Promote to L1
        return user
 
    # L3: Database (~10ms)
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    if user:
        redis_client.setex(cache_key, 300, json.dumps(user))  # 5 min in Redis
        local_cache[cache_key] = user  # 30s in local
    return user
 
def update_user(user_id: str, data: dict) -> None:
    db.execute("UPDATE users SET ... WHERE id = %s", user_id)
    cache_key = f"user:{user_id}"
 
    # Invalidate both layers
    local_cache.pop(cache_key, None)
    redis_client.delete(cache_key)
 
    # Notify other instances
    redis_client.publish("cache:invalidate", cache_key)

Cache Invalidation Across Instances

The biggest challenge with multi-tier caching is keeping local caches in sync. When data changes on one server, other servers' local caches become stale.

// Listen for invalidation events on every app instance
const subscriber = redis.duplicate();
subscriber.subscribe("cache:invalidate");
subscriber.on("message", (channel: string, key: string) => {
  localCache.delete(key);
  console.log(`Invalidated local cache: ${key}`);
});

TTL as a safety net: Even without explicit invalidation, short TTLs on local cache (15-60 seconds) ensure stale data doesn't persist long.


Part 5: Common Pitfalls

1. Cache Stampede (Thundering Herd)

When a popular cache key expires, hundreds of concurrent requests all miss the cache and hit the database simultaneously.

Solutions:

// Solution 1: Mutex lock — only one request fetches from DB
const locks = new Map<string, Promise<unknown>>();
 
async function getWithLock<T>(key: string, fetchFn: () => Promise<T>): Promise<T> {
  // Check cache
  const cached = await tieredGet<T>(key);
  if (cached !== null) return cached;

  // Check if another request is already fetching
  const existing = locks.get(key);
  if (existing) return existing as Promise<T>;

  // This request wins — fetch from DB
  const promise = fetchFn()
    .then(async (value) => {
      await tieredSet(key, value, 300);
      return value;
    })
    .finally(() => locks.delete(key)); // Release the lock even if fetchFn throws

  locks.set(key, promise);
  return promise;
}

# Solution 2: Early expiration — refresh before the TTL runs out
import json
import threading

def _refresh_cache(key: str, fetch_fn, ttl: int) -> None:
    """Re-fetch in the background and overwrite the cached value."""
    redis_client.setex(key, ttl, json.dumps(fetch_fn()))

def get_with_early_refresh(key: str, fetch_fn, ttl: int = 300):
    cached = redis_client.get(key)
    if cached:
        data = json.loads(cached)
        remaining_ttl = redis_client.ttl(key)

        # Kick off a background refresh if less than 20% of the TTL remains
        if remaining_ttl < ttl * 0.2:
            threading.Thread(target=_refresh_cache, args=(key, fetch_fn, ttl)).start()

        return data

    # Cache miss: fetch synchronously and cache the result
    value = fetch_fn()
    redis_client.setex(key, ttl, json.dumps(value))
    return value

2. Cache Penetration

Requests for data that doesn't exist bypass the cache every time and hit the database.

// ❌ Bad: Every request for non-existent user hits DB
async function getUser(id: string) {
  const cached = await redis.get(`user:${id}`);
  if (cached) return JSON.parse(cached);
  const user = await db.findUser(id); // Returns null — nothing cached
  return user;
}
 
// ✅ Good: Cache null results too
async function getUser(id: string) {
  const cached = await redis.get(`user:${id}`);
  if (cached === "NULL") return null; // Cached negative result
  if (cached) return JSON.parse(cached);
 
  const user = await db.findUser(id);
  if (user) {
    await redis.set(`user:${id}`, JSON.stringify(user), "EX", 300);
  } else {
    await redis.set(`user:${id}`, "NULL", "EX", 60); // Short TTL for negatives
  }
  return user;
}

3. Hot Key Problem

A single key receiving massive traffic (celebrity profile, viral post) can overwhelm one Redis node.

// Solution: Replicate hot keys across multiple local caches + random suffix
async function getHotKey<T>(baseKey: string, fetchFn: () => Promise<T>): Promise<T> {
  // L1: Always check local cache first — distributes load
  const local = localCache.get(baseKey);
  if (local) return local as T;
 
  // L2: Spread reads across several copies of the key (hence different cluster shards) via a random suffix
  const shardKey = `${baseKey}:shard:${Math.floor(Math.random() * 3)}`;
  const cached = await redis.get(shardKey);
  if (cached) {
    const value = JSON.parse(cached) as T;
    localCache.set(baseKey, value); // Absorb future hits locally
    return value;
  }
 
  const value = await fetchFn();
  // Write to all shards
  await Promise.all([
    redis.set(`${baseKey}:shard:0`, JSON.stringify(value), "EX", 300),
    redis.set(`${baseKey}:shard:1`, JSON.stringify(value), "EX", 300),
    redis.set(`${baseKey}:shard:2`, JSON.stringify(value), "EX", 300),
  ]);
  localCache.set(baseKey, value);
  return value;
}

4. Serialization Overhead

Converting objects to JSON and back adds CPU cost that's easy to underestimate.

// ❌ Slow: JSON.parse on every cache hit
const user = JSON.parse(await redis.get("user:123")); // ~0.5ms for large objects
 
// ✅ Better: Use Redis Hashes to avoid full serialization
await redis.hset("user:123", { name: "Alice", role: "admin", age: "30" });
const name = await redis.hget("user:123", "name"); // No parsing needed
 
// ✅ Better: Use MessagePack for binary serialization (2-3x faster than JSON)
import { encode, decode } from "@msgpack/msgpack";
await redis.setBuffer("user:123", Buffer.from(encode(user)));
const cached = await redis.getBuffer("user:123");
const user = decode(cached);

5. Memory Pressure

Local caches that grow unchecked steal memory from your application.

// ❌ Dangerous: unbounded cache
Map<String, Object> cache = new HashMap<>(); // Grows forever → OutOfMemoryError
 
// ✅ Safe: bounded with eviction
LoadingCache<String, Object> cache = Caffeine.newBuilder()
    .maximumWeight(50_000_000)    // ~50MB weight cap (mutually exclusive with maximumSize)
    .weigher((String key, Object value) -> estimateSize(value))
    .expireAfterWrite(Duration.ofMinutes(5))
    .removalListener((key, value, cause) ->
        log.info("Evicted {} due to {}", key, cause))
    .build(key -> loadFromDb(key));

Part 6: Real-World Architectures

Small Application (< 1K RPM)

Just use local cache. Redis is overkill for a single server handling a few hundred requests per minute.

Medium Application (1K-100K RPM)

Add Redis as a shared cache layer. Multiple app instances need consistent cache state.

Large Application (100K+ RPM)

Full multi-tier: CDN → Load Balancer → App Servers (with local cache) → Redis Cluster → Database with read replicas. This is the full hierarchy described in Part 4.

Massive Scale (Facebook/Twitter Level)

At extreme scale, companies use Memcached for simple, high-throughput key-value lookups (profile data, feed items) and Redis for structured data (leaderboards, sessions, rate limiting). Local cache sits in front of both.


Summary and Key Takeaways

Quick Reference

Scenario                                     Best Choice
────────────────────────────────────────────────────────────
Single server, read-heavy config             Local Cache
Multi-server, needs shared state             Redis
Need data structures (sorted sets, lists)    Redis
Need persistence across restarts             Redis
Simple get/set at massive throughput         Memcached
Pub/Sub for cache invalidation               Redis
Serverless / edge computing                  Local Cache
Budget-conscious, single instance            Local Cache
Already using Redis for other features       Redis (consolidate)

The Golden Rule

Start simple, add layers when you have evidence you need them:

  1. Start with local cache — it's free, fast, and requires no infrastructure
  2. Add Redis when you need shared state across multiple servers
  3. Add Memcached only when Redis throughput becomes a bottleneck for simple key-value operations
  4. Combine layers when you need both maximum speed (local) and consistency (distributed)

What's Next?

Now that you understand when to use each caching solution, one final piece of advice:

Caching is one of those topics where the theory is simple but the devil is in the details. The best way to learn is to start with local cache, measure your hit rates, and add complexity only when you need it. Premature optimization applies to caching architecture too.
