
Redis Source Code Explained: A Tour of the GitHub Repository


Introduction

You've used Redis. You know the commands — SET, GET, LPUSH, ZADD. But have you ever wondered what actually happens when Redis executes one of those commands? What does the C code look like?

The Redis source code lives at github.com/redis/redis and it's one of the most readable, well-commented C codebases in open source. Reading it doesn't just teach you Redis — it teaches you systems programming patterns used in databases, game engines, and operating systems.

This post walks you through five key source files and extracts the ideas that explain why Redis is so fast.

What You'll Learn:
✅ How Redis represents every value using one universal struct (robj)
✅ How SET and GET actually work in C
✅ Why Redis counters use almost no memory (integer encoding)
✅ How Redis lists automatically switch between two data structures
✅ The incremental rehashing trick that keeps Redis latency smooth
✅ How a single-threaded event loop handles 100,000+ connections

Prerequisites:

  • Basic Redis knowledge (Learning Redis Fundamentals recommended)
  • Comfort reading code in any language (C syntax will be explained)
  • Curiosity about how databases work internally

The Repository Layout

Clone the repository and you'll see this structure:

git clone https://github.com/redis/redis.git
cd redis/src
ls *.c | head -20

The src/ directory contains ~100 C files. Here are the ones we'll study:

File          What It Does
server.h      Core data structure definitions
t_string.c    All string commands (GET, SET, INCR…)
t_list.c      All list commands (LPUSH, RPOP, BLPOP…)
dict.c        Hash table used everywhere
ae.c          Event loop — the engine of Redis

server.h — The Blueprint

Every Redis file includes server.h. It's the central header that defines the structs everything else depends on.

The robj — One Type to Rule Them All

In Redis, every value you store — whether a string, a list, a sorted set, or a hash — is wrapped in a single struct called robj (Redis Object):

// Simplified from server.h
typedef struct redisObject {
    unsigned type:4;      // OBJ_STRING, OBJ_LIST, OBJ_SET...
    unsigned encoding:4;  // How it's stored internally
    unsigned lru:LRU_BITS; // Last access time for eviction
    int refcount;          // Reference counting for memory
    void *ptr;             // Pointer to actual data
} robj;

The type field holds one of these constants:

#define OBJ_STRING 0
#define OBJ_LIST   1
#define OBJ_SET    2
#define OBJ_ZSET   3  // Sorted Set
#define OBJ_HASH   4
#define OBJ_MODULE 5  // Module-defined types
#define OBJ_STREAM 6

This design is elegant. Commands like TYPE key just read robj->type. EXPIRE works on any type because it's set on the key, not on the value. The checkType() function validates that you're using the right command on the right type.

The Database Struct

By default, Redis holds 16 logical databases (the databases config directive can change this). Each one is a redisDb:

typedef struct redisDb {
    kvstore *keys;           // The actual key-value pairs
    kvstore *expires;        // Per-key TTL tracking
    dict *blocking_keys;     // Keys clients are BLPOPing on
    int id;                  // Database number (SELECT 0, SELECT 1...)
} redisDb;

When you run EXPIRE user:1 3600, Redis adds an entry to expires mapping user:1 → an absolute Unix timestamp. Expired keys are then deleted lazily (the moment a client next accesses them) or actively (a periodic cycle driven by serverCron samples this dict and evicts expired keys, by default ten times per second).

The Server Struct

All global state lives in a single redisServer struct. A few notable fields:

struct redisServer {
    pid_t pid;                  // Process ID
    redisDb *db;                // Array of 16 databases
    dict *commands;             // Command name → handler function
    list *clients;              // All connected clients
    aeEventLoop *el;            // The event loop
    size_t stat_numcommands;    // Total commands processed
};

The commands dict is how Redis routes "SET" to the setCommand() function. Redis parses the command name from the client's input, looks it up in this dict, and calls the handler.


t_string.c — How GET and SET Work

This file implements every string command. Let's read GET first — it's wonderfully simple.

Reading the GET Command

int getGenericCommand(client *c) {
    kvobj *o;
 
    // 1. Look up the key (c->argv[1] is the key name)
    if ((o = lookupKeyReadOrReply(c, c->argv[1],
            shared.null[c->resp])) == NULL)
        return C_OK;  // Key not found — already replied with null
 
    // 2. Make sure it's a string, not a list or hash
    if (checkType(c, o, OBJ_STRING)) return C_ERR;
 
    // 3. Send the value back to the client
    addReplyBulk(c, o);
    return C_OK;
}
 
void getCommand(client *c) {
    getGenericCommand(c);
}

Three steps: look up, type check, reply. The lookupKeyReadOrReply function handles the expires dict check internally — if the key has expired, it deletes it right here and returns NULL.

The SET Command Is More Complex

SET has many options: EX, PX, NX, XX, GET, KEEPTTL. But one implementation detail is particularly clever: relative expiry times are converted to absolute timestamps before replication.

// When you run: SET key value EX 60
// Redis internally rewrites it to: SET key value PXAT <unix_ms + 60000>
// before sending to replicas

Why? Because replica servers might receive the command slightly later. If Redis replicated EX 60 literally, the replica's expiry would be off by the replication lag. Converting to an absolute timestamp (EXAT/PXAT) ensures all replicas expire the key at exactly the same moment.

Integer Encoding — No Memory Allocation Needed

This is one of the most clever micro-optimizations in Redis. When you store a counter:

SET visits 0
INCR visits  # Now it's 1
INCR visits  # Now it's 2

Redis doesn't store "1" or "2" as a string in memory. Instead, it packs the integer directly into the pointer field of the robj:

// Instead of allocating a string "1001" on the heap...
// Redis stores the number directly in the pointer slot:
o->ptr = (void*)((long)value);
o->encoding = OBJ_ENCODING_INT;

A C pointer is 8 bytes on 64-bit systems. A long is also 8 bytes. So Redis reuses the pointer slot to store the integer itself. No heap allocation. No string parsing. Just a bitcast.

This is why INCR on a counter never touches the memory allocator, and why Redis counters use a constant, tiny amount of memory regardless of the value.

The limit: if the integer doesn't fit in a C long (roughly ±9.2 × 10¹⁸), Redis falls back to string encoding.


t_list.c — Lists Are Smarter Than You Think

The list implementation reveals one of Redis's core design philosophies: automatically use the most memory-efficient structure based on the actual data size.

Two Backing Structures

Redis lists aren't always implemented as linked lists. Internally, they choose between two representations:

Listpack is a compact, contiguous memory format. Imagine packing elements tightly in a byte array with length prefixes — no pointers, no overhead, CPU cache-friendly.

Quicklist is a doubly-linked list where each node holds a listpack. It's used for larger lists where you need O(1) push/pop at both ends.

The Push Implementation Shows Both

void listTypePush(robj *subject, robj *value, int where) {
    if (subject->encoding == OBJ_ENCODING_QUICKLIST) {
        int pos = (where == LIST_HEAD) ?
            QUICKLIST_HEAD : QUICKLIST_TAIL;
        quicklistPush(subject->ptr, value->ptr,
                      sdslen(value->ptr), pos);
    } else if (subject->encoding == OBJ_ENCODING_LISTPACK) {
        subject->ptr = (where == LIST_HEAD) ?
            lpPrepend(subject->ptr, ...) :
            lpAppend(subject->ptr, ...);
    }
}

The same LPUSH command dispatches differently based on current encoding. The caller never sees this.

Automatic Promotion with Hysteresis

When a listpack list grows beyond list-max-listpack-size (default: 128 elements), Redis calls listTypeConvert to promote it to a quicklist. This is automatic and invisible to the user.

But what about shrinking? Redis doesn't demote back at 128 — it uses a lower threshold. This hysteresis prevents the structure from thrashing between formats when elements hover near the boundary:

Promote to quicklist: when elements > 128
Demote back to listpack: when elements < 64  (approx)

This pattern — promote at N, demote at N/2 — appears in many performance-sensitive systems (CPU branch predictors, buffer resizing algorithms) for the same reason: avoid the cost of constant switching at the boundary.


dict.c — The Hash Table That Powers Everything

Redis's dict is the most important data structure in the codebase. It implements the key-value store itself, the command registry, Redis Sets, Hash values, and more.

Structure: Chained Hash Table

struct dictEntry {
    struct dictEntry *next;   // Next entry in collision chain
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;             // No allocation needed for numbers
    } v;
};

Collisions are resolved by chaining — each bucket is a linked list of entries. The union in the value field is the same space-saving trick we saw in robj: if your value is a number, no heap allocation is needed.

Bucket indices use bitwise AND instead of modulo:

// Instead of: index = hash % table_size  (slow division)
// Redis does:  index = hash & (table_size - 1)  (fast AND)

This only works when table_size is a power of two — and Redis always allocates tables in powers of two for exactly this reason.

Incremental Rehashing — The Crown Jewel

When a hash table gets too full (load factor > 1), it needs to resize. A naive approach: allocate a new bigger table, copy all entries, delete the old table. For a database with millions of keys, this would cause a visible latency spike — potentially hundreds of milliseconds of blocking.

Redis's solution is incremental rehashing: maintain two tables simultaneously and migrate entries gradually over many operations.

// dict holds two tables during rehashing
struct dict {
    dictEntry **ht_table[2];  // [0] = old, [1] = new
    long ht_used[2];          // How many entries in each
    long rehashidx;           // Current migration progress (bucket index)
};

The migration function advances the process a few buckets at a time:

int dictRehash(dict *d, int n) {
    while (n-- && d->ht_used[0] != 0) {
        // Move entries from old table[0] bucket to new table[1]
        rehashEntriesInBucketAtIndex(d, d->rehashidx);
        d->rehashidx++;  // Advance the cursor
    }
    return !dictCheckRehashingCompleted(d);
}

During rehashing:

  • Lookups check table[0] first, then table[1]
  • Writes go directly to table[1]
  • A background timer (serverCron, runs every 100ms) calls dictRehash to drain the migration during idle time

The result: no single operation takes the full migration cost. The work is spread across thousands of normal requests, each paying only a tiny incremental cost.


ae.c — The Event Loop That Makes It All Work

Redis is famously single-threaded for command execution. One thread. No locks. 100,000+ connections. How?

The answer is ae.c — a roughly 500-line event loop implementing the reactor pattern.

Two Types of Events

typedef struct aeFileEvent {
    int mask;              // AE_READABLE | AE_WRITABLE
    aeFileProc *rfileProc; // Called when socket is readable
    aeFileProc *wfileProc; // Called when socket is writable
    void *clientData;
} aeFileEvent;
 
typedef struct aeTimeEvent {
    long long id;
    monotime when;         // When to fire (microseconds)
    aeTimeProc *timeProc;  // Callback function
    struct aeTimeEvent *next;
} aeTimeEvent;

File events fire when a network socket has data ready to read (a client sent a command) or is ready to write (response buffer can be flushed). Time events fire at a scheduled future time — used for serverCron, expiry checks, and stats collection.

The Main Loop

void aeMain(aeEventLoop *eventLoop) {
    eventLoop->stop = 0;
    while (!eventLoop->stop) {
        aeProcessEvents(eventLoop,
            AE_ALL_EVENTS |
            AE_CALL_BEFORE_SLEEP |
            AE_CALL_AFTER_SLEEP);
    }
}

aeProcessEvents is where the magic happens:

  1. Calculate how long until the next time event fires
  2. Call the OS multiplexer (epoll_wait on Linux, kevent on macOS) with that timeout — blocking until a socket is ready OR the timeout expires
  3. For each ready socket, call its handler (read command / write response)
  4. Check time events and fire any that have elapsed

Platform Abstraction

Redis supports four multiplexing backends behind a common interface:

// ae.c selects the best available backend at compile time:
#ifdef HAVE_EVPORT
#include "ae_evport.c"   // Solaris
#elif defined(HAVE_EPOLL)
#include "ae_epoll.c"    // Linux (default)
#elif defined(HAVE_KQUEUE)
#include "ae_kqueue.c"   // macOS/BSD
#else
#include "ae_select.c"   // Fallback (any POSIX system)
#endif

The rest of ae.c never calls epoll directly — it calls aeApiPoll, which each backend implements. This is the strategy pattern in C.

Why Single-Threaded Works So Well

The event loop model avoids every multithreading headache:

  • No mutexes — only one thread accesses data structures
  • No deadlocks — no locks to acquire
  • No context switch overhead — the OS isn't juggling threads

The bottleneck for most Redis workloads isn't CPU — it's network I/O. The OS kernel handles watching thousands of file descriptors simultaneously. Redis just reacts to what's ready. One handler at a time, zero blocking, zero locking.

Note: Redis 6.0+ added I/O threads for reading/writing network data in parallel, but the command execution itself remains single-threaded. This preserves the simplicity while improving raw throughput on very high-bandwidth workloads.


Putting It All Together

Let's trace what happens when you run SET counter 0 followed by INCR counter:

  1. ae.c: the event loop wakes because the client's socket became readable
  2. server.c: the command name is parsed and looked up in the commands dict
  3. dict.c: setCommand / incrCommand finds (or creates) the key in O(1)
  4. t_string.c: the value is stored with integer encoding, so the increment touches no allocator
  5. The reply is queued and flushed once the socket is writable again

Every layer contributes to the speed:

  • ae.c woke up only because the socket was ready (no polling)
  • dict.c found the key in O(1) with bitwise AND
  • t_string.c incremented without allocating memory
  • The reply is a tiny 4-byte string

Key Takeaways from the Source Code

Reading Redis source code reveals a consistent set of design principles:

1. Store numbers as numbers, not strings
Integer encoding packs values into pointer slots, eliminating heap allocations for the most common counter use case.

2. Use the right structure for the size
Listpack for small lists, quicklist for large ones. The encoding is hidden from users but saves significant memory for typical workloads where most lists are small.

3. Spread expensive work over time
Incremental rehashing ensures no single operation pays the full cost of resizing a hash table with millions of entries.

4. One thread, OS-level multiplexing
The event loop scales to thousands of connections without threads or locks, because network I/O wait is handled by the kernel, not by Redis threads.

5. Everything is an robj
A single polymorphic wrapper makes it trivial to add new data types, eviction policies, and commands without rewriting infrastructure.


How to Explore the Code Yourself

# Clone the repository
git clone https://github.com/redis/redis.git
cd redis
 
# Start with the entry point
vim src/server.c   # Main function, server startup, command table
 
# Find any command handler
grep -n "void setCommand" src/t_string.c
grep -n "void lrangeCommand" src/t_list.c
grep -n "void zaddCommand" src/t_zset.c
 
# Search for where a concept is implemented
grep -rn "OBJ_ENCODING_INT" src/
grep -rn "dictRehash" src/

Suggested reading order:

  1. server.h — Understand the core types (robj, client, redisServer)
  2. t_string.c — Simplest command implementation
  3. dict.c — The hash table used everywhere
  4. ae.c — The event loop
  5. t_list.c, t_zset.c — More complex data types

The Redis codebase uses a consistent style, clear function names, and excellent comments. Don't be intimidated by C — most of the interesting logic reads like pseudocode once you're past the type declarations.


Summary

Redis is fast not by magic, but by making deliberate, layered optimizations at every level:

✅ robj provides a uniform wrapper that makes all data types composable
✅ Integer encoding eliminates heap allocations for counter values
✅ Lists automatically choose compact or linked storage based on size
✅ Incremental rehashing spreads resize costs across thousands of requests
✅ The event loop handles massive concurrency with zero threads and zero locks

The next time you run INCR counter or LPUSH queue item, you'll know exactly what's happening in the C code underneath.


