Redis Deep Dive: Data Structures, Persistence, Performance, and Operational Patterns
Redis is an in-memory data store, but calling it a “cache” undersells what it can do. It is a data structure server that happens to be extraordinarily fast. Understanding its data structures, persistence model, and operational characteristics determines whether Redis becomes a reliable part of your architecture or a source of mysterious production incidents.
Data Structures Beyond Key-Value
Redis supports far more than simple string key-value pairs. Each data structure has specific use cases where it outperforms alternatives.
Strings are the simplest type. They store text, integers, or binary data up to 512MB. Use SET, GET, INCR, DECR, APPEND. Strings are the default choice for caching serialized objects, counters, and simple flags.
Hashes map field-value pairs under a single key, similar to a row in a relational table. Use HSET, HGET, HGETALL, HINCRBY. Hashes are ideal for session storage where each session has multiple attributes (user_id, role, last_access), and for representing objects without serializing them into a single string.
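Lists are ordered sequences of strings with cheap pushes and pops at both ends (internally a linked list of compact nodes). Use LPUSH, RPUSH, LPOP, RPOP, LRANGE, BRPOP. Lists are the backbone of Redis-based queues. LPUSH to add work, BRPOP to block-wait for items – this gives you a simple job queue. On its own it is not crash-safe: an item popped by a consumer that then dies is gone.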
Sets are unordered collections of unique strings. Use SADD, SMEMBERS, SISMEMBER, SINTER, SUNION. Sets handle tagging systems, unique visitor tracking, and any scenario where you need fast membership checks and set operations.
Sorted Sets (ZSETs) are like sets but each member has a score, and the set is ordered by score. Use ZADD, ZRANGE, ZRANGEBYSCORE, ZRANK, ZINCRBY. Sorted sets power leaderboards, priority queues, time-series data indexed by timestamp, and rate limiting with sliding windows.
Streams are append-only log structures introduced in Redis 5.0. Use XADD, XREAD, XREADGROUP, XACK. Streams provide Kafka-like functionality with consumer groups, acknowledgment, and pending entry lists. They are the right choice when you need durable message processing with consumer groups inside Redis.
HyperLogLog is a probabilistic data structure for cardinality estimation. Use PFADD, PFCOUNT, PFMERGE. It uses only 12KB of memory regardless of the number of unique elements, with a standard error of 0.81%. Use it for counting unique visitors, unique events, or any “how many distinct things” question where approximate counts are acceptable.
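The 12KB and 0.81% figures can be checked with a little arithmetic. The sketch below assumes the dense encoding (16384 registers of 6 bits each) and the usual HyperLogLog error estimate of 1.04 / sqrt(m); the function names are made up for illustration.

```python
import math

REGISTERS = 16384       # 2**14 registers in the dense HyperLogLog encoding
BITS_PER_REGISTER = 6   # each register holds a small counter


def hll_memory_bytes(registers: int = REGISTERS, bits: int = BITS_PER_REGISTER) -> int:
    """Bytes needed for the register array (ignores the small header)."""
    return registers * bits // 8


def hll_standard_error(registers: int = REGISTERS) -> float:
    """Standard error of the cardinality estimate: 1.04 / sqrt(m)."""
    return 1.04 / math.sqrt(registers)


print(hll_memory_bytes())              # 12288 bytes, i.e. 12 KB
print(f"{hll_standard_error():.4%}")   # 0.8125%
```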
Bitmaps treat strings as bit arrays. Use SETBIT, GETBIT, BITCOUNT, BITOP. Bitmaps are efficient for tracking binary states across large populations – daily active users (one bit per user ID), feature flags, or boolean attributes across millions of entities.
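To make the "one bit per user ID" idea concrete, here is a small in-process model of SETBIT, GETBIT, and BITCOUNT. It is not a Redis client, only a sketch of how bit offsets map onto the bytes of a string value; the class name BitmapModel is hypothetical.

```python
class BitmapModel:
    """In-process model of a Redis bitmap; illustrative only."""

    def __init__(self) -> None:
        self.data = bytearray()

    def setbit(self, offset: int, value: int) -> int:
        """Set the bit at offset and return its previous value, like SETBIT."""
        byte_index, bit = divmod(offset, 8)
        if byte_index >= len(self.data):
            # The string grows with zero bytes up to the highest offset written.
            self.data.extend(b"\x00" * (byte_index + 1 - len(self.data)))
        mask = 0x80 >> bit  # offset 0 is the most significant bit of byte 0
        previous = 1 if self.data[byte_index] & mask else 0
        if value:
            self.data[byte_index] |= mask
        else:
            self.data[byte_index] &= ~mask & 0xFF
        return previous

    def getbit(self, offset: int) -> int:
        byte_index, bit = divmod(offset, 8)
        if byte_index >= len(self.data):
            return 0
        return 1 if self.data[byte_index] & (0x80 >> bit) else 0

    def bitcount(self) -> int:
        return sum(bin(byte).count("1") for byte in self.data)


# Daily active users: one bit per numeric user ID.
dau = BitmapModel()
for user_id in (7, 42, 999_999):
    dau.setbit(user_id, 1)
print(dau.bitcount())   # 3
print(len(dau.data))    # 125000 bytes covers user IDs 0..999999
```

The memory cost depends on the highest offset written, not on how many bits are set, so sparse ID spaces with very large IDs are a poor fit.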
Common Patterns
Caching with TTL. The most common Redis pattern. SET key value EX 3600 stores a value with a one-hour expiration. Use GET to read, check for nil (cache miss), and populate from the source of truth on miss. Set TTLs aggressively – cache invalidation bugs are the most common source of stale data issues.
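A minimal cache-aside sketch of that read path. It assumes a redis-py-style client object exposing get(key) and set(key, value, ex=seconds); the helper name get_or_load and the JSON serialization are illustrative choices, not something the pattern requires.

```python
import json
from typing import Any, Callable


def get_or_load(client: Any, key: str, loader: Callable[[], dict],
                ttl_seconds: int = 3600) -> dict:
    """Return the cached value, or load it from the source of truth on a miss."""
    cached = client.get(key)
    if cached is not None:          # cache hit
        return json.loads(cached)
    value = loader()                # cache miss: go to the source of truth
    client.set(key, json.dumps(value), ex=ttl_seconds)  # SET key value EX ttl
    return value


# Usage, assuming redis-py is installed and a server is reachable:
#   import redis
#   client = redis.Redis()
#   profile = get_or_load(client, "user:42", lambda: {"id": 42})
```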
Session Storage. Store sessions as hashes: HSET session:abc123 user_id 42 role admin last_access 1708000000. Hashes let you read or update individual session fields without deserializing the entire session.
Rate Limiting. Use sorted sets with timestamps. Add each request as a member with the current timestamp as the score: ZADD rate:user:42 1708000000 "req-uuid". Remove old entries: ZREMRANGEBYSCORE rate:user:42 0 (now - window). Count remaining: ZCARD rate:user:42. If count exceeds the limit, reject.
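The steps above can be sketched as follows, assuming a redis-py-style client with zremrangebyscore, zadd, zcard, and expire. The function name allow_request is hypothetical. The calls are issued one by one here for readability; a production version would likely wrap them in MULTI/EXEC or a Lua script so concurrent requests cannot interleave.

```python
import time
import uuid
from typing import Any, Optional


def allow_request(client: Any, key: str, limit: int, window_seconds: float,
                  now: Optional[float] = None) -> bool:
    """Sliding-window check: True if this request fits within the limit."""
    now = time.time() if now is None else now
    # Drop entries that have fallen out of the window.
    client.zremrangebyscore(key, 0, now - window_seconds)
    # Record this request; the member must be unique, the score is the timestamp.
    client.zadd(key, {uuid.uuid4().hex: now})
    # Let idle keys expire on their own so they do not accumulate.
    client.expire(key, int(window_seconds) + 1)
    return client.zcard(key) <= limit
```

In this variant rejected requests are still recorded, so a client that keeps hammering the endpoint stays blocked until it backs off. Removing the just-added member on rejection gives the more lenient behaviour.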
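Queues. Use lists with LPUSH (enqueue) and BRPOP (dequeue with blocking). BRPOP blocks the client until an item is available, avoiding polling. A plain BRPOP loses the item if the consumer crashes after popping it. For reliable queues, use LMOVE (or RPOPLPUSH, deprecated since Redis 6.2) to atomically move each item from the work queue to a processing queue, and remove it from the processing queue once the work is done. This provides at-least-once delivery.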
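A sketch of the reliable-queue variant, assuming a redis-py-style client with lpush, blmove, and lrem (BLMOVE needs Redis 6.2 or later). The function names and the two-list layout are illustrative.

```python
from typing import Any, Optional


def enqueue(client: Any, queue: str, payload: str) -> None:
    """Producer side: push work onto the left end of the queue."""
    client.lpush(queue, payload)


def claim(client: Any, queue: str, processing: str,
          timeout: int = 5) -> Optional[str]:
    """Atomically move the oldest item into the processing list.

    Returns None if nothing arrived within the timeout.
    """
    return client.blmove(queue, processing, timeout, src="RIGHT", dest="LEFT")


def acknowledge(client: Any, processing: str, payload: str) -> None:
    """Remove a finished item so it is not picked up again."""
    client.lrem(processing, 1, payload)
```

Items left in the processing list after a consumer crash can be moved back to the work queue by a separate recovery job, which is where the at-least-once (rather than exactly-once) guarantee comes from.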
Leaderboards. Sorted sets with scores: ZINCRBY leaderboard 10 "player:42" to add points, ZREVRANGE leaderboard 0 9 WITHSCORES for the top 10. On Redis 6.2+, ZRANGE leaderboard 0 9 REV WITHSCORES is the non-deprecated equivalent.
Pub/Sub. SUBSCRIBE channel and PUBLISH channel message for real-time messaging. Messages are fire-and-forget – if no subscriber is listening, the message is lost. For durable messaging, use Streams instead.
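Distributed Locks. Use SET lock:resource <random-token> NX EX 30 to acquire a lock with a 30-second expiration. NX ensures only one client can acquire it. Store a unique random token as the value and release the lock only if the stored value still matches, so a client whose lock has already expired cannot delete a lock now held by someone else. For multi-node setups, use the RedLock algorithm (acquire locks on a majority of independent Redis instances). Always set an expiration to avoid deadlocks if the lock holder crashes.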
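A single-instance sketch of acquire and release, assuming a redis-py-style client with set(..., nx=True, ex=...) and eval(script, numkeys, *keys_and_args). The random token and the compare-and-delete Lua script are a common safeguard added here for illustration; the helper names are hypothetical.

```python
import uuid
from typing import Any, Optional

# Delete the key only if it still holds our token (runs atomically on the server).
RELEASE_SCRIPT = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
"""


def acquire_lock(client: Any, resource: str, ttl_seconds: int = 30) -> Optional[str]:
    """Try to take the lock; return the owner token, or None if it is held."""
    token = uuid.uuid4().hex
    if client.set(f"lock:{resource}", token, nx=True, ex=ttl_seconds):
        return token
    return None


def release_lock(client: Any, resource: str, token: str) -> bool:
    """Release the lock only if we still own it."""
    return client.eval(RELEASE_SCRIPT, 1, f"lock:{resource}", token) == 1
```

The work done under the lock still has to finish within the expiration, otherwise a second client can acquire the lock while the first is still running.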
Persistence: RDB vs AOF
Redis is in-memory, but offers two persistence mechanisms to survive restarts.
RDB (Redis Database) snapshots create a point-in-time binary dump of the entire dataset. Redis forks a child process to write the snapshot, so the parent continues serving requests. Configure with save 900 1 (snapshot every 900 seconds if at least 1 key changed). RDB files are compact, fast to load on restart, and ideal for backups. The downside: you lose all writes since the last snapshot on a crash.
AOF (Append-Only File) logs every write command. On restart, Redis replays the log to reconstruct the dataset. Configure appendonly yes and choose a sync policy: appendfsync always (safest, slowest), appendfsync everysec (recommended – at most 1 second of data loss), or appendfsync no (OS decides when to flush). AOF files grow over time, but Redis rewrites them periodically to compact the log.
Recommendation: use both. Enable AOF with everysec for durability (minimal data loss on crash), and keep RDB enabled for backups. When both are enabled, Redis loads the AOF on restart because it is the more complete record. An RDB file on its own loads faster than replaying a large AOF, which matters when restoring from a backup during disaster recovery. Use RDB snapshots for offsite backups to S3 or similar storage.
Memory Management
Redis lives and dies by memory. If it runs out, bad things happen.
Set maxmemory to cap usage. When the limit is reached, Redis applies an eviction policy:
- allkeys-lru: Evict the least recently used key from all keys. Best general-purpose caching policy.
- volatile-lru: Evict LRU keys only among those with an expiration set. Use when you mix cached and persistent data.
- allkeys-lfu: Evict the least frequently used key. Better than LRU when a small set of keys is much hotter than the rest, because a key touched once recently does not displace a frequently used one.
- noeviction: Return errors on write commands when memory is full. Use when data loss is unacceptable – the application must handle the error.
Monitor memory with INFO memory. Key fields: used_memory (total allocated), used_memory_rss (resident set size from the OS), mem_fragmentation_ratio (RSS / used_memory – if much greater than 1.0, fragmentation is high). Set key expirations proactively to keep memory under control.
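A small helper showing how those fields combine. It assumes the INFO memory output has already been parsed into a dict of numbers (as redis-py's info("memory") returns); the function name and the sample values are made up.

```python
def memory_report(info: dict) -> dict:
    """Derive fragmentation and maxmemory usage from parsed INFO memory fields."""
    used = info["used_memory"]
    rss = info["used_memory_rss"]
    maxmemory = info.get("maxmemory", 0)  # 0 means no limit is configured
    return {
        "fragmentation_ratio": rss / used if used else None,
        "maxmemory_used_fraction": used / maxmemory if maxmemory else None,
    }


sample = {
    "used_memory": 1_000_000_000,
    "used_memory_rss": 1_500_000_000,
    "maxmemory": 2_000_000_000,
}
print(memory_report(sample))
# {'fragmentation_ratio': 1.5, 'maxmemory_used_fraction': 0.5}
```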
Performance Characteristics
Redis processes commands in a single thread. This simplifies everything – no locks, no race conditions – but means a single slow command blocks all other clients.
I/O threads (Redis 6+) handle network I/O in parallel while command execution remains single-threaded. Enable with io-threads 4 and io-threads-do-reads yes on multi-core systems to improve throughput for I/O-bound workloads.
Pipelining batches multiple commands into a single network round trip. Instead of sending one command and waiting for the response, send 100 commands at once and read all 100 responses. This reduces round-trip time overhead dramatically – often a 5-10x throughput improvement.
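A pipelining sketch, assuming a redis-py-style client whose pipeline() object queues commands locally and sends them together on execute(). The helper name incr_many is hypothetical.

```python
from typing import Any, Iterable, List


def incr_many(client: Any, keys: Iterable[str]) -> List[int]:
    """Increment many counters in a single network round trip."""
    # transaction=False: plain pipelining, without wrapping in MULTI/EXEC.
    pipe = client.pipeline(transaction=False)
    for key in keys:
        pipe.incr(key)          # queued client-side, nothing is sent yet
    return pipe.execute()       # one round trip; replies come back in order
```

Very large batches hold all replies in memory on both sides, so chunking into batches of a few hundred or a few thousand commands is a reasonable starting point.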
Lua scripting runs scripts atomically on the server. Use EVAL to execute multi-step operations without race conditions. The entire script runs as a single command – no other client can interleave commands. This replaces many patterns that would otherwise require distributed locks.
Never use KEYS in production. KEYS * scans the entire keyspace in O(n) and blocks the server the entire time. On a Redis instance with millions of keys, this can block for seconds. Use SCAN instead – it iterates the keyspace incrementally, returning a small batch per call with a cursor for the next batch.
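The cursor loop looks roughly like this, assuming a redis-py-style scan(cursor, match, count) that returns a (next_cursor, keys) pair. The generator name iter_keys is illustrative; redis-py also ships a scan_iter helper that wraps the same loop.

```python
from typing import Any, Iterator


def iter_keys(client: Any, pattern: str = "*", batch: int = 100) -> Iterator[str]:
    """Walk the keyspace incrementally instead of calling KEYS."""
    cursor = 0
    while True:
        # COUNT is a hint for how much work to do per call, not an exact size.
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=batch)
        yield from keys
        if cursor == 0:     # a returned cursor of 0 means the iteration is complete
            break
```

SCAN may return the same key more than once if the keyspace changes during iteration, so callers that need uniqueness should deduplicate.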
Replication
Redis uses leader-follower (master-replica) replication. The leader handles all writes and asynchronously replicates to followers. Followers can serve read traffic, scaling read capacity horizontally.
Replication is async by default. The leader does not wait for followers to confirm writes. This means a follower may serve stale reads, and if the leader crashes before replication, those writes are lost. Use the WAIT command if you need synchronous confirmation that a write reached N replicas before returning to the client. WAIT narrows the window for data loss but does not turn Redis into a strongly consistent store.
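A sketch of a write followed by WAIT, assuming a redis-py-style client where wait(num_replicas, timeout_ms) returns the number of replicas that acknowledged. The helper name and defaults are hypothetical.

```python
from typing import Any


def set_and_confirm(client: Any, key: str, value: str,
                    replicas: int = 1, timeout_ms: int = 100) -> bool:
    """Write, then block until enough replicas acknowledge or the timeout passes."""
    client.set(key, value)
    acked = client.wait(replicas, timeout_ms)
    # False means the write is on the leader but not confirmed on enough replicas.
    return acked >= replicas
```

A False result does not roll the write back; the caller has to decide whether to retry, alert, or accept the weaker guarantee.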
Configure with replicaof <leader-ip> <leader-port> on each follower. On initial sync, the leader creates an RDB snapshot and sends it to the follower, then streams subsequent commands.
Redis Sentinel
Sentinel provides automatic failover for leader-follower setups. It monitors Redis instances, detects leader failure, promotes a follower to leader, and reconfigures other followers to replicate from the new leader.
Sentinel uses quorum-based leader election. Run at least 3 Sentinel instances to tolerate one failure. Configure with sentinel monitor mymaster <leader-ip> <leader-port> 2 (quorum of 2 out of 3 Sentinels must agree on failure).
Clients must use Sentinel-aware connection libraries. Instead of connecting directly to a Redis host, the client connects to Sentinel, asks which node is the current leader, and reconnects automatically on failover. Most Redis client libraries support this natively.
Redis Cluster
Redis Cluster provides horizontal sharding across multiple nodes. Data is distributed across 16384 hash slots. Each key is hashed to a slot, and each node owns a subset of slots.
Multi-key operations (MGET, MSET, MULTI/EXEC transactions, Lua scripts that touch several keys) only work if all keys hash to the same slot. Use hash tags to force keys to the same slot: {user:42}:profile and {user:42}:sessions both hash based on user:42, landing on the same node.
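The slot calculation, including the hash tag rule, can be reproduced in a few lines. This sketch follows the cluster specification's CRC16 (XMODEM variant) modulo 16384; the function names are made up for illustration.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_hash_slot(key: str) -> int:
    """Return the cluster hash slot (0..16383) for a key."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end != -1 and end > start + 1:
            data = data[start + 1:end]
    return crc16_xmodem(data) % 16384


print(key_hash_slot("foo"))   # 12182
print(key_hash_slot("{user:42}:profile") == key_hash_slot("{user:42}:sessions"))  # True
```

On a live cluster, CLUSTER KEYSLOT <key> reports the same value and is a convenient way to verify key naming schemes.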
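Cluster requires at least 6 nodes for production (3 leaders + 3 replicas for fault tolerance). Nodes communicate via a gossip protocol. Slots can be moved between nodes while the cluster stays online, but resharding is an operator-initiated action (for example with redis-cli --cluster reshard or rebalance), not something the cluster does on its own.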
Monitoring
Monitor Redis with these commands and metrics:
- INFO stats: command throughput, keyspace hits/misses (hit ratio = hits / (hits + misses) – aim for >95%)
- SLOWLOG GET 10: last 10 commands that exceeded the slowlog threshold (default 10ms)
- CLIENT LIST: connected clients, their state, and what commands they are running
- MEMORY DOCTOR: automated memory analysis and recommendations
- LATENCY LATEST: latency events by type (fork, aof, rdb)
Key metrics to alert on: connected_clients (sudden spikes indicate connection leaks), used_memory approaching maxmemory, evicted_keys (eviction means you are at capacity), keyspace_hits and keyspace_misses (declining hit ratio means your cache is not effective), and replication lag (master_repl_offset vs slave_repl_offset).
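Two of these metrics are derived rather than reported directly. The sketch below assumes the INFO output has already been parsed into dicts of numbers; the function names and sample values are illustrative.

```python
from typing import Optional


def hit_ratio(stats: dict) -> Optional[float]:
    """hits / (hits + misses), or None before any lookups have happened."""
    hits = stats.get("keyspace_hits", 0)
    misses = stats.get("keyspace_misses", 0)
    total = hits + misses
    return hits / total if total else None


def replication_lag_bytes(master_repl_offset: int, replica_offset: int) -> int:
    """Bytes of the replication stream the replica has not yet applied."""
    return max(0, master_repl_offset - replica_offset)


print(hit_ratio({"keyspace_hits": 950, "keyspace_misses": 50}))   # 0.95
print(replication_lag_bytes(1_000_500, 1_000_000))                # 500
```

The hit and miss counters are cumulative since the last restart or CONFIG RESETSTAT, so alerting works better on the ratio computed from deltas between two samples than on the lifetime totals.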
Common Gotchas
KEYS blocks the server. This cannot be overstated. One developer running KEYS * in production during debugging can freeze the entire Redis instance. Use SCAN for iterating keys, DBSIZE for total count.
No encryption by default. Redis before version 6 has no TLS support. Data travels in plaintext, and there is no authentication beyond a single password. Use stunnel or a TLS proxy in front of Redis, or upgrade to Redis 6+ and enable TLS natively. At minimum, ensure Redis is only accessible within your VPC or private network.
Fork doubles memory briefly. When Redis forks for RDB snapshots or AOF rewrites, the child process gets a copy-on-write view of memory. If the dataset is being actively written to during the fork, the OS must copy modified pages, potentially doubling memory usage temporarily. Ensure your system has enough free memory to handle this – a 16GB Redis instance should run on a host with at least 24GB available.