Redis Notes for Interview
Redis Fundamentals
Key Concepts
- In-memory database
- Data structures: string, hash, set, sorted set, list, geo, hyperloglog
- Cluster modes: master-slave, sentinel, cluster sharding
- Performance optimization through caching
Why Use Redis?
Performance: When encountering SQL queries that take a long time to execute and have results that don’t change frequently, it’s ideal to store the results in cache. This allows subsequent requests to read from cache, enabling rapid response times.
Concurrency: Under high concurrency, if all requests directly access the database, connection exceptions will occur. Redis serves as a buffer, allowing requests to access Redis first instead of directly hitting the database. (MySQL supports ~1,500 QPS, while Redis supports 20,000-50,000 QPS)
Redis Use Cases
- Caching
- Flash sales/spike traffic handling
- Distributed locking
Redis Single-Threading Model
Key Concepts
- Thread tasks: command processing, I/O handling, persistence, data synchronization
- Version 6.0+: configurable multi-threading support
- epoll + reactor pattern
High Performance Reasons
Memory operations: All data operations occur in memory
I/O Model on Linux: Uses epoll combined with reactor pattern
epoll fundamentals: Manages multiple socket file descriptors
- Red-black tree structure maintains all monitored file descriptors
- Doubly linked list maintains the ready list
- Interrupt mechanism adds ready file descriptors
- Key APIs:
epoll_create
,epoll_ctl
,epoll_wait
- Advantages over poll/select: More efficient for large numbers of connections
Reactor pattern:
- Reactor (Dispatcher): Calls epoll to get available file descriptors and dispatches events
- Acceptor: Handles connection creation events
- Handler: Processes I/O read/write events
Redis Data Types and Underlying Structures
String
- Implementation: SDS (Simple Dynamic String)
- Use cases: User info caching, counters, distributed locks (SETNX)
Hash
- Implementation: ziplist (small data) + hashtable (large data)
- Use cases: Storing objects
List
- Implementation: Doubly linked list or ziplist
- Use cases: Message queues, latest articles list
Set
- Implementation: intset (integer set) or hashtable
- Use cases: Tag systems, mutual friends (SINTER for intersection)
Sorted Set (ZSet)
- Implementation: ziplist + skiplist (skip list)
- Use cases: Leaderboards (ZADD/ZRANGE), delayed queues
Redis Data Persistence
Key Concepts
- AOF (Append Only File)
- RDB (Redis Database)
- Hybrid mode
RDB (Redis Database Backup)
- Mechanism: Periodically generates binary snapshot files of memory data
- Process: Forks child process to avoid blocking main thread
- Frequency: Executes
BGSAVE
every 5+ minutes - Drawback: Potential data loss between snapshots
AOF (Append Only File)
- Mechanism: Records all write operation commands
- Sync strategies:
- Every second (
appendfsync everysec
) - Default, good balance - Every modification (
appendfsync always
) - Safest but slowest - No sync (
appendfsync no
) - Fastest but risky
- Every second (
- Rewrite mechanism: Compacts AOF file by removing redundant commands
Comparison
- RDB: Fast recovery, smaller files, but potential data loss during failures
- AOF: Better data integrity, but larger files and slower recovery
Redis Cluster Deployment Modes
Master-Slave Replication
- Read operations: Both master and slave nodes can handle reads
- Write operations: Only master handles writes, then syncs to slaves
- Benefits: Read scaling, basic fault tolerance
Sentinel Mode
Purpose: Automatic failover when master fails
Key functions:
- Monitoring: Continuously checks master/slave health
- Election: Selects new master when current master fails
- Notification: Informs clients of topology changes
Failure detection:
- Subjective down: Single sentinel marks master as down
- Objective down: Majority of sentinels agree master is down
Master selection criteria:
- Slave priority configuration
- Replication progress (most up-to-date)
- Smallest slave ID
Cluster Sharding
- Purpose: Handles large datasets (>25GB) and utilizes multi-core CPUs
- Hash slots: Uses 16,384 fixed hash slots for data distribution
- Benefits:
- Horizontal scaling
- Automatic failover
- No single point of failure
Caching Patterns and Consistency
Key Concepts
- Cache-Aside pattern
- Read Through, Write Through, Write Back
- Cache invalidation strategies
Cache-Aside Pattern
- Read: Check cache first; if miss, query database and populate cache
- Write: Update database first, then delete cache
Ensuring Cache-Database Consistency
Delayed Double Delete:
- Update database
- Delete cache immediately
- Wait brief period (100-500ms)
- Delete cache again
- Trade-off: Lower cache hit rate for better consistency
Fallback strategies:
- Set cache expiration times
- Use message queues for asynchronous synchronization
Cache Problems: Penetration, Breakdown, Avalanche
Cache Avalanche
- Problem: Many cache keys expire simultaneously or Redis instance crashes
- Solutions:
- Random expiration times
- Circuit breaker and rate limiting
- High-availability cluster deployment
- Service degradation
Cache Penetration
- Problem: Queries for non-existent data bypass cache and hit database
- Solutions:
- Cache null/default values with short TTL
- Bloom Filter:
- Data structure: Bit array + multiple hash functions
- Write: Hash element multiple times, set corresponding bits to 1
- Query: If all hash positions are 1, element might exist (false positives possible)
- Input validation at application layer
Cache Breakdown (Hotspot Invalid)
- Problem: Single popular cache key expires, causing traffic spike to database
- Solutions:
- Never expire hot data
- Use distributed locks to prevent multiple database hits
- Pre-warming cache before expiration
Distributed Locking with Redis
Key Concepts
- SETNX: Only one client can successfully set the same key
- Expiration time: Prevents deadlocks
- Lock renewal: Extends lock duration for long-running operations
Lock Retry on Failure
- Wait time determination: Based on 99th percentile business execution time
- Implementation approaches:
- Sleep-based retry
- Event-driven (listen for DEL events)
- Lua script for atomic timeout retry
Expiration Time Management
- Why needed: Prevents deadlocks when lock holder crashes
- Setting strategy:
- Analyze 99% of business operations completion time
- Set 2x safety margin (e.g., if 99% complete in 1s, set 2s expiration)
- For critical operations, consider 10s or 1 minute
Lock Renewal
- When needed: Business operations exceed expiration time
- Implementation:
- Reset expiration before current expiration
- Daemon thread periodically checks and renews
- Redisson watchdog mechanism: Automatic renewal
Lock Release
- Verification: Always verify lock ownership before release
- Prevention: Avoid releasing locks held by other threads
- Implementation: Use Lua script for atomic check-and-delete
Advanced Patterns
- Redlock Algorithm: Distributed consensus for multiple Redis instances
- SingleFlight Pattern: Prevents cache stampede by allowing only one request for the same resource
Best Practices Summary
- Choose appropriate data structures based on use case
- Implement proper persistence strategy (RDB + AOF hybrid recommended)
- Design for high availability with clustering and replication
- Handle cache problems proactively with proper expiration and fallback strategies
- Use distributed locks carefully with proper timeout and renewal mechanisms
- Monitor and optimize performance regularly