Redis Deployment Modes: Theory, Practice, and Interview Insights
Introduction
Redis supports multiple deployment modes, each designed for different use cases, scalability requirements, and availability needs. Understanding these modes is crucial for designing robust, scalable systems.
🎯 Common Interview Question: "How do you decide which Redis deployment mode to use for a given application?"
Answer Framework: Consider these factors:
- Data size: Single instance practical limits (~25GB operational recommendation)
- Availability requirements: RTO/RPO expectations
- Read/write patterns: Read-heavy vs write-heavy workloads
- Geographic distribution: Single vs multi-region
- Operational complexity: Team expertise and maintenance overhead
Standalone Redis
Overview
Standalone Redis is the simplest deployment mode, where a single Redis instance handles all operations. It's ideal for development, testing, and small-scale applications.
Architecture
```mermaid
graph TB
    A[Client Applications] --> B[Redis Instance]
    B --> C[Disk Storage]

    style B fill:#ff9999
    style A fill:#99ccff
    style C fill:#99ff99
```
Configuration Example
A minimal standalone `redis.conf` sketch; the values below are illustrative starting points, not tuned production settings:
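```conf
# redis.conf for standalone (illustrative values)
bind 127.0.0.1                   # bind to a specific interface, never 0.0.0.0
port 6379
requirepass <strong-password>    # enable AUTH; placeholder value
maxmemory 6gb                    # ~75% of an 8GB host, per the guideline below
maxmemory-policy allkeys-lru     # pick an eviction policy that fits the use case
appendonly yes                   # AOF for durability
appendfsync everysec
save 900 1                       # RDB snapshots for fast restarts/backups
save 300 10
dir /var/lib/redis
```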
Best Practices
Memory Management
- Set `maxmemory` to ~75% of available RAM
- Choose an appropriate eviction policy based on the use case
- Monitor the memory fragmentation ratio
Persistence Configuration
- Use AOF for critical data (better durability)
- RDB for faster restarts and backups
- Consider hybrid persistence for optimal balance
Security
- Enable AUTH with strong passwords
- Use TLS for client connections
- Bind to specific interfaces, avoid 0.0.0.0
Limitations and Use Cases
Limitations:
- Single point of failure
- Limited by single machine resources
- No automatic failover
Optimal Use Cases:
- Development and testing environments
- Applications with < 25GB data (to avoid RDB performance impact)
- Non-critical applications where downtime is acceptable
- Cache-only scenarios with acceptable data loss
🎯 Interview Insight: "When would you NOT use standalone Redis?"
Answer: When you need high availability (>99.9% uptime), data sizes exceed 25GB (RDB operations impact performance), or when application criticality requires zero data loss guarantees.
RDB Operation Impact Analysis
Critical Production Insight: The 25GB threshold is where RDB operations start significantly impacting online business:
```mermaid
graph LR
    A[BGSAVE Command] --> B["fork() syscall"]
    B --> C[Copy-on-Write Memory]
    C --> D[Memory Usage Spike]
    D --> E[Potential OOM]

    F[Write Operations] --> G[COW Page Copies]
    G --> H[Increased Latency]
    H --> I[Client Timeouts]

    style D fill:#ff9999
    style E fill:#ff6666
    style H fill:#ffcc99
    style I fill:#ff9999
```
Real-world Impact at 25GB+:
- Memory spike: Up to 2x memory usage during fork
- Latency impact: P99 latencies can spike from 1ms to 100ms+
- CPU impact: Fork operation can freeze Redis for 100ms-1s
- I/O saturation: Large RDB writes competing with normal operations
Mitigation Strategies:
- Disable automatic RDB: Use `save ""` and run manual BGSAVE only during low-traffic windows
- AOF-only persistence: More predictable performance impact
- Slave-based backups: Perform RDB operations on slave instances
- Memory optimization: Use compression, optimize data structures
Redis Replication (Master-Slave)
Overview
Redis replication creates exact copies of the master instance on one or more slave instances. It provides read scalability and basic redundancy.
Architecture
```mermaid
graph TB
    A[Client - Writes] --> B[Redis Master]
    C[Client - Reads] --> D[Redis Slave 1]
    E[Client - Reads] --> F[Redis Slave 2]

    B --> D
    B --> F

    B --> G[Disk Storage Master]
    D --> H[Disk Storage Slave 1]
    F --> I[Disk Storage Slave 2]

    style B fill:#ff9999
    style D fill:#ffcc99
    style F fill:#ffcc99
```
Configuration
Master Configuration:
A minimal sketch with illustrative values:
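```conf
# master.conf (illustrative values)
port 6379
requirepass <master-password>
masterauth <master-password>   # needed if this node is ever demoted to slave
repl-diskless-sync yes         # stream the RDB over the socket on fast networks
repl-diskless-sync-delay 5
repl-backlog-size 64mb         # size by write rate x expected disconnect time
min-slaves-to-write 1          # optional: refuse writes with no healthy slave
min-slaves-max-lag 10
```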
Slave Configuration:
A minimal sketch with illustrative values (the master address is a placeholder):
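```conf
# slave.conf (illustrative values)
port 6380
slaveof 192.168.1.100 6379   # 'replicaof' in Redis 5+; address is a placeholder
masterauth <master-password>
slave-read-only yes          # prevent accidental writes
slave-priority 100           # lower value = preferred during failover (0 = never)
slave-serve-stale-data yes   # keep serving reads if the replication link drops
```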
Replication Process Flow
```mermaid
sequenceDiagram
    participant M as Master
    participant S as Slave
    participant C as Client

    Note over S: Initial Connection
    S->>M: PSYNC replicationid offset
    M->>S: +FULLRESYNC runid offset
    M->>S: RDB snapshot
    Note over S: Load RDB data
    M->>S: Replication backlog commands

    Note over M,S: Ongoing Replication
    C->>M: SET key value
    M->>S: SET key value
    C->>S: GET key
    S->>C: value
```
Best Practices
Network Optimization
- Use `repl-diskless-sync yes` for fast networks
- Configure `repl-backlog-size` based on network latency
- Monitor replication lag with `INFO replication`
Slave Configuration
- Set `slave-read-only yes` to prevent accidental writes
- Use `slave-priority` for failover preferences
- Configure appropriate `slave-serve-stale-data` behavior
Monitoring Key Metrics
- Replication offset difference
- Last successful sync time
- Number of connected slaves
Production Showcase
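A minimal redis-py sketch of the checks listed above, assuming a master on localhost:6379 (host, port, and auth are illustrative):

```python
import redis

# Assumed endpoint; adjust host/port/auth for your environment
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

info = r.info("replication")
print(f"role={info['role']} connected_slaves={info.get('connected_slaves', 0)}")

# Lag per slave: master offset minus the slave's acknowledged offset
master_offset = info.get("master_repl_offset", 0)
for i in range(info.get("connected_slaves", 0)):
    s = info[f"slave{i}"]  # parsed dict: ip, port, state, offset, lag
    print(f"slave{i} {s['ip']}:{s['port']} state={s['state']} "
          f"offset_lag={master_offset - int(s['offset'])} bytes")
```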
🎯 Interview Question: "How do you handle slave promotion in a master-slave setup?"
Answer: Manual promotion involves:
- Stop writes to the current master
- Ensure the slave has caught up (compare replication offsets via `INFO replication`)
- Execute `SLAVEOF NO ONE` on the chosen slave
- Update application configuration to point to the new master
- Configure the other slaves to replicate from the new master (commands sketched below)
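A redis-cli sketch of those steps; the hostnames are placeholders:

```bash
# 1. Confirm the chosen slave has caught up: offsets should match
redis-cli -h old-master INFO replication | grep master_repl_offset
redis-cli -h chosen-slave INFO replication | grep slave_repl_offset

# 2. Promote the chosen slave
redis-cli -h chosen-slave SLAVEOF NO ONE

# 3. Repoint each remaining slave at the new master
redis-cli -h other-slave SLAVEOF chosen-slave 6379
```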
Limitation: No automatic failover - requires manual intervention or external tooling.
Redis Sentinel
Overview
Redis Sentinel provides high availability for Redis through automatic failover, monitoring, and configuration management. It's the recommended solution for automatic failover in non-clustered environments.
Architecture
```mermaid
graph TB
    subgraph "Redis Instances"
        M[Redis Master]
        S1[Redis Slave 1]
        S2[Redis Slave 2]
    end

    subgraph "Sentinel Cluster"
        SE1[Sentinel 1]
        SE2[Sentinel 2]
        SE3[Sentinel 3]
    end

    subgraph "Applications"
        A1[App Instance 1]
        A2[App Instance 2]
    end

    M --> S1
    M --> S2

    SE1 -.-> M
    SE1 -.-> S1
    SE1 -.-> S2
    SE2 -.-> M
    SE2 -.-> S1
    SE2 -.-> S2
    SE3 -.-> M
    SE3 -.-> S1
    SE3 -.-> S2

    A1 --> SE1
    A2 --> SE2

    style M fill:#ff9999
    style S1 fill:#ffcc99
    style S2 fill:#ffcc99
    style SE1 fill:#99ccff
    style SE2 fill:#99ccff
    style SE3 fill:#99ccff
```
Sentinel Configuration
A minimal `sentinel.conf` sketch; the master address and the `mymaster` name are placeholders, and the timings follow the guidance below:
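```conf
# sentinel.conf (illustrative values)
port 26379
sentinel monitor mymaster 192.168.1.100 6379 2   # quorum of 2, i.e. majority of 3 sentinels
sentinel auth-pass mymaster <master-password>
sentinel down-after-milliseconds mymaster 5000   # 5s of silence before flagging the master
sentinel failover-timeout mymaster 15000         # ~3x down-after-milliseconds
sentinel parallel-syncs mymaster 1               # resync one slave at a time after failover
```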
Failover Process
```mermaid
sequenceDiagram
    participant S1 as Sentinel 1
    participant S2 as Sentinel 2
    participant S3 as Sentinel 3
    participant M as Master
    participant SL as Slave
    participant A as Application

    Note over S1,S3: Normal Monitoring
    S1->>M: PING
    M--xS1: No Response
    S1->>S2: Master seems down
    S1->>S3: Master seems down

    Note over S1,S3: Quorum Check
    S2->>M: PING
    M--xS2: No Response
    S3->>M: PING
    M--xS3: No Response

    Note over S1,S3: Failover Decision
    S1->>S2: Start failover?
    S2->>S1: Agreed
    S1->>SL: SLAVEOF NO ONE
    S1->>A: New master notification
```
Best Practices
Quorum Configuration
- Use odd number of sentinels (3, 5, 7)
- Set quorum to majority (e.g., 2 for 3 sentinels)
- Deploy sentinels across different failure domains
Timing Parameters
- `down-after-milliseconds`: 5-30 seconds, depending on network conditions
- `failover-timeout`: 2-3x down-after-milliseconds
- `parallel-syncs`: Usually 1, to avoid overwhelming the new master
Client Integration
A redis-py sketch; the sentinel addresses and the `mymaster` service name are assumptions:
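```python
import redis.sentinel

# Assumed sentinel endpoints; list every sentinel you run
sentinel = redis.sentinel.Sentinel(
    [("192.168.1.10", 26379), ("192.168.1.11", 26379), ("192.168.1.12", 26379)],
    socket_timeout=0.5,
)

# Writes always go to whichever node Sentinel currently reports as master
master = sentinel.master_for("mymaster", socket_timeout=0.5)
master.set("key", "value")

# Reads can be offloaded to a slave
slave = sentinel.slave_for("mymaster", socket_timeout=0.5)
print(slave.get("key"))
```

Because the client re-resolves the master through Sentinel, a failover only requires retrying the operation, not reconfiguring the application.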
Production Monitoring Script
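A minimal redis-cli health-check sketch, assuming a sentinel on localhost:26379 monitoring `mymaster`:

```bash
#!/bin/bash
# Sentinel health check (port and master name are assumptions)
S="redis-cli -p 26379"

# Where does this sentinel think the master is?
$S SENTINEL get-master-addr-by-name mymaster

# Can this sentinel reach a quorum for failover?
$S SENTINEL ckquorum mymaster

# Flags, slave count, and peer sentinel count for the monitored master
$S SENTINEL master mymaster | grep -A1 -E "^(flags|num-slaves|num-other-sentinels)$"
```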
🎯 Interview Question: "How does Redis Sentinel handle split-brain scenarios?"
Answer: Sentinel prevents split-brain through:
- Quorum requirement: Only majority can initiate failover
- Epoch mechanism: Each failover gets unique epoch number
- Leader election: Only one sentinel leads failover process
- Configuration propagation: All sentinels must agree on new configuration
Key Point: Even if network partitions occur, only the partition with quorum majority can perform failover, preventing multiple masters.
Redis Cluster
Overview
Redis Cluster provides horizontal scaling and high availability through data sharding across multiple nodes. Itβs designed for applications requiring both high performance and large data sets.
Architecture
```mermaid
graph TB
    subgraph "Redis Cluster"
        subgraph "Shard 1"
            M1[Master 1<br/>Slots 0-5460]
            S1[Slave 1]
        end
        subgraph "Shard 2"
            M2[Master 2<br/>Slots 5461-10922]
            S2[Slave 2]
        end
        subgraph "Shard 3"
            M3[Master 3<br/>Slots 10923-16383]
            S3[Slave 3]
        end
    end

    M1 --> S1
    M2 --> S2
    M3 --> S3

    M1 -.-> M2
    M1 -.-> M3
    M2 -.-> M3

    A[Application] --> M1
    A --> M2
    A --> M3

    style M1 fill:#ff9999
    style M2 fill:#ff9999
    style M3 fill:#ff9999
    style S1 fill:#ffcc99
    style S2 fill:#ffcc99
    style S3 fill:#ffcc99
```
Hash Slot Distribution
Redis Cluster does not use classic consistent hashing; it shards keys across 16,384 fixed hash slots, where each key maps to slot CRC16(key) mod 16384:
```mermaid
graph LR
    A[Key] --> B[CRC16]
    B --> C["% 16384"]
    C --> D[Hash Slot]
    D --> E[Node Assignment]

    F["Example: user:1000"] --> G["CRC16 = 31949"]
    G --> H["31949 % 16384 = 15565"]
    H --> I["Slot 15565 → Node 3"]
```
Cluster Configuration
Node Configuration:
A minimal per-node sketch (one file per node; ports 7000-7005 are assumed across six nodes):
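```conf
# cluster-node.conf (illustrative; one file per node)
port 7000
cluster-enabled yes
cluster-config-file nodes-7000.conf   # maintained by Redis itself; do not edit by hand
cluster-node-timeout 5000             # ms of unreachability before a node is suspected
cluster-require-full-coverage no      # keep serving reachable slots during partial outages
appendonly yes
```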
Cluster Setup Script:
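With modern redis-cli, creating the three-master, three-slave topology shown above is a single command (addresses are placeholders):

```bash
# Create a 3-master / 3-slave cluster; redis-cli spreads the 16384 slots evenly
redis-cli --cluster create \
  192.168.1.101:7000 192.168.1.102:7001 192.168.1.103:7002 \
  192.168.1.101:7003 192.168.1.102:7004 192.168.1.103:7005 \
  --cluster-replicas 1

# Verify slot coverage and node roles
redis-cli --cluster check 192.168.1.101:7000
```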
Data Distribution and Client Routing
```mermaid
sequenceDiagram
    participant C as Client
    participant N1 as Node 1
    participant N2 as Node 2
    participant N3 as Node 3

    C->>N1: GET user:1000
    Note over N1: Check slot ownership
    alt Key belongs to N1
        N1->>C: value
    else Key belongs to N2
        N1->>C: MOVED 15565 192.168.1.102:7001
        C->>N2: GET user:1000
        N2->>C: value
    end
```
Advanced Operations
Resharding Example:
A sketch using `redis-cli --cluster reshard`; the node IDs are placeholders read from `cluster nodes`:
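```bash
# Move 1000 slots from node 1 to node 4
# (node IDs are placeholders taken from `redis-cli -p 7000 cluster nodes`)
redis-cli --cluster reshard 192.168.1.101:7000 \
  --cluster-from <node-1-id> \
  --cluster-to <node-4-id> \
  --cluster-slots 1000 \
  --cluster-yes
```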
Adding New Nodes:
A two-step sketch: join the new node, then either reshard slots onto it or attach it as a replica (addresses and IDs are placeholders):
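```bash
# Add a new empty master; it holds no slots until you reshard onto it
redis-cli --cluster add-node 192.168.1.104:7006 192.168.1.101:7000

# Attach a replica to the new master (master ID is a placeholder)
redis-cli --cluster add-node 192.168.1.104:7007 192.168.1.101:7000 \
  --cluster-slave --cluster-master-id <new-master-id>
```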
Client Implementation Best Practices
A cluster-aware redis-py sketch (redis-py 4.1+; the seed node address is an assumption):
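```python
import redis.cluster

# One seed node suffices; the client discovers the rest of the topology
rc = redis.cluster.RedisCluster(
    host="192.168.1.101", port=7000, decode_responses=True
)

rc.set("user:1000", "alice")   # routed by CRC16("user:1000") % 16384
print(rc.get("user:1000"))

# Hash tags pin related keys to one slot, keeping multi-key ops legal
rc.mset({"{user:1000}.profile": "...", "{user:1000}.session": "..."})
```

The client follows MOVED redirects and caches the slot map, so applications mostly do not see the routing shown in the diagram above.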
Limitations and Considerations
- Multi-key Operations: Limited to same hash slot
- Lua Scripts: All keys must be in same slot
- Database Selection: Only database 0 supported
- Client Complexity: Requires cluster-aware clients
🎯 Interview Question: "How do you handle hotspot keys in Redis Cluster?"
Answer Strategies:
- Hash tags: Distribute related hot keys across slots
- Client-side caching: Cache frequently accessed data
- Read replicas: Use slave nodes for read operations
- Application-level sharding: Pre-shard at application layer
- Monitoring: Use `redis-cli --hotkeys` to identify patterns (see the sketch below)
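A quick sketch of the hotkey check; note that `--hotkeys` only works with an LFU eviction policy:

```bash
# --hotkeys relies on LFU counters, so an LFU maxmemory-policy must be active
redis-cli CONFIG SET maxmemory-policy allkeys-lfu
redis-cli --hotkeys
```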
Deployment Architecture Comparison
Feature Matrix
| Feature | Standalone | Replication | Sentinel | Cluster |
|---|---|---|---|---|
| High Availability | ❌ | ❌ | ✅ | ✅ |
| Automatic Failover | ❌ | ❌ | ✅ | ✅ |
| Horizontal Scaling | ❌ | ❌ | ❌ | ✅ |
| Read Scaling | ❌ | ✅ | ✅ | ✅ |
| Operational Complexity | Low | Low | Medium | High |
| Multi-key Operations | ✅ | ✅ | ✅ | Limited |
| Max Data Size | Single Node | Single Node | Single Node | Multi-Node |
Decision Flow Chart
```mermaid
flowchart TD
    A["Start: Redis Deployment Decision"] --> B{"Data Size > 25GB?"}
    B -->|Yes| C{"Can tolerate RDB impact?"}
    C -->|No| D[Consider Redis Cluster]
    C -->|Yes| E{"High Availability Required?"}
    B -->|No| E
    E -->|No| F{"Read Scaling Needed?"}
    F -->|Yes| G[Master-Slave Replication]
    F -->|No| H[Standalone Redis]
    E -->|Yes| I{"Automatic Failover Needed?"}
    I -->|Yes| J[Redis Sentinel]
    I -->|No| G

    style D fill:#ff6b6b
    style J fill:#4ecdc4
    style G fill:#45b7d1
    style H fill:#96ceb4
```
Production Considerations
Hardware Sizing Guidelines
CPU Requirements:
- Standalone/Replication: 2-4 cores
- Sentinel: 1-2 cores per sentinel
- Cluster: 4-8 cores per node
Memory Guidelines:
Total RAM = (Dataset Size × 1.5) + OS overhead

For example, a 20GB dataset calls for roughly 30GB plus OS overhead; the 1.5 factor leaves headroom for copy-on-write spikes during BGSAVE and for fragmentation.
Network Considerations:
- Replication: 1Gbps minimum for large datasets
- Cluster: Low latency (<1ms) between nodes
- Client connections: Plan for connection pooling
Security Best Practices
A hardening sketch; the passwords, interfaces, and certificate paths are placeholders:
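```conf
# Production security configuration (placeholder values)
bind 10.0.1.5                        # internal interface only; never 0.0.0.0
protected-mode yes
requirepass <long-random-password>
masterauth <long-random-password>
rename-command FLUSHALL ""           # disable dangerous commands outright
rename-command CONFIG "cfg-<random>" # or hide them behind random names
tls-port 6380                        # TLS for client connections (Redis 6+)
tls-cert-file /etc/redis/tls/redis.crt
tls-key-file /etc/redis/tls/redis.key
tls-ca-cert-file /etc/redis/tls/ca.crt
```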
Backup and Recovery Strategy
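A minimal backup sketch: trigger a snapshot, wait for it to finish, then archive the RDB. Paths are assumptions, and per the earlier guidance, prefer running this against a slave:

```bash
#!/bin/bash
BACKUP_DIR=/var/backups/redis   # assumed destination
RDB=/var/lib/redis/dump.rdb     # assumed 'dir' + 'dbfilename'

before=$(redis-cli LASTSAVE)
redis-cli BGSAVE
# LASTSAVE advances once the background save completes
while [ "$(redis-cli LASTSAVE)" = "$before" ]; do sleep 1; done

cp "$RDB" "$BACKUP_DIR/dump-$(date +%Y%m%d-%H%M%S).rdb"
```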
Monitoring and Operations
Key Performance Metrics
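These `INFO` fields are the usual sources for the alert thresholds below (a redis-cli sketch):

```bash
# Memory: usage and fragmentation
redis-cli INFO memory | grep -E "used_memory_human|mem_fragmentation_ratio"

# Throughput and hit-ratio inputs
redis-cli INFO stats | grep -E "instantaneous_ops_per_sec|keyspace_hits|keyspace_misses|evicted_keys"

# Clients and replication health
redis-cli INFO clients | grep connected_clients
redis-cli INFO replication | grep -E "^role|connected_slaves|master_repl_offset"
```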
Alerting Thresholds
| Metric | Warning | Critical |
|---|---|---|
| Memory Usage | >80% | >90% |
| Hit Ratio | <90% | <80% |
| Connected Clients | >80% of max | >95% of max |
| Replication Lag | >10s | >30s |
| Cluster State | degraded | fail |
Troubleshooting Common Issues
Memory Fragmentation:
A quick check, plus background defragmentation on Redis 4.0+:
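```bash
# Check fragmentation ratio (roughly 1.0-1.5 is healthy)
redis-cli INFO memory | grep mem_fragmentation_ratio

# Redis 4.0+: let Redis defragment in the background
redis-cli CONFIG SET activedefrag yes

# Human-readable diagnosis
redis-cli MEMORY DOCTOR
```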
Slow Queries:
A sketch for enabling and reading the slow log; the threshold values are illustrative:
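```bash
# Log commands slower than 10ms (threshold is in microseconds)
redis-cli CONFIG SET slowlog-log-slower-than 10000
redis-cli CONFIG SET slowlog-max-len 128

# Inspect the most recent slow entries, then clear
redis-cli SLOWLOG GET 10
redis-cli SLOWLOG RESET
```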
🎯 Interview Question: "How do you handle Redis memory pressure in production?"
Comprehensive Answer:
- Immediate actions: Check `maxmemory-policy`, verify there are no memory leaks
- Short-term: Scale vertically, optimize data structures, enable compression
- Long-term: Implement data archiving, consider clustering, optimize application usage patterns
- Monitoring: Set up alerts for memory usage, track key expiration patterns
Conclusion
Choosing the right Redis deployment mode depends on your specific requirements for availability, scalability, and operational complexity. Start simple with standalone or replication for smaller applications, progress to Sentinel for high availability needs, and adopt Cluster for large-scale, horizontally distributed systems.
Final Interview Insight: The key to Redis success in production is not just choosing the right deployment mode, but also implementing proper monitoring, backup strategies, and operational procedures. Always plan for failure scenarios and test your disaster recovery procedures regularly.
Remember: "The best Redis deployment is the simplest one that meets your requirements."