MVCC is a concurrency control method that allows multiple transactions to access the same data simultaneously without blocking each other. Instead of using locks for reads, MVCC maintains multiple versions of data and shows each transaction a consistent snapshot based on when the transaction started.
Why MVCC Matters
Traditional Locking Problems
Without MVCC, databases face the readers-writers problem:
Readers block writers: Transactions reading data prevent others from modifying it
Writers block readers: Transactions modifying data prevent others from reading it
Performance bottleneck: High contention leads to poor concurrency
MVCC Benefits
Non-blocking reads: Readers never block writers and vice versa
Consistent snapshots: Each transaction sees a consistent view of data
Higher concurrency: Multiple transactions can work simultaneously
ACID compliance: Maintains isolation without sacrificing performance
Core MVCC Components
Hidden Columns in InnoDB
Every InnoDB table row contains hidden system columns:
1 2 3
| User Data | DB_TRX_ID | DB_ROLL_PTR | DB_ROW_ID | |-----------|-----------|-------------|-----------| | name, age | 12345 | 0x8A2B... | 67890 |
DB_TRX_ID (Transaction ID)
Size: 6 bytes
Purpose: Identifies which transaction last modified this row
Behavior: Updated every time a row is inserted or updated
| Type: INSERT | Table ID | Primary Key Values | Transaction ID |
Purpose: Rolling back INSERT operations
Content: Only primary key needed (for deletion)
Cleanup: Purged immediately after transaction commits
UPDATE Undo Log
1
| Type: UPDATE | Table ID | Primary Key | Changed Columns | Old Values | roll_ptr |
Purpose: Rolling back UPDATE operations and MVCC reads
Content: Original values of modified columns
Cleanup: Purged when no active transaction needs this version
DELETE Undo Log
1
| Type: DELETE | Table ID | Complete Row Data | roll_ptr |
Purpose: Rolling back DELETE operations
Content: Entire row data
Behavior: Row is marked as deleted but not physically removed
Read View Mechanism
Read View Structure
A Read View is a snapshot of active transactions at a specific point in time:
1 2 3 4 5 6
structReadView { trx_id_t m_low_limit_id; // Highest TRX_ID + 1 at creation time trx_id_t m_up_limit_id; // Lowest active TRX_ID at creation time trx_list_t m_ids; // List of active transaction IDs trx_id_t m_creator_trx_id; // Transaction ID that created this view };
Read View Fields Explained
m_low_limit_id (High Water Mark)
Definition: Next transaction ID to be assigned
Rule: Any TRX_ID ≥ m_low_limit_id is invisible (not yet started)
m_up_limit_id (Low Water Mark)
Definition: Smallest active transaction ID when Read View was created
Rule: Any TRX_ID < m_up_limit_id is visible (committed before snapshot)
m_ids (Active Transaction List)
Definition: List of all active (uncommitted) transaction IDs
Rule: Any TRX_ID in this list is invisible (uncommitted)
m_creator_trx_id
Definition: ID of the transaction that created this Read View
Rule: Changes made by this transaction are always visible to itself
Visibility Algorithm
For each row version, MVCC determines visibility using this logic:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
defis_visible(row_trx_id, read_view): # Rule 1: Own changes are always visible if row_trx_id == read_view.m_creator_trx_id: returnTrue # Rule 2: Future transactions are invisible if row_trx_id >= read_view.m_low_limit_id: returnFalse # Rule 3: Very old transactions are visible if row_trx_id < read_view.m_up_limit_id: returnTrue # Rule 4: Check if transaction was active if row_trx_id in read_view.m_ids: returnFalse# Was active, so invisible else: returnTrue# Was committed, so visible
Read View Creation: New Read View for every SELECT statement
Behavior: Sees all changes committed before each individual statement
Result: Can see different data within the same transaction (non-repeatable reads)
1 2 3 4 5 6 7 8
-- Transaction A START TRANSACTION; SELECT age FROM users WHERE name ='Alice'; -- Returns 25
-- Transaction B commits: UPDATE users SET age = 26 WHERE name = 'Alice';
SELECT age FROM users WHERE name ='Alice'; -- Returns 26 (different result!) COMMIT;
REPEATABLE READ
Read View Creation: Single Read View at first SELECT statement
Behavior: Consistent snapshot throughout the entire transaction
Result: Same data for all reads within the transaction
1 2 3 4 5 6 7 8
-- Transaction A START TRANSACTION; SELECT age FROM users WHERE name ='Alice'; -- Returns 25, creates Read View
-- Transaction B commits: UPDATE users SET age = 26 WHERE name = 'Alice';
SELECT age FROM users WHERE name ='Alice'; -- Still returns 25 (consistent!) COMMIT;
MVCC Read Process (Step by Step)
When a SELECT Statement Executes:
Step 1: Create or Reuse Read View
1
SELECT name, age FROM users WHERE user_id =1;
READ COMMITTED: Create new Read View
REPEATABLE READ: Use existing Read View or create if first read
Step 2: Locate Current Row Version
Use index or table scan to find the row
Current row has latest TRX_ID and ROLL_PTR
Step 3: Apply Visibility Rules
Check if current version is visible using Read View
If visible, return this version
If not visible, follow the version chain
Step 4: Traverse Version Chain
1 2 3 4 5
Current Row (TRX_ID: 105) → Not visible ↓ (follow ROLL_PTR) Version in Undo (TRX_ID: 103) → Not visible ↓ (follow roll_ptr) Version in Undo (TRX_ID: 101) → Visible! Return this version
Step 5: Return Appropriate Version
Return the first visible version found
If no visible version exists, row doesn’t exist for this transaction
MVCC Write Operations
INSERT Operations
Create new row with current transaction’s TRX_ID
No undo log needed for MVCC (only for rollback)
Row immediately visible to the inserting transaction
Invisible to others until transaction commits
UPDATE Operations
Create undo log entry with original values
Update current row with new values and TRX_ID
Link to previous version via ROLL_PTR
Original version remains accessible via undo log
DELETE Operations
Mark row as deleted (set delete flag)
Create undo log entry with complete row data
Row remains physically present but marked deleted
Appears deleted to new transactions but still visible to older ones
Purge Process
Why Purge is Needed
Undo logs grow indefinitely without cleanup
Old versions become unnecessary when no transaction needs them
Storage space must be reclaimed
Purge Thread Operation
Identify purgeable versions: No active transaction needs them
Remove undo log entries: Free up undo tablespace
Physical row deletion: Remove rows marked for deletion
Index cleanup: Remove deleted entries from secondary indexes
Purge Lag Issues
When purge falls behind:
Undo tablespace growth: Disk space consumption increases
Version chain length: Longer chains slow down reads
Memory pressure: More versions kept in buffer pool
Performance Implications
MVCC Benefits
High concurrency: No read-write blocking
Consistent reads: Snapshot isolation without locks
Predictable performance: No lock contention delays
MVCC Costs
Storage overhead: Multiple versions consume space
Version traversal: Long chains increase read latency
Purge overhead: Background cleanup uses resources
Undo log I/O: Additional disk operations for version chains
Optimization Strategies
Monitor purge lag: Ensure purge keeps up with modifications
Tune undo tablespace: Size appropriately for workload
Minimize long transactions: Reduce version chain lengths
Index optimization: Reduce version traversal overhead
Common MVCC Scenarios
Phantom Reads Prevention
1 2 3 4 5 6 7 8 9 10 11
-- Transaction 1 (REPEATABLE READ) START TRANSACTION; SELECTCOUNT(*) FROM orders WHERE amount >1000; -- Returns 5
-- Transaction 2 inserts new row INSERT INTO orders (amount) VALUES (1500); COMMIT;
-- Transaction 1 continues SELECTCOUNT(*) FROM orders WHERE amount >1000; -- Still returns 5 COMMIT;
Consistent Backup
1 2 3 4 5
-- Long-running backup transaction START TRANSACTION WITH CONSISTENT SNAPSHOT; -- Takes hours to complete, but sees consistent point-in-time data mysqldump --single-transaction ... COMMIT;
-- Reader continues with original snapshot SELECT*FROM accounts WHERE account_id =1; -- Still sees original balance COMMIT;
This comprehensive understanding of MVCC explains how MySQL achieves high concurrency while maintaining data consistency, making it essential knowledge for database administrators and developers working with high-performance applications.