
Concurrency Control Mechanism in Databases
Concurrency control is an important characteristic of database management systems. It ensures that multiple users can access and modify the database simultaneously without compromising data integrity.
In this article, we’ll answer the question how do databases handle concurrency?
Key Concepts
Concurrency
Concurrency is the ability of a database to handle multiple operations at the same time. This is essential in environments where multiple users or applications need to read and write data concurrently.
Transactions
A transaction is a sequence of database operations that are executed as a single unit. Transactions must be ACID compliant:
- Atomicity: All operations within a transaction are completed successfully or none are.
- Consistency: The database remains in a consistent state before and after the transaction.
- Isolation: Transactions are executed independently of each other.
- Durability: Once a transaction is committed, it remains so, even in the event of a system failure.
Concurrency Control Mechanisms
Databases use several mechanisms to manage concurrency. The are the common ones:
Locking
Locks are used to control access to data. There are two main types of locks:
- Shared Locks: These locks allow multiple transactions to read a resource but not modify it.
- Exclusive Locks: These locks allow a single transaction to both read and modify a resource.
Properties of Locks
Lock Type | Description |
---|---|
Read Lock | Multiple transactions can read but not write. |
Write Lock | Only one transaction can write; others can’t read or write. |
Lock Granularity
Locks can be applied at different levels:
- Row-level locking: Locks a single row in a table.
- Page-level locking: Locks a data page containing multiple rows.
- Table-level locking: Locks an entire table.
Advantages of lock granularity:
- Row-level locks provide high concurrency but can be costly.
- Table-level locks are less costly but reduce concurrency.

Optimistic Concurrency Control
Optimistic Concurrency Control (OCC) is a concurrency control method used in databases to manage simultaneous transactions.
Unlike pessimistic concurrency control, which locks resources preemptively to avoid conflicts, OCC operates on the assumption that conflicts are rare and handles them when they occur.
This method is particularly beneficial in environments with low contention, where the likelihood of concurrent transactions interfering with each other is minimal.
Key Concepts
- Assumption of Low Contention:
- OCC is based on the premise that most database transactions do not conflict with each other. Therefore, it allows transactions to proceed without locking resources upfront.
- Phases of OCC: OCC typically involves three main phases:
- Read Phase: The transaction reads the required data and performs computations and make changes in a local (temporary) workspace.
- Validation Phase: Before committing, the transaction checks whether the data it read has been modified by other transactions since the read phase began.
- Write Phase: If the validation succeeds, the transaction writes its changes to the database. If validation fails, the transaction is rolled back and may be retried.
Detailed Steps of OCC
- Begin Transaction:
- A transaction starts and gets a unique timestamp to keep track of its operations.
- Read Phase:
- The transaction reads the necessary data items from the database. Instead of locking these items, it copies them to a local workspace where it performs the required operations.
- During this phase, no changes are made to the actual database, and no locks are held on the data items.
- Validation Phase:
- When the transaction is ready to commit, it enters the validation phase. The system checks if any of the data items read by the transaction have been modified by other transactions since the read phase began.
- The validation process typically involves checking timestamps or version numbers associated with the data items.
- Write Phase:
- If the validation phase confirms that there are no conflicts, the transaction proceeds to the write phase, where it applies its changes to the database.
- If the validation detects conflicts (i.e., other transactions have modified the data items), the transaction is aborted and rolled back. The system may retry the transaction or notify the user of the conflict.
Advantages and Disadvantages
Advantages
- High Performance: OCC allows high levels of concurrency because transactions do not lock resources during the read phase. This reduces waiting time and potential bottlenecks.
- Scalability: Since OCC minimizes locking, it scales well in environments with a large number of read operations and relatively fewer write operations.
- Simplicity: Implementing OCC can be simpler than managing complex locking mechanisms, especially in distributed systems.
Disadvantages
- Conflict Overhead: In high contention environments where many transactions compete for the same data, the likelihood of conflicts increases. This can lead to frequent rollbacks, reducing overall performance.
- Validation Cost: The validation phase can be computationally expensive, especially if the system needs to check many data items for conflicts.
- Retry Complexity: Handling transaction retries after conflicts can be complex, particularly in systems with many interdependent transactions.
Example Scenario
Consider a ticket booking system where multiple users are trying to book seats for a concert simultaneously:
- User A starts a transaction to book a seat. During the read phase, User A sees that Seat 10 is available.
- User B also starts a transaction to book the same seat shortly after User A. User B also sees that Seat 10 is available.
- User A proceeds to the validation phase and finds that no other transactions have modified the availability of Seat 10. User A’s transaction commits, and Seat 10 is marked as booked.
- User B reaches the validation phase and detects that Seat 10 has been booked by User A’s transaction. User B’s transaction is rolled back, and User B is informed that the seat is no longer available.
Conflict Detection Strategies
- Timestamp Validation:
- Each transaction has a timestamp. During validation, the system checks if any data items read by the transaction have timestamps indicating modification by other transactions.
- Version Numbers:
- Data items are associated with version numbers. When a transaction reads a data item, it notes the version number. During validation, the transaction checks if the version number has changed.
- Read-Write Sets:
- The system maintains sets of data items read and written by each transaction. During validation, it checks for overlaps between the write sets of committed transactions and the read set of the current transaction.
Pessimistic Concurrency Control
Pessimistic concurrency control assumes that conflicts are likely. Transactions lock resources before accessing them and ensure that no other transactions can modify them until the lock is released.
Steps in Pessimistic Concurrency Control
- Begin Transaction: The transaction starts, and the system prepares to manage access to the necessary resources.
- Lock Resource:
- Write Operation: If the transaction intends to modify a resource, it places an exclusive lock on that resource.
- Read Operation: If the transaction only needs to read a resource, it may place a shared lock if other transactions only need to read as well.
- Perform Operations: The transaction reads or writes data as needed. The exclusive lock ensures that no other transaction can read or modify the data during this time.
- Validation: As the transaction proceeds, the system checks to ensure that the operations conform to the rules and that no conflicts have arisen.
- Commit/Rollback:
- Commit: If all operations are successful and valid, the transaction commits, and the changes are made permanent.
- Rollback: If there is an issue (e.g., a conflict or error), the transaction rolls back, and all changes made during the transaction are undone.
- Release Locks: Once the transaction is complete (committed or rolled back), all locks are released, making the resources available for other transactions.
Advantages and Disadvantages
Advantages
- Data Integrity: By locking resources, pessimistic concurrency control ensures data consistency and integrity, preventing conflicts and maintaining a reliable state.
- Simplicity: The approach is straightforward to implement and understand, making it easier to manage in complex systems.
Disadvantages
- Reduced Concurrency: Locking resources can significantly reduce the level of concurrency, as transactions must wait for locks to be released before proceeding. This can lead to bottlenecks, especially in high-transaction environments.
- Potential for Deadlocks: When multiple transactions lock resources, there’s a risk of deadlocks, where two or more transactions wait indefinitely for each other to release locks. Deadlock detection and resolution mechanisms are required to handle this issue.
- Performance Overhead: The constant locking and unlocking of resources introduce performance overhead, potentially slowing down the overall system.
Deadlock Handling
Deadlocks are a critical issue in pessimistic concurrency control. Here’s how they are typically managed:
- Deadlock Detection: The system periodically checks for cycles in the wait-for graph (a representation of which transactions are waiting for which resources).
- Deadlock Resolution:
- Timeouts: Transactions that wait for too long are aborted.
- Deadlock Prevention Algorithms: Techniques like Wait-Die (older transactions wait, younger transactions are aborted) or Wound-Wait (older transactions abort younger ones) are used to prevent deadlocks from occurring.
Example Scenario
Consider a banking system where two transactions are trying to update the balance of the same account:
- Transaction A wants to transfer money from Account X to Account Y.
- Transaction B wants to transfer money from Account X to Account Z at the same time.
- Transaction A acquires an exclusive lock on Account X to read and update its balance.
- Transaction B attempts to acquire an exclusive lock on Account X but is forced to wait because Transaction A holds the lock.
In this case, Transaction B cannot proceed until Transaction A completes its operation and releases the lock on Account X, ensuring that both transactions do not interfere with each other and that Account X’s balance remains consistent and accurate.
By understanding and implementing pessimistic concurrency control, database systems can maintain data integrity and consistency in environments where the likelihood of transaction conflicts is high, despite the potential trade-offs in performance and concurrency.
Multi-version Concurrency Control (MVCC)
Multi-version Concurrency Control (MVCC) is a method used by database management systems to handle concurrent transactions without the need for strict locking mechanisms. It allows multiple versions of a data item to exist simultaneously, providing high concurrency and minimizing conflicts.
Key Concepts
- Versioning:
- Every transaction sees a consistent snapshot of the database at a particular point in time. Each write operation creates a new version of a data item rather than overwriting the existing one.
- Versions are timestamped or associated with transaction identifiers to keep track of their creation times.
- Snapshots:
- Transactions operate on snapshots of the database, which represent the state of the database at the start of the transaction. This ensures that read operations do not block write operations and vice versa.
- Visibility Rules:
- A version of a data item is visible to a transaction if it was created before the transaction began and not modified by any uncommitted transactions.
How MVCC Works
- Begin Transaction:
- When a transaction starts, it gets a unique timestamp or transaction ID. It sees a snapshot of the database as it was at that point in time.
- Read Operations:
- When a transaction reads a data item, it reads the version that was the most recent at the time the transaction started.
- This versioning ensures that read operations do not block write operations, providing a consistent view of the data.
- Write Operations:
- When a transaction writes to a data item, it creates a new version with the current transaction’s timestamp.
- The new version is not visible to other transactions until the transaction commits.
- Commit/Rollback:
- Commit: When a transaction commits, its changes become visible to other transactions. The new versions of the data items it created are now part of the latest snapshot.
- Rollback: If a transaction rolls back, the versions of the data items it created are discarded, and the database remains unchanged as per the last committed state.
Advantages and Disadvantages
Advantages
- High Concurrency: MVCC allows multiple transactions to read and write concurrently without locking, which significantly improves performance in high-transaction environments.
- Reduced Lock Contention: Since read operations do not block write operations, and vice versa, there is less contention and fewer bottlenecks.
- Consistent Snapshots: Transactions see a consistent snapshot of the database, ensuring that reads are stable and predictable.
Disadvantages
- Increased Storage Requirements: Maintaining multiple versions of data items can consume more storage space. This requires efficient garbage collection mechanisms to manage obsolete versions.
- Complexity in Implementation: MVCC is more complex to implement compared to traditional locking mechanisms, requiring careful management of versions and timestamps.
- Potential for Version Skew: If not managed properly, the existence of multiple versions can lead to situations where different transactions see different versions of the same data item, which can complicate application logic.
Example Scenario
Consider an online shopping platform where multiple users are browsing and purchasing items simultaneously:
- User A starts a transaction to purchase an item. The transaction reads the inventory count, which shows 10 items available.
- User B starts a transaction at the same time to also purchase the item. User B’s transaction reads the same inventory count of 10.
- User A commits the purchase and reduces the inventory count to 9. A new version of the inventory count is created with a timestamp reflecting this change.
- User B continues with the purchase based on the snapshot seen at the start of the transaction, which still shows an inventory count of 10.
- User B commits the purchase and reduces the inventory count from 9 to 8. Another new version of the inventory count is created.
Throughout this process, both users see a consistent view of the inventory count at the time they started their transactions, and their operations do not block each other, thanks to MVCC.
Version Management
To efficiently manage the multiple versions of data items, MVCC relies on a garbage collection mechanism that periodically cleans up obsolete versions. This process involves:
- Tracking Versions: Keeping track of which versions are still visible to active transactions and which can be safely discarded.
- Marking Obsolete Versions: Identifying versions that are no longer needed once all transactions that could see them have finished.
- Cleaning Up: Physically removing the obsolete versions from the database to free up storage space.
Deadlocks
Deadlocks occur when two or more transactions are waiting for each other to release locks, resulting in a standstill.
Deadlock Detection and Resolution
Detection:
- The database periodically checks for cycles in the wait-for graph (a representation of transactions waiting for locks).
Resolution:
- Timeouts: Abort transactions that have been waiting too long.
- Deadlock Prevention: Using algorithms like Wait-Die or Wound-Wait to prevent deadlocks before they occur.
Handling concurrency in databases is complex but essential for data integrity and performance in multi-user environments.
By using techniques such as locking, optimistic and pessimistic concurrency control, and MVCC, databases can efficiently manage simultaneous operations.
Understanding these mechanisms helps in designing robust and efficient database applications.