PostgreSQL
Concurrency

Getting Started with Multiversion Concurrency Control (MVCC) in PostgreSQL

Author: Leslie S. Gyamfi
Length: 8 MINS
Type: Guide
Published: 2023-03-10
Intro
As the amount of data in databases continues to grow, so does the need for efficient and effective concurrency control techniques. There are different concurrency control techniques including Timestamp-based concurrency control, Lock-based concurrency control, and Multiversion Concurrency Control (MVCC.) MVCC is a widely used approach, supported by databases like PostgreSQL, MySQL with InnoDB, Oracle, Firebird, Microsoft SQL (optional, disabled by default), and IBM DB2 (partial.) In this blog, we tell you everything you need to know about Multiversion Concurrency Control (MVCC), or Multiversion Technique in PostgreSQL and how it works.
Tools used in the tutorial
Tool
Description
DbVisualizer
Top rated database management tool and sql client
PostgreSQL
The PostgreSQL database

Introduction to MVCC in PostgreSQL

MVCC is one of the magic pieces that makes Postgres and other relational databases what they are. It is a key aspect of PostgreSQL's framework and is used to implement transaction partitioning and concurrency control. As such, it is important for developers and database administrators to have a strong understanding of how MVCC works and how it can impact database performance and scalability.

A database user has the option of reading from or writing to a database, or doing both. However, a problem while creating (updating) databases is that our databases lock the data causing concurrency issues and making it challenging for readers to view the updated data. The readers might only see a partially updated version of the data or a version with errors. These concurrent problems could result in deadlocks and unsuccessful transactions. PostgreSQL uses Multi-Version Concurrency Control to control access to data in a database and ensures consistency while allowing multiple transactions to happen simultaneously in the database. In simple terms, MVCC creates several versions of a single database record, enabling various transactions to access different versions of one database record without conflicting with one another. This allows several transactions to run simultaneously, and eliminates the need for locks or blocking operations to enhance the database's performance and responsiveness.

The Behavior of MVCC

The most important aspect in Multiversion Concurrency Control in terms of user-visible behavior is that “readers never block writers” and “writers never block readers.” This behavior is primarily controlled by snapshots which determine what a user can see in a database. A snapshot in MVCC is a virtual read-only, point-in-time copy of the data recorded at the start of each SQL statement in a READ COMMITTED transaction isolation mode, and at the transaction start in a SERIALIZABLE translation isolation mode. MVCC provides a consistent view of the data to each transaction, even as other transactions are modifying the data. Each transaction has its own snapshot, which represents the state of the database at the beginning of the transaction. The snapshot comprises all information about all the table rows (tuples) in the database and controls the table rows that are visible to the transaction at the time the snapshot is taken.

When a new snapshot is taken, the following information is gathered:

  • the highest-numbered committed transaction
  • the transaction numbers currently executing

Using this information from a snapshot, Postgres can determine if a transaction’s actions should be visible to an executing statement.

How Does PostgreSQL Handle Concurrency?

MVCC allows multiple transactions to read and write to the database at the same time without interfering with each other while each transaction operates on a "snapshot" of the database which is a consistent view of the data at a specific time.

For an example illustrating how MVCC works in PostgreSQL, let’s assume we have a database and a table called Inventory within it - the table has the following columns:

  • Product ID (unique key)
  • Product Description
  • Amount(USD)
  • Customer Name
Product ID Product Description Amount (USD) Customer Name
100121 Apple Headphones 500.00 Raymond Ryan
100122 T-shirt 102.00 Raymond Ryan
100122 C-Type USB Cable 68.00 Harry Abubakar
100124 Pair of Spanner 121.00 Steven Snipes

Let’s say that two of the database transactions (Transaction 1 and Transaction 2) above are trying to update the amount of the product which has an ID of 100121 at the same time. The two transactions are taking place simultaneously so Transaction 1 begins.

If you are looking for an easy and powerful SQL client and database manager, then you've got to try DbVisualizer. It connects to nearly any database.

Creating a Snapshot of the Database Table

When Transaction 1 (T1) starts, it creates a snapshot of the "Inventory" table as shown below:

Product ID Product Description Amount (USD) Customer Name
100121 Apple Headphones 500.00 Raymond Ryan
100122 T-shirt 102.00 Raymond Ryan
100122 C-Type USB Cable 68.00 Harry Abubakar
100124 Pair of Spanner 121.00 Steven Snipes

Concurrent Transactions Creating Their Own Snapshots

Now, when Transaction 2 begins, it also creates its own snapshot of the database table as shown below:

Product ID Product Description Amount (USD) Customer Name
100121 Apple Headphones 500.00 Raymond Ryan
100122 T-shirt 102.00 Raymond Ryan
100122 C-Type USB Cable 68.00 Harry Abubakar
100124 Pair of Spanner 121.00 Steven Snipes

Utilizing Multiversion Concurrency Control

If the Transaction 1 tries to update the amount of product with an ID of 100121 to $550.00 and Transaction 2 tries to update the amount of the same product to $580.00 at the same time, PostgreSQL will utilize the technique of Multiversion Concurrency Control to create a new version of the record for each transaction as shown below:

Product ID Product Description Amount (USD) Customer Name
100121 Apple Headphones 550.00 Raymond Ryan
100121 Apple Headphones 500.00 Raymond Ryan
100122 T-shirt 102.00 Raymond Ryan
100122 C-Type USB Cable 68.00 Harry Abubakar
100124 Pair of Spanner 121.00 Steven Snipes
Product ID Product Description Amount (USD) Customer Name
100121 Apple Headphones 580.00 Raymond Ryan
100121 Apple Headphones 500.00 Raymond Ryan
100122 T-shirt 102.00 Raymond Ryan
100122 C-Type USB Cable 68.00 Harry Abubakar
100124 Pair of Spanner 121.00 Steven Snipes

Execution and Commit of Transactions

After T1 executes its updates, the new version of the database record with an amount of $550.00 becomes the current version.

Product ID Product Description Amount (USD) Customer Name
100121 Apple Headphones 550.00 Raymond Ryan
100122 T-shirt 102.00 Raymond Ryan
100122 C-Type USB Cable 68.00 Harry Abubakar
100124 Pair of Spanner 121.00 Steven Snipes

Now, when T2 tries to query the database for the row to commit changes, it sees the old value of 500.00 for the amount of product with an ID of 100121. This is because T2's snapshot of the data was taken at the time it started, which was before T1 committed its changes. PostgreSQL will recognize that the record it is trying to update has already been modified by T1 and hence cannot update it. Here, T2's transaction will create a serialization error, indicating that it conflicted with T1's transaction because T2’s snapshot is outdated.

Refreshing the Snapshot

T2 will have to refresh its snapshot by starting a new transaction and taking a new snapshot of the database which will include the row of the product with the ID of 100121 and a price of 550.00.

Creating a New Version of the Row with a New Transaction ID

T2 will now modify the row of the product with an ID of 100121 by changing the amount of the product from 550.00 to 580.00 and then commit its changes. PostgreSQL will then create a new version of the row with the new product amount value and an internal transaction ID of T2.

PostgreSQL MVCC Internal Process Flow

MVCC maintains multiple versions of each row in the database, so that different transactions can see different versions of the same data without blocking each other. Here, we'll explore the internal process flow of MVCC in PostgreSQL.

Suppose we have a table named inventory with columns product id, product description, amount and customer name:

1
CREATE TABLE inventory (
2
    product_id INT,
3
    product_description VARCHAR(255),
4
    amount DECIMAL(10,2),
5
    customer_name VARCHAR(255)
6
);

Here's how the MVCC behavior in PostgreSQL works for transactions T1 and T2:

Transaction T1

1
BEGIN TRANSACTION;
2
3
-- Take a snapshot of the database
4
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
5
6
-- Read the current version of row with ID 100121
7
SELECT amount FROM inventory WHERE product_id = 100121;
8
9
-- Modify the row with ID 100121
10
UPDATE inventory SET amount = 550 WHERE product_id = 100121;
11
12
-- Commit the transaction
13
COMMIT;
  • Transaction T1 begins by taking a snapshot of the database using the SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; statement. This tells PostgreSQL to use MVCC to ensure that T1 sees a consistent view of the data at the time it started.
  • T1 then reads the current version of the row with ID 100121 using the SELECT amount FROM inventory WHERE id = 100121; statement. T1 sees the current amount value of the product with ID 100121 at this point.
  • T1 now modifies the row with ID 100121 using the UPDATE inventory SET amount = 550 WHERE id = 100121; statement. This creates a new version of the row with a transaction ID of T1 and a new amount value of 550.
  • T1 then commits the transaction using the COMMIT; statement.
1
BEGIN TRANSACTION;
2
3
-- Take a snapshot of the database
4
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
5
6
-- Read the current version of row with ID 100121
7
SELECT amount FROM inventory WHERE product_id = 100121;
8
9
-- Modify the row with ID 100121
10
UPDATE inventory SET amount = 580 WHERE product_id = 100121;
11
12
-- Commit the transaction
13
COMMIT;
  • Transaction T2 begins by taking a snapshot of the database using the SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; statement. This tells PostgreSQL to use MVCC to ensure that T2 sees a consistent view of the data at the time it started.
  • T2 then reads the current version of the row with ID 100121 using the SELECT amount FROM inventory WHERE id = 100121; statement. However, here, T2 sees the version of the row that was current when T1 began, which has an amount value of 500.
  • T2 then modifies the row with ID 100121 using the UPDATE inventory SET amount = 580 WHERE id = 100121; statement. This creates a new version of the row with a transaction ID of T2 and a new amount value of 580.
  • T2 commits the transaction using the COMMIT; statement.

As a result, the inventory table will now have two versions of the row with ID 100121: one with an amount of 550 and a transaction ID of T1, and another with an amount of 580 and a transaction ID of T2. Each transaction sees a consistent view of the data at the time it started, and PostgreSQL ensures that transactions do not interfere with each other, even during modification of the same data.

Pros & Cons Of MVCC

Pros

  • MVCC allows multiple transactions to access the same database objects concurrently, without locking them. This can result in high concurrency and scalability and hence improve the performance of database applications in a multiuser environment.
  • It helps in the prediction of concurrent query behavior.
  • MVCC is more often faster than other concurrency control techniques, especially in circumstances with high read-to-write ratios.

Cons

  • When there are numerous transactions under processing, MVCC will require more memory as compared to other concurrency control techniques.
  • There is a risk of transaction ID wraparound. In PostgreSQL, MVCC uses transaction IDs to track committed updates to data. These transaction IDs are like counters that increase by one every time a new transaction is started. However, the IDs have a maximum value they can reach. If too many transactions happen and the ID counters reach their maximum value, they will reset to zero which can lead to errors.
DbVisualizer SQL Client.

Conclusion

In this article, you have learnt about the general idea of what MVCC is and how it functions in PostgreSQL by using snapshots to manage the visibility of transactions to avoid the need for locking. MVCC permits higher levels of concurrency, which reduces contention in a database application and improves performance. By understanding the principles and best practices of MVCC, developers can design more efficient and scalable applications that take full advantage of the power of PostgreSQL.

Make sure to follow the DbVisualizer Table blog to learn more about database development, and until next time.

About the author
Leslie S. Gyamfi.
Leslie S. Gyamfi
Leslie Gyamfi is a mobile/web app developer with a passion for creating innovative solutions. He is dedicated to delivering high-quality products and technical articles. You can connect with him on LinkedIn
SIGN UP TO RECEIVE THE TABLE'S ROUNDUP
More from the table
TITLE
AUTHOR
Gayatri Sachdeva
TAGS
DBA'S
DronaHQ
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
TITLE
AUTHOR
Bonnie
TAGS
Generation
TITLE
AUTHOR
Bonnie
TAGS
Joins
TITLE
AUTHOR
Igor Bobriakov
TAGS
MySQL
Security
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
Operators
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
NULL
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
Reserved words
TITLE
AUTHOR
Igor Bobriakov
TAGS
Oracle
TITLE
AUTHOR
Antonello Zanini
TAGS
DELETE
TITLE
AUTHOR
Antonello Zanini
TAGS
PostgreSQL
TITLE
AUTHOR
Antonello Zanini
TAGS
PostgreSQL
MySQL
TITLE
AUTHOR
Antonello Zanini
TAGS
PostgreSQL
JSON
TITLE
AUTHOR
Antonello Zanini
TAGS
PostgreSQL
Null
TITLE
AUTHOR
Antonello Zanini
TAGS
Driver
JDBC
ODBC
TITLE
AUTHOR
Antonello Zanini
TAGS
PostgreSQL
Connection
TITLE
AUTHOR
Lukas Vileikis
TAGS
Deduplication
TITLE
AUTHOR
Ochuko Onojakpor
TAGS
SQL
Functions
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
Math
TITLE
AUTHOR
Ochuko Onojakpor
TAGS
Docker
MySQL
TITLE
AUTHOR
Ochuko Onojakpor
TAGS
Views
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
Index
TITLE
AUTHOR
Bonnie
TAGS
BigQuery
TITLE
AUTHOR
Leslie S. Gyamfi
TAGS
Join
TITLE
AUTHOR
Antonello Zanini
TAGS
PostgreSQL
TITLE
AUTHOR
Leslie S. Gyamfi
TAGS
PostgreSQL
TITLE
AUTHOR
Ochuko Onojakpor
TAGS
PrestoDb
TITLE
AUTHOR
Ochuko Onojakpor
TAGS
Ansible
Automation
TITLE
AUTHOR
Antonello Zanini
TAGS
PostgreSQL
TITLE
AUTHOR
Leslie S. Gyamfi
TAGS
PostgreSQL
NoSQL
JSON
TITLE
AUTHOR
Igor Bobriakov
TAGS
Oracle
Data types
TITLE
AUTHOR
TheTable
TAGS
ElasticSearch
TITLE
AUTHOR
Ochuko Onojakpor
TAGS
Security
TITLE
AUTHOR
Lukas Vileikis
TAGS
Language
Design
TITLE
AUTHOR
Lukas Vileikis
TAGS
CRUD
DELETE
TITLE
AUTHOR
Lukas Vileikis
TAGS
CRUD
UPDATE
TITLE
AUTHOR
Lukas Vileikis
TAGS
CRUD
SELECT
TITLE
AUTHOR
Lukas Vileikis
TAGS
CRUD
INSERT
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
TITLE
AUTHOR
Leslie S. Gyamfi
TAGS
PostgreSQL
TITLE
AUTHOR
TheTable
TAGS
Bug
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
Daemon
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
Partitions
TITLE
AUTHOR
Leslie S. Gyamfi
TAGS
Migration
MySQL
PostgreSQL
TITLE
AUTHOR
Antonello Zanini
TAGS
PostgreSQL
hstore
TITLE
AUTHOR
TheTable
TAGS
SQL
TITLE
AUTHOR
Igor Bobriakov
TAGS
SQL server
Security
TITLE
AUTHOR
Ochuko Onojakpor
TAGS
Visualize
TITLE
AUTHOR
TheTable
TAGS
MySQL
TITLE
AUTHOR
Lukas Vileikis
TAGS
SQL
Security
TITLE
AUTHOR
Antonello Zanini
TAGS
PostgreSQL
TITLE
AUTHOR
TheTable
TAGS
PostgreSQL
Docker
TITLE
AUTHOR
Antonello Zanini
TAGS
PostgreSQL
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
Connection
TITLE
AUTHOR
Lukas Vileikis
TAGS
Performance
TITLE
AUTHOR
Lukas Vileikis
TAGS
Security
TITLE
AUTHOR
Antonello Zanini
TAGS
Columns
TITLE
AUTHOR
Ochuko Onojakpor
TAGS
Performance
TITLE
AUTHOR
Ochuko Onojakpor
TAGS
SQL
TITLE
AUTHOR
Lukas Vileikis
TAGS
Performance
Indexes
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
Subquery
TITLE
AUTHOR
Lukas Vileikis
TAGS
Performance
TITLE
AUTHOR
Lukas Vileikis
TAGS
ACID
TITLE
AUTHOR
Lukas Vileikis
TAGS
ALTER TABLE
TITLE
AUTHOR
TheTable
TAGS
MySQL
Ports
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
Security
TITLE
AUTHOR
Lukas Vileikis
TAGS
ACID
MySQL
Security
TITLE
AUTHOR
Antonello Zanini
TAGS
BLOB
TITLE
AUTHOR
TheTable
TAGS
Foreign Key
PostgreSQL
TITLE
AUTHOR
Leslie S. Gyamfi
TAGS
PostgreSQL
Concurrency
TITLE
AUTHOR
Lukas Vileikis
TAGS
Security
Encryption
TITLE
AUTHOR
Lukas Vileikis
TAGS
Security
TITLE
AUTHOR
Bonnie
TAGS
Security
PostgreSQL
TITLE
AUTHOR
Antonello Zanini
TAGS
Subquery
TITLE
AUTHOR
Antonello Zanini
TAGS
Transactions
TITLE
AUTHOR
Ochuko Onojakpor
TAGS
Data structures
TITLE
AUTHOR
Antonello Zanini
TAGS
MySQL
TITLE
AUTHOR
Lukas Vileikis
TAGS
SSH
TITLE
AUTHOR
Antonello Zanini
TAGS
Stored procedure
MySQL
TITLE
AUTHOR
Antonello Zanini
TAGS
Triggers
TITLE
AUTHOR
Igor Bobriakov
TAGS
Microsoft SQL Server
Optimization
TITLE
AUTHOR
Bonnie
TAGS
PostreSQL
TITLE
AUTHOR
Ochuko Onojakpor
TAGS
Reusable queries
TITLE
AUTHOR
Antonello Zanini
TAGS
BIG Data
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
Security
TITLE
AUTHOR
TheTable
TAGS
Beginner
SQL
TITLE
AUTHOR
Ochuko Onojakpor
TAGS
CRUD
SQL Transactions
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
Security
TITLE
AUTHOR
Antonello Zanini
TAGS
PostgreSQL
JSON
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
InnoDB
ibdata1
TITLE
AUTHOR
Lukas Vileikis
TAGS
MySQL
TITLE
AUTHOR
Scott A. Adams
TAGS
Filter
TITLE
AUTHOR
Scott A. Adams
TAGS
SQLite
TITLE
AUTHOR
Scott A. Adams
TAGS
Excel
Export
TITLE
AUTHOR
Scott A. Adams
TAGS
ERD
Join

The content provided on dbvis.com/thetable, including but not limited to code and examples, is intended for educational and informational purposes only. We do not make any warranties or representations of any kind. Read more here.