The Ultimate Guide to Generated Columns
SQL generated columns enable you to store automatically computed data in a table. That means saving time and avoiding performing boilerplate INSERT or UPDATE queries again and again. Generated columns represent a useful tool in several scenarios, and this is why all major RDBMS technologies support them.
Let’s now find out more about generated columns in SQL.
What is a SQL Generated Column?
A SQL generated column is a type of column that stores values calculated from an expression applied to data in other columns of the same table. The value of a generated column cannot be altered manually and is automatically updated whenever the data it depends on changes.
A generated column is similar to a standard column. The main difference is that its value is closely related to the other columns of the same table. So, a generated column operates within the context of its own table.
You can define generated columns only through CREATE TABLE or ALTER TABLE queries. The syntax to define a generated column varies depending on the specific DBMS. However, it is generally similar to the SQL syntax below:
table_name is the name of the table to add the SQL generated column to. The AS (expression) clause specifies the SQL expression that the DBMS will use to compute the column values. As mentioned earlier, expression can only involve columns of the current table. Thus, JOIN clauses are not allowed. Also, expression only supports immutable SQL functions, which are types of functions that always return the same result when called with the same input arguments.
An example of a query to add a generated column to a table in DbVisualizer
↓
↑ An example of a query to add a generated column to a table in DbVisualizer
As you can see, you can also specify a generated column as VIRTUAL or STORED. This represents the type of generated column. Typically, most RDBMSs create VIRTUAL generated columns when the type is omitted.
Let’s now learn the difference between the two generated column types.
Virtual Generated Columns vs. Stored Generated Columns
There are two types of generated columns in SQL: VIRTUAL and STORED. The same table can have generated columns of both types, but the behavior related to their value generation will be different.
Let’s understand what are the differences between these two types.
Virtual generated columns
A generated column marked as VIRTUAL does not occupy storage space. This is because its value is not stored in the database, but evaluated each time at the time query is run. So, the DBMS evaluates the expression to generate the value of a VIRTUAL generate column every time it is accessed, rather than storing the value directly in the database.
Pros
- They do not take up disk space.
- INSERT and UPDATE queries do not come with additional overhead, as the DBMS does not need to generate their values.
- They are created instantaneously because the DBMS only has to update the table metadata, without having to generate their values for each row.
Cons
- All SELECT queries involving them become slower due to the need to evaluate their value on-the-fly.
Stored generated columns
A generated column marked as STORED takes up storage space as if it was a regular column. The DBMS generates its value and stores it on disk every time a row is inserted or updated.
Pros
- They can be accessed as if they were regular columns, which means there is no overhead on SELECT queries involving them.
Cons
- They take up disk space.
- INSERT and UPDATE queries require additional overhead, since the DBMS has to generate their values.
- When they are added to a table, the DBMS has to calculate their value for each row and rebuild the entire table.
Benefits of Generated Columns
Adopting SQL generated columns can bring several benefits, but the following are considered the most important ones.
Improved performance query
Defining generated columns that contain valuable information in a useful format allows for more efficient data retrieval. Also, STORED generated columns eliminate the need to add complex calculation logic at runtime, improving application performance as a consequence.
Increased data consistency
The value of a generated column changes whenever its underlying data changes. Specifically, the DBMS automatically takes care of keeping the values of generated columns consistent with their expression definition. This increases the consistency of the data stored in the database, reducing the risk of inconsistencies caused by manual updates.
Simplified queries and data management
Instead of making your queries complex, you can spread the complexity over some generated columns. Then, you can use them in simple filter operations. Generated columns also simplify data management by making values that are typically aggregated, such as name and surname, easier to access.
If you are looking for an easy and powerful SQL client and database manager, then you've got to try DbVisualizer. It connects to nearly any database.
Challenges of Generated Columns
Generated columns are a powerful database tool, but also come with some downsides you should take into account. Let’s take a look at the three most important ones:
- Increased storage space: STORED generated columns increase the size of the database table as the DBMS has to store their computed value on disk. This can lead to higher storage costs.
- Performance overhead: Evaluating the expression to calculate the value of a generated column takes time and represents a performance overhead. This is especially true if the expression is complex or if the column has to be updated frequently.
- Update restrictions: Not being able to update the value of a generated column is good for date consistency, but it also represents a limitation when it comes to dealing with exceptions.
SQL Generated Column Example
To better understand how generated columns work and how to use them, let's look at an example. The queries below will be based on MySQL, but keep in mind that the example is easily extendable to any other DBMS technology.
Assume you have a users table containing the following data:
Exploring the users table in DbVisualizer
↓
↑ Exploring the users table in DbVisualizer
As you can see, the table contains the name and surname columns. As it generally happens, these two pieces of information will be shown together. So, it may be useful to define a generated fullName column with the query below:
Launching the ALTER TABLE query to add the fullName generated column in DbVisualizer
↓
↑ Launching the ALTER TABLE query to add the fullName generated column in DbVisualizer
Chances are that fullName will be used in many SELECT queries, so you should define it as STORED to avoid introducing overheads.
Let us now perform a SELECT on the users table:
Note the values in the fullName column
↓
↑ Note the values in the fullName column
Each row now also has a generated column fullName.
Besides being useful in visualization, the generated column fullName also makes it easier to search for users. Suppose you have a search input to filter users. You can now write a query to retrieve users based on the string typed in the input as below:
Use it as in the example below:
fullName in action.
↓
↑ fullName in action.
Filtering users by full name has never been easier!
Also, it may be useful to add a column that in addition to the full name also contains the user` points. Since this column is very specific and will only be used in special circumstances, such as generating a leaderboard, it can be defined as VIRTUAL as follows:
Launching the ALTER TABLE query to add the fullNamePoints generated column in DbVisualizer
↓
↑ Launching the ALTER TABLE query to add the fullNamePoints generated column in DbVisualizer
Note that the definition expression of a generated column can depend on another generated column.
Now, you can use this new table to easily get the ranking of the best players as follows:
Getting the ranking of users in DbVisualizer
↓
↑ Getting the ranking of users in DbVisualizer
Et voilà! You just saw generated columns in action in a real-world example!
Conclusion
In this article, you learned what SQL generated columns are and how to use them. Generated columns in SQL represent without a doubt a great tool that can serve several uses and bring many benefits. As shown here in a real-world example, adopting them is easy and effective. For this reason, being able to take advantage of generated columns become critical to making your queries easier, cleaner, and faster. This is especially true if you rely on an advanced database client with query optimization support and many other features, such as DbVisualizer. Try DbVisualizer for free!
FAQs About Generated Columns
What databases support generated columns?
Generated columns are supported by many popular relational databases, including MySQL, MariaDB, PostgreSQL, SQL Server, and Oracle. The syntax for defining generated columns changes from language to language, but the concept remains the same.
What is the difference between a trigger and a generated column?
Generated columns can only be used to automatically insert or update data within a table through immutable SQL functions. On the other hand, triggers launch SQL scripts in response to a particular database event and can involve several tables. Thus, a trigger is a way to perform additional actions when something specific happens in the database, while a generated column is a way to automatically store calculated data in a table.
What are the types of columns generated in SQL?
In SQL, there are two types of generated columns: stored and virtual. Stored generated columns are precomputed and stored in the database, whereas virtual generated columns are calculated on-the-fly during query execution and do not take up storage space.
What is the difference between a generated column and a regular column?
A generated column is different from a regular column because its value is calculated automatically based on an SQL expression and cannot be manually changed. On the other hand, in a regular column, the value can be entered and altered manually as desired.
Can generated columns be used in indexes?
Yes, generated columns can be used in indexes just like regular columns. This opens up to several scenarios and is particularly useful to improve the performance of filtering or sorting queries involving their values.