PostgreSQL CTE: What It Is and How to Use It
To extract the data of interest from a PostgreSQL database, you may need to run several queries. This means executing them one at a time at the application level, resulting in high overhead. SQL subqueries can help, but they have some major limitations in terms of reusability. This is where a WITH query with the right CTEs can save you!
In this article, you will look at the definition of a PostgreSQL CTE, how to use WITH queries, and some examples of CTEs.
What Is a PostgreSQL CTE?
A PostgreSQL CTE (Common Table Expression) is a temporary result set that can be referenced within another SQL query. It allows users to create named subqueries that can be used as tables within SELECT, INSERT, UPDATE, or DELETE queries. In other words, it is a mechanism to provide more flexibility and readability to complex queries.
How to Write a Common Table Expression in PostgreSQL
In PostgreSQL, a CTE is defined through the WITH clause. Each auxiliary CTE statement consists of two elements:
- A name
- A query
Here is an example of the syntax to define a PostgreSQL common table expression:
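A minimal sketch of that syntax follows. Only cte_name comes from the text; some_table, the column names, and the filter are placeholders chosen for illustration.

```sql
-- Define a CTE named cte_name, then query it like a table.
-- some_table, column_1, and column_2 are hypothetical placeholders.
WITH cte_name AS (
    SELECT column_1, column_2
    FROM some_table
    WHERE column_2 IS NOT NULL
)
SELECT column_1, column_2
FROM cte_name;
```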
cte_name is the name given to the CTE, which is defined by the query inside the parentheses. The subsequent SELECT statement uses the CTE as if it were a table, allowing further operations to be executed on that data. This is possible because the result set produced by the CTE can be referenced in the main query by name, like any other table or view.
The same WITH clause can involve several CTEs, each with its name and specification query:
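As a rough sketch, a WITH clause with two CTEs might look like this. Every table, column, and CTE name below is a hypothetical placeholder.

```sql
-- Two CTEs in one WITH clause, separated by a comma.
-- table_1, table_2, and their columns are hypothetical placeholders.
WITH cte_name_1 AS (
    SELECT id, column_1
    FROM table_1
),
cte_name_2 AS (
    SELECT id
    FROM table_2
    WHERE column_2 > 0
)
-- The main query reads from the first CTE and filters on the second.
SELECT column_1
FROM cte_name_1
WHERE id IN (SELECT id FROM cte_name_2);
```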
The external SELECT can use data from any CTE result set in its clauses. For example, it can employ such data in WHERE conditions to apply special filters.
Note that the queries above are just examples. Each CTE can be a SELECT, INSERT, UPDATE, or DELETE. Similarly, the main query itself can be a SELECT, INSERT, UPDATE, or DELETE.
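To illustrate a data-modifying CTE, the sketch below moves old rows from one table to another in a single statement. The orders and orders_archive tables, the order_date column, and the cutoff date are assumptions made for this example; both tables are assumed to have identical columns.

```sql
-- The DELETE runs inside the CTE, and RETURNING exposes the deleted rows
-- to the main INSERT, so the move happens in one statement.
-- orders, orders_archive, and order_date are hypothetical names.
WITH archived_orders AS (
    DELETE FROM orders
    WHERE order_date < DATE '2023-01-01'
    RETURNING *
)
INSERT INTO orders_archive
SELECT * FROM archived_orders;
```

Without the RETURNING clause, the data-modifying CTE would still run, but it would produce no rows for the main query to read.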
Pros and Cons of WITH Queries
Let’s take a look at the most important benefits and drawbacks of WITH queries in PostgreSQL.
Pros
- Improved Readability: Thanks to the WITH statement, you can break down complex queries into smaller parts. This enhances query readability and makes it easier to understand and maintain SQL queries.
- Enhanced Reusability: The same CTE can be reused multiple times within the same query, eliminating the need to repeat the same sub-query logic in the main query.
- Recursive Capabilities: PostgreSQL supports recursive CTEs, enabling users to accomplish tasks not otherwise possible in standard SQL. That is useful for handling recursive data or operations, such as traversing tree-like structures.
Cons
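- Optimization Issues: WITH queries can be trickier to optimize than equivalent queries without CTEs. Before PostgreSQL 12, every CTE was materialized and acted as an optimization fence. In newer versions, this still applies to CTEs that are recursive, have side effects, or are referenced more than once. To tune them, it is essential to carefully analyze the execution plan.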
- Limited Scope: CTEs only exist within the context of the WITH clause in which they are defined. They cannot be referenced outside that query.
- Memory Usage: WITH queries may consume more resources than alternative query structures. This is especially true when the same query materializes several CTEs over large datasets.
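One way to investigate the optimization and memory concerns above is to inspect the execution plan with EXPLAIN. In PostgreSQL 12 and later, a CTE can also be marked MATERIALIZED or NOT MATERIALIZED to control whether its result is computed once and stored or folded into the main query. The sketch below assumes a hypothetical orders table with region, amount, and order_date columns.

```sql
-- EXPLAIN with ANALYZE actually runs the query and reports real timings.
-- NOT MATERIALIZED (PostgreSQL 12+) asks the planner to inline the CTE,
-- so the outer filter on region can be pushed down into it.
-- orders, region, amount, and order_date are hypothetical names.
EXPLAIN (ANALYZE, BUFFERS)
WITH recent_orders AS NOT MATERIALIZED (
    SELECT region, amount
    FROM orders
    WHERE order_date >= DATE '2024-01-01'
)
SELECT region, SUM(amount) AS total_sales
FROM recent_orders
WHERE region = 'EMEA'
GROUP BY region;
```

Comparing the plan with and without the materialization hint is a reasonable first step before rewriting a slow WITH query.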
PostgreSQL CTE: Examples
Let's explore three SQL examples of common table expressions in PostgreSQL to understand how they can come in handy in real-world scenarios.
Example 1: Getting an Organization’s Hierarchy
Assume your employees table has the following columns: id, complete_name, and manager_id. You want to retrieve the organizational hierarchy under a specific employee.
Here is how you can get that data using a recursive CTE:
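A possible version of that query is sketched below. The CTE name employee_hierarchy and the columns match the description in this section; the starting ID of 1 is an assumed example value.

```sql
-- Columns id, complete_name, and manager_id come from the text.
-- The starting employee ID (1) is an arbitrary example value.
WITH RECURSIVE employee_hierarchy AS (
    -- Non-recursive term: the employee at the top of the sub-hierarchy
    SELECT id, complete_name, manager_id
    FROM employees
    WHERE id = 1

    UNION ALL

    -- Recursive term: direct reports of the employees found
    -- in the previous iteration
    SELECT e.id, e.complete_name, e.manager_id
    FROM employees AS e
    JOIN employee_hierarchy AS eh ON e.manager_id = eh.id
)
SELECT id, complete_name, manager_id
FROM employee_hierarchy;
```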
In this example, the CTE employee_hierarchy first selects the starting employee based on their ID. Then it recursively joins the partial result set with the employees table to retrieve the employees who report to each manager found so far, moving down the hierarchy until the entire sub-hierarchy is fetched. This is a common query pattern to traverse a tree in PostgreSQL.
[Figure: Executing the WITH query in DbVisualizer.]
Note that you could not achieve this result with a plain, non-recursive SELECT, because the depth of the hierarchy is not known in advance. With RECURSIVE, PostgreSQL evaluates the non-recursive term once, then repeatedly evaluates the recursive term against the rows produced by the previous iteration. The loop stops when an iteration returns no new rows, and the final result is the combination of all the rows collected along the way.
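To see the iteration in isolation, consider a small self-contained sketch that needs no tables. The CTE name t and the column n are arbitrary choices for this example.

```sql
-- Sum the integers from 1 to 100 with a recursive CTE.
WITH RECURSIVE t(n) AS (
    VALUES (1)          -- non-recursive term: start at 1
    UNION ALL
    SELECT n + 1        -- recursive term: produce the next integer
    FROM t
    WHERE n < 100       -- once n reaches 100, no new row is produced
)
SELECT SUM(n) FROM t;   -- returns 5050
```

The WHERE condition in the recursive term is what ends the loop. Without it, the query would keep generating rows indefinitely.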
Example 2: Getting the Department with the Highest Average Salary
Assume you have an employees table with columns like id, department_id, and salary. You want to find out what department has the highest average salary.
You can obtain that info with a WITH clause as below:
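The following sketch shows one way to write it, using only the employees columns listed above. The avg_salary alias is a name chosen for this example.

```sql
-- Average salary per department, then keep the highest one.
WITH department_avg_salary AS (
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
)
SELECT department_id, avg_salary
FROM department_avg_salary
ORDER BY avg_salary DESC
LIMIT 1;
```

If two departments tie for the highest average, LIMIT 1 returns only one of them. Showing a department name instead of its ID would require joining a table of department names, which is not described here.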
In this example, the CTE department_avg_salary calculates the average salary for each department. The main query then displays the department that earns the most.
[Figure: The "Finance" department is the one that makes the most.]
In this case, you could also get the same result without a CTE, for example with a subquery in the FROM clause, at the cost of readability.
Example 3: Retrieving Per-Product Sales in Top Regions
Now you want to get sales totals by product only in the most important sales regions. Use a WITH clause involving two auxiliary sub-queries as follows:
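A sketch of such a query is shown below. The name top_regions comes from this section; the first CTE name (regional_sales), the orders table, its columns, and the threshold that defines a top region (more than 10% of total sales) are assumptions made for illustration.

```sql
-- orders(region, product, quantity, amount) is a hypothetical table.
WITH regional_sales AS (
    -- Total sales per region
    SELECT region, SUM(amount) AS total_sales
    FROM orders
    GROUP BY region
),
top_regions AS (
    -- Regions whose sales exceed 10% of all sales (assumed threshold)
    SELECT region
    FROM regional_sales
    WHERE total_sales > (SELECT SUM(total_sales) / 10 FROM regional_sales)
)
-- Per-product totals, restricted to the top regions
SELECT region,
       product,
       SUM(quantity) AS product_units,
       SUM(amount) AS product_sales
FROM orders
WHERE region IN (SELECT region FROM top_regions)
GROUP BY region, product;
```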
The output of the first CTE is used in the second to produce the top_regions result set. Then the output of top_regions gets employed in the WHERE condition of the primary SELECT query to get the desired data.
[Figure: Writing the query with multiple CTEs in DbVisualizer.]
Conclusion
In this article, you saw that PostgreSQL's CTEs (Common Table Expressions) provide a powerful tool for breaking down complex queries, improving code organization and facilitating code reuse. In particular, you got to understand the syntax of the WITH query and the benefits of leveraging CTEs.
The main problem with this feature is that it can lead to non-optimal, slow, resource-intensive queries. Here is where DbVisualizer comes into play! In addition to the most common features of a database client and support for dozens of DBMSs, this tool offers advanced query optimization capabilities that will help you take your PostgreSQL CTEs to the next level. Download DbVisualizer for free now!
FAQ
Let’s answer some interesting questions.
How does a CTE in PostgreSQL differ from a regular subquery?
They both enable you to structure complex queries. However, CTEs generate named temporary result sets that can be referenced multiple times within a query. Subqueries, on the other hand, are embedded at the point where they are used. Even when a subquery in the FROM clause has an alias, it cannot be referenced from other parts of the query, so reusing it means writing it again.
Can multiple CTEs be nested within the same PostgreSQL query?
Yes. A single WITH clause can define several common table expressions, separated by commas, and each one can reference the CTEs defined before it. All of them can then be referenced once or multiple times in the main query. The query inside a CTE can also contain its own WITH clause.
Is it possible to update or delete data using CTEs in PostgreSQL?
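Yes, WITH queries support UPDATE and DELETE statements, both as the main query and inside a CTE. Inside a CTE, they are typically paired with a RETURNING clause so that the affected rows are available to the rest of the query.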
Can a CTE be referenced multiple times in the same query?
Yes, a CTE can be referenced multiple times within the same PostgreSQL query. You can use it both in other CTEs and in the main query.
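As a sketch, the query below references the same CTE twice: once in the FROM clause and once in a subquery. It assumes the employees table from Example 2, with department_id and salary columns.

```sql
-- Departments whose average salary is above the mean of all
-- department averages. The CTE is defined once and read twice.
WITH department_avg_salary AS (
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
)
SELECT d.department_id, d.avg_salary
FROM department_avg_salary AS d
WHERE d.avg_salary > (SELECT AVG(avg_salary) FROM department_avg_salary);
```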
Are PostgreSQL CTEs limited to SELECT statements?
No, PostgreSQL CTEs can be used with other DML (Data Manipulation Language) statements. In addition to SELECT, they also support INSERT, UPDATE, and DELETE. This makes them a flexible tool to perform complex operations on existing or new data.