10x Query Performance with a Database Index
Ever since open source databases have been a thing, indexes have been that something that improves their performance. In different database management systems, different indexes have different usage types – some work together with certain types of indexes, some cover queries only partly, but all of them one thing in common – they speed up SELECT operations.
Types of Database Indexes
As you may know, database indexes have a couple of types and each of them accomplish different purposes. These are the types of indexes available in MySQL alone (for the full list of possible index types, refer to the documentation):
- B-Tree indexes – these are the indexes that are used together with search queries that search for exact matches of data.
- R-Tree – or spatial – indexes can be used to index geospatial (geographical) data.
- Covering indexes that cover all of the columns that a query is using allowing a database to read the index instead of the data itself which is a faster process.
- Partial indexes that cover a part of the column that a specific query is using (such indexes are frequently used to save space on the disk.)
- Fulltext indexes that can be used to help developers run fulltext search operations on data sets.
Indexes speed up reading (SELECT) operations, but slow down all of the other operations (INSERT, DELETE, and UPDATE queries) in return.
When and How to Index?
The most frequent question regarding indexes is the time that’s best to index data. And the answer to this question is pretty straightforward, really – you should index data as soon as you feel your SELECT queries grinding to a halt or as soon as you have a lot more data than usual (100,000 rows and above is a good starting point.)
To index or not to index also heavily depends on your disk – indexes take up space, so if you’re indexing millions of rows, expect the indexing process to take a toll on the disk space and to take longer than usual.
Take a look at this table:
Information about a table in DbVisualizer
->
↑ Information about a table in DbVisualizer
In this case, around 200,000 rows would be the perfect time to think about indexing – the data has space to grow, but at the same time the row count isn’t insignificant.
In most cases to add an index, you will make use of the CREATE INDEX or the ALTER TABLE query – they aren’t symlinks, but whether you run a query like CREATE INDEX idx_name(column_name) or ALTER TABLE demo_table ADD INDEX idx_name(column_name); shouldn’t make much of a difference, just bear in mind that ALTER TABLE queries work a little differently than ADD INDEX queries do: if your table is bigger, they make a copy of the table on the disk, copy all of the data to it, make the requested changes, then swap the original table with the table that the DBMS created – that will take up time as well.
If you are looking for an easy and powerful SQL client and database manager, then you've got to try DbVisualizer. It connects to nearly any database.
Anyway, add indexes with the ALTER or CREATE INDEX queries – if you add them using DbVisualizer, it will also tell you the time that the queries take and the status of them (if the status is FAILED, go back and check on what went wrong, then run the query again):
Successful and Failed Queries in DbVisualizer
↓
↑ Successful and Failed Queries in DbVisualizer
What Type of Index to Choose?
After you’ve decided what and when you will index, it’s time to decide what type of an index you should choose for your specific use case. In many cases, you will most likely be using B-Tree indexes: they are the most frequently used type of an index.
In other cases, though, your decision may be influenced by:
- The space available on the disk – if you’re low on disk space, use prefix indexes. They will let you index a prefix of the column.
- The type of data that your database holds – if you’re holding geographical data, use spatial indexes.
- The specific scenario that you want your databases to adhere to:
- If you want to use fancy queries with wildcard support, searching in the boolean mode, etc. you would need to use full-text indexes.
- If you want to save disk space, you will use prefix indexes.
- If you’re working with the types of storage engines that store data in the memory and want to quickly read the data, consider using hash indexes.
- If you’re working with geographical data, you may need to use spatial – R-Tree – indexes.
Different types of indexes are defined in different ways – some of them can be seen below, but for most of us, ordinary – B-Tree – indexes will do the trick.
- Prefix indexes are defined by defining a prefix like so (the number defines the characters to index):
Prefix Indexes in MySQL
↓
↑ Prefix Indexes in MySQL
- Fulltext indexes are defined by specifying a FULLTEXT type:
Fulltext indexes in MySQL
↓
↑ Fulltext indexes in MySQL
Columns can also have more than one type of index on them at once to satisfy multiple types of queries at the same time.
DbVisualizer and Indexes
Once your indexes have been added, go back into the overview of your table (click on your database management system, then expand the database in question, then head over to the Tables.) Finally, click on the Indexes tab below the table itself to get ahold of all of the details:
Observing indexes in DbVisualizer
↓
↑ Observing indexes in DbVisualizer
Not only will DbVisualizer tell you what table holds what kind of an index, it will tell you many more details related to the index itself including:
- The name of the index and the column it’s on
- Whether the index is made in an ascending or descending order
- All about the index cardinality (how many unique values does the index hold)
Basically, you will be able to observe everything that’s necessary about the index itself – most of the details presented in this tab will be necessary for those who know a little bit about databases themselves or those who are solving issues relating to query performance when indexes are in use.
Query Performance and Indexes
As we’ve already noted, indexes are only necessary to improve the performance of reading data – they improve the performance of SELECT queries. How many times the query performance will be improved, though, is directly dependant on your queries themselves – to maximize the performance of your queries when they’re using indexes, aim to make your queries as simple as possible, that is:
- Return as few rows as possible – switch SELECT * to SELECT column.
- Avoid JOINing data where possible.
- Avoid OR queries where possible – OR means “also think about…” which slows down performance.
- Subqueries also slow queries down due to the resources, so be vary of that. If necessary, feel free to use them, but when using them, make sure that the subquery is using indexes as well.
Finally, bear in mind that not all queries will use indexes you’ve defined in the first place – make use of the EXPLAIN keyword in database management systems to guide you to the light: as a rule of thumb, define the column that is indexed straight after the WHERE part of the query to ensure that your query is actually using the index and you should be good to go.
Also bear in mind that indexes will slow down INSERTs, UPDATEs, and DELETEs due to the fact that data needs to be updated together with the index when it’s in use: if you’re using an index, you should allocate more time for such operations as well.
Finally, keep in mind that database maintenance has an impact on your query performance as well – we recommend using SQL clients such as DbVisualizer to gain a deeper insight into your queries and your database instance itself: your database will thank you after a while.
Summary
In this blog, we’ve walked you through the importance of using indexes in database management systems. Most of the examples provided were centered around MySQL and its counterparts such as Percona Server and MariaDB, but the concept remains the same – indexes speed up SELECT operations at the same time slowing down UPDATEs, DELETEs, and INSERTs.
Indexes have multiple types each being able to solve a different problem, so before using them, dig into these as well, do know the risks involved, and you should be good to go.
When improving query performance with indexes, keep in mind that SQL clients like DbVisualizer can also help you solve problems related to maintenance and upkeep of your databases – grab a free trial and try it today, come back to the blog to learn more about databases in a while, and until next time.