One of the most common misconceptions about SQL is that the more indexes you have, the better. It is said that if you want to speed up your SELECT queries, the easiest way is to create an index on the columns used in, for example, the WHERE statement.
This implies that the more indexes, the faster your SELECT queries will execute. Few, if any, ever mention that your indexes also impact your other queries; INSERT, UPDATE and DELETE. The reason behind this is quite obvious when you stop to think about it.
When doing any kind of data change, having indexes on your table changes the amount of time needed to execute your query.
Let’s demonstrate this using the famous AdventureWorks database with some enlarged tables as a source.
Copying over all the data (4 852 680 rows) from one table to another table, without any indexes on the destination table, results in 15 seconds to run the INSERT query.
Now, after truncating the table and creating 1 index, run the exact same INSERT query results in 29 seconds.
If you create 2 more indexes and re-run the INSERT query again, the resulting time more than doubled to 68 seconds.
It can be concluded therefore that creating even more indexes causes any query changing data to have a higher execution time than without indexes.
It’s not exact mathematics, but every single index has approximately the same impact on the execution time as inserting the data without any indexes.
This should make it very clear that next time a table is created in a database, thinking about whether or not an index should be created, how many indexes to create and on which columns should become more important, given the impact on execution time for inserting, updating and deleting data.
Now, next to changing data there’s also retrieving data from your tables, which indexes have an impact on as well.
Which brings me to another misconception: that creating indexes on your table will always speed up the execution time of retrieving your data. This is not always the case, depending on the index(es) you have on your table, the columns on which columns you have an index and how much data you’re trying to retrieve.
There are multiple aspects to retrieving data impacted by indexes. Indexes can reduce the execution time, but when the execution time seems to stay the same, there may be improvements elsewhere, such as reducing the logical reads.
To see the number of logical reads, you need to run the following query, and after running any query, you can view the statistics on the “Messages” tab.
Now to test this out, I’ve made 2 queries to run. The first one is the following:
When running that query without any index, the execution time is 21 seconds to retrieve three 100 040 rows.
The statistics show the following details: “Scan count 18, logical reads 98638”.
When running the same statement after creating an appropriate index, the execution time is still 21 seconds, but the statistics now show: “Scan count 74, logical reads 86068”.
The second query is a much simpler query returning only 187 520 rows:
In comparison, when running that query without an index, the execution time is 2 seconds and the statistics show “Scan count 9, logical reads 49319”.
Now, with an index the same query results in the same 2 second execution time but with these statistics: “Scan count 1, logical reads 1775”.
From this, a conclusion can be made that creating the right index to retrieve records doesn’t necessarily reduce the execution time, but may reduce the amount the logical reads.
To explain what a logical read is, this is the note from Microsoft:
“A logical read occurs every time the Database Engine requests a page from the buffer cache. If the page is not currently in the buffer cache, a physical read first copies the page from disk into the cache.”
Each logical read is counted in a number of pages, where each page is around 8 kilobytes.
In the last example of 1775 logical reads, almost 14 MB (1775 * 8 / 1024) of data was the result. However, the second last one with 49319 logical reads amounts to 385 MB in data.
Depending on the workload, the (in)efficiency of the queries and how often these queries are executed, this can result in a huge burden on the database’s server and slow down the entire system.
To summarize, indexes have a positive impact but also a negative impact. Pay attention to whether or not you should create indexes, and if you should create more than one. Indexes impact all types of queries, and depending on your use case you may gain more by not having one.
Thanks for reading!
Subscribe to our RSS feed