An index in SQL is actually not a database key type, but keys and indexes are loosely related concepts, so we’ll cover them in this post.
To help you understand indexes, we’ll sail through an example of customer orders. This figure demonstrates our corresponding tables.
The customer table stores the master data of our customers, while the order table stores the orders they placed over the entire history of our company. So, the number of rows in the order table may get really big over time. This table shows some entries on the order table to give you a better idea.
Now, imagine an actual customer calling a sales representative and asking about the total price of an order they placed last week.
If the customer can provide their order number, the sales representative can spot the record immediately. Practically speaking, when the representative enters the order number into their screen and hits Search, their program will access order using order.id, which is its PK.
Now, let’s warm up to SQL a little. The code below contains the simple SQL statement to access an order table row by its PK.
SELECT
*
FROM
order
WHERE
id = 'ORD005';
The next table demonstrates the only result that would be returned from this statement. There can be only one result because a PK value may correspond to only one unique row.
Relational databases love PKs! Whenever you can provide them, they get back to you with blazing fast results in most cases. PK access is arguably the fastest way to find a row.
Now, there may also be cases in which the customer doesn’t remember the order ID but remembers the date on which they placed the order. This is very common in customer service, so practically speaking, when the representative enters the order date into their screen and hits Search, their program will access order by using order.order_date—which is not its PK.
In such cases, the poor database engine will have to scan through all the rows in the order table and return entries that have the provided order_date value. This listing contains a simple SQL statement to access an order table row by a date.
SELECT
*
FROM
order
WHERE
order_date = '2024-07-24';
If the order table contains only a handful of rows, this query may still run fast. However, in a table with millions of rows, such queries may take a very long time to complete and frustrate both customers and sales representatives!
That’s where indexes come in play. Our relational database gives us the option to create an index for order.order_date. Upon initial creation of the index, the database will automatically create a secondary data structure for order.order_date, which it will then use like a book index to find order rows quickly whenever the order table is searched by date. And whenever an order table row is updated or deleted, the database engine will modify the index accordingly.
Book Indexes: Just as their name suggests, database indexes work like book indexes. You can find certain topics in a book by using the index at the very end, instead of reading through the entire book. The same applies to database indexes—you use them to find corresponding rows instead of reading through the entire table.
In most cases, the speeds of queries with indexed columns are not comparable to those without indexed columns. Indexed columns will almost always execute much faster, and the thing is, you don’t even need to explicitly tell the database to make use of the index! When you execute the previous code, the database will find and use the order.order_date index automatically, returning rows very quickly.
Index Hints: Many database products allow programmers to provide hints on which index to use. If you don’t trust the database to pick the correct index, you can still make it use your index of choice by mentioning it in your SQL statement. But most of the time, databases make good decisions and use the appropriate indexes.
In our example, we went through an index containing a single field: order.order_date. However, it’s also possible to create composite indexes containing multiple fields, just like the composite PKs.
If indexes speed stuff up dramatically, then why don’t we create dozens of indexes for each database table, just in case? That approach doesn’t work that well because indexes come with compromises. Some of the most significant costs of using indexes are as follows:
- Disk space: Just like a book index occupies additional pages, a database index occupies additional disk space. That can sometimes be a problem on large tables.
- Performance overhead: Although indexes speed up some read operations, they slow down some other operations. Every time a record is created, changed, or deleted, the database must change the table itself and all indexes of the table—which has a performance cost during runtime.
- Maintenance effort: Over time, indexes can become fragmented and even degrade performance, so periodic maintenance like reindexing may be needed.
- Complexity: With many indexes present, the database engine may perform poorly because it can’t pick the most efficient index.
Therefore, it’s advisable to have a reasonable approach towards indexes. Some cases in which an index may not be desirable are as follows:
- Small tables, where read operations are fast anyway
- Tables with high write frequencies
- Columns with a low number of distinct values, such as boolean types
- Columns with long text or binary data
Building an accurate index plan requires some developer experience. When used appropriately, though, indexes speed up queries and improve users’ quality of life! So don’t be afraid of using indexes—just be mindful of not overusing them when they may not be needed in the first place. You don’t want to over-engineer things by providing solutions to nonexistent problems.
Editor’s note: This post has been adapted from a section of the book SQL: The Practical Guide by Kerem Koseoglu. Dr. Koseoglu is a seasoned software engineer, author, and educator with extensive experience in global software development projects. In addition to his expertise in ABAP, he is also proficient in database-driven development using Python and Swift. He is the author of Design Patterns in ABAP Objects (SAP PRESS), as well as several best-selling technical books in Turkey. He has a Ph.D. in organizational behavior.
This post was originally published 3/2025.
Comments