Anton Zhiyanov
Posted on June 13, 2023
A covering index is the fastest way to select data from a table.
Let's see how it works using a query that selects employees with a certain salary:
select id, name from employees
where salary = 90;
No index vs. Using an index
If there is no index, the database engine goes through the entire table (this is called a "full scan"):
QUERY PLAN
`--SCAN employees
Let's create an index by salary:
create index employees_idx
on employees (salary);
Now the database engine finds records by salary in the index (this is faster than going through the entire table). And for each record found, it accesses the table to get the id
and name
values:
QUERY PLAN
`--SEARCH employees USING INDEX employees_idx (salary=?)
Using a covering index
Let's create a covering index (which covers all selected columns):
create index employees_idx
on employees (salary, id, name);
Now the database engine works only with the index, without accessing the table at all. This is even faster:
QUERY PLAN
`--SEARCH employees USING COVERING INDEX employees_idx (salary=?)
However, simply covering all columns used in a query may not be enough. The order of the columns should allow for a fast search using the index.
Suppose we build an index with the same set of columns, but in a different order:
create index employees_idx
on employees (id, name, salary);
Now the database engine won't be able to quickly find records with salary
= 90
. It may still use the index, but it will be a full index scan instead of a search (which is slow).
QUERY PLAN
`--SCAN employees USING COVERING INDEX employees_idx
(note SCAN instead of SEARCH here)
Covering indexes cost more when the data in the table changes, so don't create them for every type of query. Often this is one of the last optimizations after everything else has been done.
Follow @ohmypy on Twitter to keep up with new posts
Posted on June 13, 2023
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
November 29, 2024