Understanding SQL Index
Understanding SQL Index
Let look at the following queries:
1 | SELECT COUNT(*) FROM customer WHERE first_name = 'Nora'; |
Each query above had to scan all the rows to find matching data, therefore the time complexity is O(n)
. To improve search performance, we can use indexes.
An index makes search faster by creating a Blance Tree structure(B-Tree index
, the time efficiency is O(logn)
) or a Hash table(hash index
, the time efficiency is O(1)
). Note that the hash index can only be used for equality, i.e., WHERE A = ...
rather than WHERE A > ...
.
Normally, attributes that are likely to turn up often in conditions should be indexed. Many database management systems automatically build indexes on primary keys, and some DBMS build indexes on attributes that are unique(with unique constraint).
Indexes also implicitly provide the indexed attribute in sorted order. This will ensure that we can “merge” attributes. For example, look at the query below:
1 | SELECT * FROM course JOIN student ON (course.id = student.course_id); |
For the table course
, the primary key id
is indexed by default. And We also build an index on course_id
in table student
. Assume the indexed attributes are not sorted, when we join these two tables, it would look like this:
By sorting indexed attributes, a JOIN
can be more efficent:
There are also some weak points of index:
To create indexes, a database had to scan all the rows, which would cost a lot of time.
Indexes in a database requires storage, so there is a trade-off between time and space.
When you add data to a database, it needs to create new records and update all relevant indexes. Therefore, although indexes help to speed up SELECT queries and WHERE clauses, they slow down data input, with the UPDATE and the INSERT statements.