Improving performance with indexes

Creating a database and managing it, if you have about 100 users, isn’t terribly difficult. The trouble comes when the number of users that you have grows exponentially. Suddenly, you realise there needs to be little room for potential error in the design and more importantly you realise the importance of space. Every little counts! Finally, your website’s performance is of paramount priority. If you’re drawing on a ‘student’ table (like we were in one of our applications) and it’s taking 10 seconds instead of 1….something needs to change.

Using indexes can usually help. But they can be incredibly harmful when used incorrectly. (I’ll explain)

Why are we interested in indexing? It’s an easy way of improving performance of your database.

What is an index?

An index is a structure that exists independent of the data in your table that improves the query optimiser’s ability to execute your query quickly.

An analogy- think of the index that you find at the back of a textbook. The analogy is not perfect, but it aids understanding. Say you needed to find every instance of the word ‘moose’ in a very large book about mooses.

You could start at word 1, line 1, page 1 and go through the entire book finding all the instances. Or, you could look up the index, if it exists. You would go to the back of the book and in the index, go to the letter ‘M’, and then find the word. In this magical place, you would be shown every instance of the word ‘moose’ as well as a direct pointer to where the word can be found in every case. It’s efficient and saves you a load of time! Now, imagine if the author decided to add a couple of pages to the middle of the book. What happens to all the index values? That’s right – they’d have to change too. Hopefully this simple analogy helps you see both the advantages and the potential difficulties that can arise with the use of indexes.

The execution in databases is different but the concept is similar.

So, how does indexing work?

Remember an index is just a pointer to the data in a table. It often has to do with ordering by a particular field. This allows more efficient ‘searching (eliminating the need to scan). You could also think of an index as something that represents the book (in our example) in keyword or subject order. The index will basically limit the number of records that we have to look at to satisfy our query. If you have a student table with student information, you could create an index (a separate structure) ordered by ID. Each ID corresponds to data in the student table (a pointer defines this)

The problem however, is that adding or deleting data will affect the index!

Indexes therefore will drastically speed up some queries but can also slow down the insertion and deletion of data. So, generally data retrieval is improved but data modification can be made slower and less efficient.

The three main types of indexes:

Single column, Composite (more than one column is used for the index e.g. lastname and fistname) and Unique (doesn’t allow duplicate values).

So when should you create an index?

Only create an index on columns that are frequently used in the WHERE clause.
Only create an index on columns that contain a high number of unique values

When should you avoid indexes?

If you have a small table! (<8000 rows)
If you have column values are not highly unique (e.g. M/F, Y/N etc). There’s no point if half of the values are just ‘M’ and the other are ‘F’.
If you have columns that are being frequently updated.
If you have large composite indexes.

Sql server has a tool called the ‘index tuning wizard‘ which can be very useful in the building of indexes.

Improving performance with indexes

Published by Learn Django

Leave a comment Cancel reply

Share this:

Related

Published by Learn Django

Leave a comment Cancel reply