Introduction to Indexes - Part II

on August 01, 2010

     

How SQL uses indexes

If a table does not have index, the only way to find all occurrences of a value within a table is to read the entire table. If a table has an index, it speeds up the locating of values within that index in two ways.
1.      The index is sorted in the order of the key columns. This means that once all the matching values have been found, the remaining portion of the table can be ignored. This is the same as a telephone directory, where once all entries with a particular surname have been found, the rest of the book can be ignored as no further matches are possible
2.      The tree structure of the index allows a divide-and-conquer approach to locating rows, where large portions of the table can be quickly excluded from the search. This is illustrated in Figure 3
There are four basic operations that SQL can do on an index. It can scan the index, it can seek on the index, it can do lookups to the index and it can update the index

Scans

An index scan is a complete read of all of the leaf pages in the index. When an index scan is done on the clustered index, it’s a table scan in all but name.
When an index scan is done by the query processor, it is always a full read of all of the leaf pages in the index, regardless of whether all of the rows are returned. It is never a partial scan.
A scan does not only involve reading the leaf levels of the index, the higher level pages are also read as part of the index scan.

Seeks

An index seek is an operation where SQL uses the b-tree structure to locate either a specific value or the beginning of a range of value. For an index seek to be possible, there must be a SARGable3 predicate specified in the query and a matching (or partially matching) index. A matching index is one where the query predicate used a left-based subset of the index columns. This will be examined in much greater detail in a part 3 of this series.
The seek operation is evaluated starting at the root page. Using the rows in the root page, the query processor will locate which page in the next lower level of the index contains the 1st row that is being searched for. It will then read that page. If that is the leaf level of the index, the seek ends there. If it is not the leaf then the query processor again identifies which page in the next lower level contains the specified value. This process continues until the leaf level is reached.
Once the query processor has located the leaf page containing either the specified key value or the beginning of the specified range of key values then it reads along the leaf pages until all rows that match the predicate have been returned. Figure 2 shows how a seek would be done on an index when searching for the value 4.

 f the index contains all the columns that the query needs, then the index is said to be covering for that query. If the index does not contain all the columns then SQL will do a lookup to the base table to fetch the other columns in order to process the query.
(3)   SARGable is a made-up word, constructed from the phrase Search ARGument. It refers to a predicate that is of a form that SQL can use for an index seek.

Lookups

Lookups occur when SQL uses an index to locate rows affected by a query but that index does not contain all the columns required to satisfy the query, that is the index is not covering for that query. To fetch the remaining columns, SQL does a lookup to either the clustered index or heap.
A lookup to a clustered index is essentially a single row clustered index seek and it is always a single row seek. So if a lookup is needed for 500 rows, that involves 500 individual clustered index seeks.

Updates

Anytime that a row is changed, those changes must be made not only in the base table (clustered index or heap) but also in any index that contains the columns that were affected by the change.  This applies to insert, update and delete operations.

Considerations for creating indexes

I’ll be going into more detail on considerations for indexes in the next two parts, but in general:
  • Clustered index should be narrow, because the clustering key is part of all non-clustered indexes.
  • Composite non-clustered indexes are generally more useful than single column indexes, unless all queries against the table filter on one column at a time.
  • Indexes should be no wider than they have to be. Too many columns wastes space and increases the amount of places that data must be changed when an insert/update/delete occurs.
  • If an index is unique, specify that it is unique. The optimizer can sometimes use that information to generate more optimal execution plans.
  • Be careful of creating lots of indexes on frequently modified tables as it can slow down data modifications.
 In the next part, we will see more detail into clustered indexes. What they are, how they differ from non-clustered indexes and what the considerations are for creating a clustered index.

Related Posts:

Introduction to Indexes - Part I
Introduction to Indexes - Part II
Introduction to Indexes - Part III
Introduction to Indexes - Part IV
Introduction to Indexes - Part V
Introduction to Indexes - Part VI
     

0 comments:

Post a Comment