Introduction to Indexes - Part V

on August 02, 2010
   
In this article, we will take a closer look at nonclustered indexes, how SQL uses nonclustered indexes and some of the recommendations for selecting useful nonclustered indexes.

What is a nonclustered index?

A nonclustered index is the second type of index in SQL Server. They have the same b-tree structure as the clustered index (see previous article). Unlike the clustered index nonclustered indexes does not contain the entire data row at the leaf level. Rather, the nonclustered index contains just the columns defined in the index, and a pointer to the actual data row. See figure 1 for a high-level architecture. Assume that the nonclustered index depicted there is based off the clustered index depicted in the previous part.


When the underlying table is a heap (has no clustered index) then the pointer to the data row is the RID, an 8 byte structure comprised of the File ID, Page No and Slot index, i.e. the actual location of the row within the data file.
When the underlying table has a clustered index, the pointer to the actual data row is the clustering key. As mentioned in the previous article, this has important considerations for the selection of a clustering key.
There can be multiple non-clustered indexes on a table. In SQL 2005 and earlier, there was a limit of 249 non-clustered index per table. In SQL 2008, with the introduction of filtered indexes, the limit on indexes has been increased to 999.

Include columns

Columns specified as include columns are stored at the leaf level of the nonclustered index, but not at the intermediate or root levels. They are not part of the index key and do not count towards the 16 column/900 byte limit on indexes. Any data type other than the old LOB types (TEXT, NTEXT, IMAGE) are allowed for include columns, although columns defined as varbinary(MAX) Filestream are not permitted as include columns. The ability to specify include columns was added in SQL 2005.
Include columns are useful in that they are columns available in the index but not contributing to the size of the index key hence they make it easier to create covering indexes (see next section) than could be done without them.
Typically columns that appear only in the select clause of a query and not within the WHERE, FROM or GROUP BY are candidates for INCLUDE columns.
If LOB columns are specified as include columns, the entire contents of the LOB columns are duplicated. It’s not just a pointer to the existing LOB pages added to the nonclustered index.
The ability to specify include columns was added in SQL 2005.

Covering indexes

Covering is not a property of an index, it is not possible to say, without additional information that a query is covering or not covering. Instead, covering deals with an index and a query together.
An index is said to be covering for a specific query if that index contains within it all of the columns necessary for the query. The columns may be part of the key or they may be include columns. This means that a query that has a covering index can be evaluated completely from that index, without needing to go to the base table (heap or cluster) to fetch additional columns.
It has been said before that creating a covering index for a query is just about the best way to speed a query up. This is true. It is not always possible or desirable to cover every query. Covering every query may lead to unacceptably large indexes or unacceptably large numbers of indexes with all the attendant problems (as mentioned in part 1)
It is possible (and often desirable) for an index to be covering for more than one query.

Filtered indexes

Filtered indexes are a new feature in SQL 2008 and they allow for an index to contain only some of the rows in the table. Prior to this, an index would always have, at the leaf level, the same number of rows as the table. With filtered indexes, an index can be based on just a subset of the rows.
When creating a filtered index, a predicate is specified as part of the index creation statement. There are several limitations on this predicate. The comparison cannot reference a computed column, a column that is declared as a user-defined type, a spatial column or a hierarchy-id column.
A clustered index cannot be filtered as it is the actual table.

In the next article, we will see how a non-clustered index is used.

Related Posts:

Introduction to Indexes - Part I
Introduction to Indexes - Part II
Introduction to Indexes - Part III
Introduction to Indexes - Part IV
Introduction to Indexes - Part V
Introduction to Indexes - Part VI
   

0 comments:

Post a Comment