Do you know Index Statistics in SQL Server?

Friday, June 24, 2011 |


This concept is based on Index so if you want to clear up your basics about different type of index available in SQL Server, do lookup here.

In order to understand the concept of “Index Statistics”, you must be aware with the concept of “Query Selectivity”. 

If you create an Index on any field in your table and execute “SELECT” query on the same table with the field in “where” clause which you have defined as “Index Key Column”, it is not guarantee that the index you have defined, would be used. 

Question might pops up in your mind that whether SQL Server making any mistakes by not using my Index and scanning all rows? 

No, SQL Server is not making any mistake in this case but it is being smart by choosing the proper, right and efficient way to execute your “SELECT” query.


If your conditions lied in “WHERE” clause, “Having” clause and “JOIN” statements returns almost all rows or majority of rows than optimizer wouldn’t waste the time to call Index, find the desired value from Index and return actual row from table. Rather than doing this, SQL Server would directly scan complete table and return desired output, if optimizer thinks that table scan would take less time than Index call. This is the concept of “Query Selectivity”.

After reading above paragraph, one more question might pops up in your mind again that without even executing query, how do SQL Server take decision to go for Index or not? How do SQL Server know that whether most of the rows would be return by query?

Answer to this question lies under “Index Statistics”.  When you create an index, not only data get sorted and stored in leaf level pages of Index with Pointer or actual data in case of clustered index but it creates “Histogram” too. This histogram tells optimizer that how many rows would be returned by given condition in “SELECT” query. SQL Server would take decision based on the answer of Histogram whether to call Index or not.

We have created one clustered index in one of my previous article here. I am going to use same index, named “idx_refno”, to demonstrate “Histogram”.

--we had orders table in SQLHub data and Index named "idx_refno"
--use the same here
DBCC SHOW_STATISTICS ('Orders','idx_refno')

After executing above query, it will return three tables as a result of that query. Look at the below screen shot.

 
1st table would return some important information about Index like Index name, last update date of index etc.

2nd table would return information about Index Key column, its density and length of key column.

3rd table would return information about “Histogram”, SQL Server uses this to decide how many rows would be return based on your “SELECT” query.

For example: you can see last row of third table which has “2057058364” in “Range_HI_KEY” column and “900” in “EQ_Rows” column. It means that, if you pass “2057058364” to your “RefNo” column, you will get “900” rows.  Try it out.

--you might have different values in your refno column
--so confirm from your histogram table and change it in WHERE clause below
--before executing this query

SELECT * from orders where refno=2057058364

See the screen shot below:


 
So, do you know Index Statistics in SQL Server, now?


if you want to refer all other articles related to index, click here.

Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com

Ask me any SQL Server related question at my “ASK Profile

0 comments: