SQL-Server Blog of Ritesh Shah --Fight the fear of SQL with SQLHub.com: RANK()

Showing posts with label RANK(). Show all posts

Windowing Partition Functions (Rank(), Row_Number() and Dense_Rank())

Friday, October 2, 2009 |

Posted by Ritesh Shah

Microsoft SQL Server 2005 comes up with fabulous Partitioning functions which work well in SQL Server 2008 too. Just because of ignorance or unawareness of these functions, programmer used to iterate BIG BIG loops in front-end. Without much a boring lecture, let me start over functions now.

Row_Number(): Basically Row_Number() function is used to give number to each row in result set. It plays an important and a crucial role in many difficult situations. One can use it for paging purpose too (I probably will post one SP in future article which you can directly use in your front-end for paging purpose).

--create temp table for demo

IF OBJECT_ID('tempdb..#Employee') is not null DROP TABLE #EMPLOYEE

Create Table #Employee

(

FirstName varchar(20),

LastName Varchar(20),

DepartMent varchar(20)

)

--insert few records

Insert Into #Employee

SELECT 'Ritesh','Shah', 'MIS' UNION ALL

SELECT 'Rajan','Shah','ACCT' UNION ALL

SELECT 'Rajan','Mehta','ACCT' UNION ALL

SELECT 'Alka','Shah','MIS'

--simple Row_Number with Order By First Name, Last Name

--this will not make any partition and simply give row number to every row

SELECT ROW_NUMBER() over(order by FirstName,LastName) as Num,FirstName,LastName,DepartMent

FROM #Employee

--this will make a partition on First Name

--so, every first instance of FirstName will have row number 1

--you can find duplicate records with this way too. :)

SELECT ROW_NUMBER() over(Partition by FirstName order by FirstName,LastName) as Num,FirstName,LastName,DepartMent

FROM #Employee

Rank() and Dense_Rank(): These two functions mainly used to use for giving Rank to each row. You may use it for finding Toppers based on examination results records set or maybe use it for finding few top vendors based on track records of sales you have etc. There is only one small but technically big difference between Rank() and Dense_Rank() functions which I am going to show you in practical script below which will be easy to evaluate as I am going to show you all possible partitioning function in one T-SQL.

--create temp table for demo

IF OBJECT_ID('tempdb..#SampleOrder') is not null DROP TABLE #SampleOrder

Create Table #SampleOrder

(

OrderID Int Identity(1,1),

ClientID int,

TotalSample int,

SampleDate datetime

)

--insert few records

Insert Into #SampleOrder

SELECT 1,2,GETDATE()-5 UNION ALL

SELECT 2,5,GETDATE()-8 UNION ALL

SELECT 1,22,GETDATE()-3 UNION ALL

SELECT 3,2,GETDATE()-1 UNION ALL

SELECT 1,2,GETDATE()-5

SELECT *,

ROW_NUMBER() over(order by TotalSample) as RowNum,

ROW_NUMBER() over(Partition By ClientID order by TotalSample) as RowNumP,

Rank() over(order by TotalSample) as Ran,

Rank() over(Partition By ClientID order by TotalSample) as RanP,

Dense_Rank() over(order by TotalSample) as DRan,

Dense_Rank() over(Partition By ClientID order by TotalSample) as DRanP

FROM #SampleOrder

If you will observe output of above T-SQL, you will get to know the difference between Rank and Dense_Rank. There is only difference, if you will get same instance based on partition, rank will give same number all, suppose we get three same instance (same clientID three time with same value) rank will give it, suppose 1 for all three and when next instance come, rank will give it 4 rather than 2. In Dense_Rank, you will get 2, it won’t break the chain.

Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of http://www.SQLHub.com

Use of RANK() and Partition by clause in SQL-Server 2005

Tuesday, March 3, 2009 |

Posted by Ritesh Shah

Use of RANK() and Partition by clause in SQL-Server 2005
We will start by creating one table for demonstration and enter some records in it.

--Create one table

CREATE TABLE BlogCount

(

BloggerName VARCHAR(10),

Topic VARCHAR(15),

[Year] INT,

Total INT

)

--Insert records in above table

INSERT INTO BlogCount VALUES('Ritesh','SQL',2005,10)

INSERT INTO BlogCount VALUES('Ritesh','SQL',2006,17)

INSERT INTO BlogCount VALUES('Ritesh','SQL',2007,124)

INSERT INTO BlogCount VALUES('Ritesh','SQL',2008,124)

INSERT INTO BlogCount VALUES('Ritesh','.NET',2008,24)

INSERT INTO BlogCount VALUES('Alka','SQL',2007,14)

INSERT INTO BlogCount VALUES('Alka','.NET',2007,18)

INSERT INTO BlogCount VALUES('Alka','SQL',2008,14)

Once, you are done with above task, let’s think of usability of RANK() and Partition by clause in our real world. Suppose you wish to get record set of all the blogger with their highest total of blog. What will you do? You may use sub-query or/and GROUP BY clause. Right???? Doesn’t it seems tedious and bit difficult? Here is easy solution for the same.

First we will use Rank() and Partition By clause and then we will filter our record set. Partition is like Group by, we want list sorted based on “BloggerName” so we will be putting it in Partition by clause and Order by clause in below query will give Ranking 1 to highest “Total”.

SELECT BloggerName,Topic,[Year],Total,

Rank() OVER (Partition by BloggerName Order by Total DESC) as 'Ranking'

FROM

BlogCount

After running above query you will get result set like:

BloggerName Topic Year Total Ranking

----------- --------------- ----------- ----------- --------------------

Alka SQL 2007 18 1

Alka .NET 2007 18 1

Alka SQL 2008 14 3

Ritesh SQL 2007 124 1

Ritesh SQL 2008 124 1

Ritesh .NET 2008 24 3

Ritesh SQL 2006 17 4

Ritesh SQL 2005 10 5

(8 row(s) affected)

Now, we may like to see only highest total article by bloggers, we can’t put direct “where condition” in above query so we will wrap it up like:

SELECT *

FROM

(

SELECT BloggerName,Topic,[Year],Total,

Rank() OVER (Partition by BloggerName Order by Total DESC) as 'Ranking'

FROM

BlogCount

) Testing

where Ranking<2

Here is the output:

BloggerName Topic Year Total Ranking

----------- --------------- ----------- ----------- --------------------

Alka SQL 2007 18 1

Alka .NET 2007 18 1

Ritesh SQL 2007 124 1

Ritesh SQL 2008 124 1

(4 row(s) affected)

See above result set, we got only 4 records out of 9, which have the highest “Total”. See “Ritesh’s” record # 3 and #4. Both belongs to SQL and has same total so in that case, I may wish to see only one records of year 2008. In this case, year is a tie breaker. Use below given query for breaking a tie.

SELECT *

FROM

(

SELECT BloggerName,Topic,[Year],Total,

Rank() OVER (Partition by BloggerName Order by Total DESC, [year] DESC) as 'Ranking'

FROM

BlogCount

) Testing

where Ranking<2

Now, you will get following output.

BloggerName Topic Year Total Ranking

----------- --------------- ----------- ----------- --------------------

Alka SQL 2007 18 1

Alka .NET 2007 18 1

Ritesh SQL 2008 124 1

(3 row(s) affected)

How easy is this???? Much much better than using sub-query and group by.

Reference: Ritesh Shah

Windowing Partition Functions (Rank(), Row_Number() and Dense_Rank())

Use of RANK() and Partition by clause in SQL-Server 2005

My Book

Latest Article in Extreme-Advice

Ask Me SQL Server question

Recent comment here

Comments for Ritesh's Blog for SQL-SERVER & .NET

Pages

About Me

Short Tips on Extreme-Exchange

Quiz Master

Blog Roll

Visitors

Categories

-