Use of RANK() and Partition by clause in SQL-Server 2005

Tuesday, March 3, 2009

We will start by creating a demonstration table and inserting some records into it.
--Create the demonstration table
CREATE TABLE BlogCount
(
BloggerName VARCHAR(10),
Topic VARCHAR(15),
[Year] INT,
Total INT
)

--Insert records in above table
INSERT INTO BlogCount VALUES('Ritesh','SQL',2005,10)
INSERT INTO BlogCount VALUES('Ritesh','SQL',2006,17)
INSERT INTO BlogCount VALUES('Ritesh','SQL',2007,124)
INSERT INTO BlogCount VALUES('Ritesh','SQL',2008,124)
INSERT INTO BlogCount VALUES('Ritesh','.NET',2008,24)
INSERT INTO BlogCount VALUES('Alka','SQL',2007,18)
INSERT INTO BlogCount VALUES('Alka','.NET',2007,18)
INSERT INTO BlogCount VALUES('Alka','SQL',2008,14)

Once you are done with the above task, let’s think about how RANK() and the PARTITION BY clause are useful in the real world. Suppose you want a record set of all the bloggers, each with the record that has their highest blog total. What would you do? You might use a sub-query and/or a GROUP BY clause, but that gets tedious and a bit difficult. Here is an easier solution.
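For comparison, the sub-query and GROUP BY approach mentioned above might look something like the sketch below. It is not part of the original solution; the aliases b, MaxTotals and MaxTotal are names picked only for this example.

```sql
-- Sketch of the sub-query / GROUP BY alternative.
-- MaxTotals and MaxTotal are hypothetical names used for illustration.
SELECT b.BloggerName, b.Topic, b.[Year], b.Total
FROM BlogCount AS b
INNER JOIN
(
    -- Highest Total for each blogger
    SELECT BloggerName, MAX(Total) AS MaxTotal
    FROM BlogCount
    GROUP BY BloggerName
) AS MaxTotals
    ON  MaxTotals.BloggerName = b.BloggerName
    AND MaxTotals.MaxTotal    = b.Total
ORDER BY b.BloggerName;
```

Like RANK(), this returns every row that ties for the highest total, but the table has to be read twice and breaking a tie needs yet another level of sub-query.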
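First we will use RANK() with the PARTITION BY clause, and then we will filter the record set. PARTITION BY is similar to GROUP BY in that it divides the rows into groups, but it does not collapse them: every row is kept and the ranking restarts for each group. We want a separate ranking for each blogger, so “BloggerName” goes in the PARTITION BY clause, and the ORDER BY clause in the query below gives ranking 1 to the highest “Total” within each blogger.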
SELECT BloggerName,Topic,[Year],Total,
Rank() OVER (Partition by BloggerName Order by Total DESC) as 'Ranking'
FROM
BlogCount
After running the above query you will get a result set like this:
BloggerName Topic           Year        Total       Ranking
----------- --------------- ----------- ----------- --------------------
Alka        SQL             2007        18          1
Alka        .NET            2007        18          1
Alka        SQL             2008        14          3
Ritesh      SQL             2007        124         1
Ritesh      SQL             2008        124         1
Ritesh      .NET            2008        24          3
Ritesh      SQL             2006        17          4
Ritesh      SQL             2005        10          5

(8 row(s) affected)
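The gap in the ranking above (1, 1, 3) is how RANK() treats ties. As a side-by-side sketch, the query below adds DENSE_RANK() and ROW_NUMBER(), which are also available in SQL Server 2005. The column aliases RankNo, DenseRankNo and RowNo are just names picked for this example.

```sql
-- Compare the three ranking functions over the same partition and order.
-- RankNo, DenseRankNo and RowNo are hypothetical aliases for illustration.
SELECT BloggerName, Topic, [Year], Total,
       RANK()       OVER (PARTITION BY BloggerName ORDER BY Total DESC) AS RankNo,      -- ties share a rank, next rank is skipped
       DENSE_RANK() OVER (PARTITION BY BloggerName ORDER BY Total DESC) AS DenseRankNo, -- ties share a rank, no gap afterwards
       ROW_NUMBER() OVER (PARTITION BY BloggerName ORDER BY Total DESC) AS RowNo        -- always unique within the partition
FROM BlogCount;
```

For Ritesh's totals (124, 124, 24, 17, 10), RANK() gives 1, 1, 3, 4, 5, DENSE_RANK() gives 1, 1, 2, 3, 4, and ROW_NUMBER() gives 1 to 5, with the order of the two tied rows not guaranteed.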
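Now we may want to see only the highest total for each blogger. We can’t filter on the ranking directly in a WHERE clause of the above query, because windowed functions such as RANK() are allowed only in the SELECT and ORDER BY clauses, so we wrap the query in a derived table like this: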
SELECT *
FROM
(
SELECT BloggerName,Topic,[Year],Total,
Rank() OVER (Partition by BloggerName Order by Total DESC) as 'Ranking'
FROM
BlogCount
) Testing
where Ranking<2
Here is the output:
BloggerName Topic           Year        Total       Ranking
----------- --------------- ----------- ----------- --------------------
Alka        SQL             2007        18          1
Alka        .NET            2007        18          1
Ritesh      SQL             2007        124         1
Ritesh      SQL             2008        124         1

(4 row(s) affected)
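The same filter can also be written with a common table expression (CTE), which SQL Server 2005 supports, instead of a derived table. This is only an alternative sketch; the CTE name RankedBlogs is made up for this example.

```sql
-- Same filter as above, written as a CTE instead of a derived table.
-- RankedBlogs is a hypothetical name. If this is not the first statement
-- in the batch, the statement before WITH must end with a semicolon.
WITH RankedBlogs AS
(
    SELECT BloggerName, Topic, [Year], Total,
           RANK() OVER (PARTITION BY BloggerName ORDER BY Total DESC) AS Ranking
    FROM BlogCount
)
SELECT BloggerName, Topic, [Year], Total, Ranking
FROM RankedBlogs
WHERE Ranking = 1;
```

Both forms should return the same four rows; which one to use is mostly a matter of readability.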
Look at the above result set: we got only 4 records out of 8, the ones with the highest “Total” for each blogger. See Ritesh’s records #3 and #4. Both belong to SQL and have the same total, so in that case I may wish to see only the one record for year 2008. Here, year is the tie breaker. Use the query below to break the tie.
SELECT *
FROM
(
SELECT BloggerName,Topic,[Year],Total,
Rank() OVER (Partition by BloggerName Order by Total DESC, [year] DESC) as 'Ranking'
FROM
BlogCount
) Testing
where Ranking<2
Now you will get the following output. Alka still has two rows, because both of her top records have the same total and the same year.
BloggerName Topic           Year        Total       Ranking
----------- --------------- ----------- ----------- --------------------
Alka        SQL             2007        18          1
Alka        .NET            2007        18          1
Ritesh      SQL             2008        124         1

(3 row(s) affected)
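Alka still has two rows in this output because her top records tie on both total and year. If exactly one row per blogger is needed, one option is ROW_NUMBER() with an extra tie breaker. The sketch below assumes Topic is an acceptable last tie breaker; the names Numbered and RowNo are invented for this example.

```sql
-- One row per blogger, even when Total and [Year] both tie.
-- Assumption: Topic is an acceptable final tie breaker.
-- Numbered and RowNo are hypothetical names for illustration.
SELECT BloggerName, Topic, [Year], Total
FROM
(
    SELECT BloggerName, Topic, [Year], Total,
           ROW_NUMBER() OVER (PARTITION BY BloggerName
                              ORDER BY Total DESC, [Year] DESC, Topic ASC) AS RowNo
    FROM BlogCount
) AS Numbered
WHERE RowNo = 1;
```

ROW_NUMBER() never repeats a number within a partition, so the filter keeps a single row for each blogger. Which of two fully tied rows wins depends on the ORDER BY list, so it is worth listing enough columns to make the choice deterministic.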
How easy is this? It is much better than using a sub-query and GROUP BY.
Reference: Ritesh Shah

1 comment:

Ritesh Shah said...

I was asked a question by Raju on my other blog, but I am answering it here because that WordPress blog is dead now.

http://riteshshah.wordpress.com/2009/03/03/use-of-rank-and-partition-by-clause-in-sql-server-2005/

Here is the answer:

create table #Temp
(
Name varchar(10),
[Time] int,
age int
)
GO

insert into #Temp
select 'Steve', 12, 33 UNION ALL
select 'Tim', 34 ,28 UNION ALL
select 'Mark', 22 ,37 UNION ALL
select 'Tom', 21 ,30 UNION ALL
select 'Cliff', 13, 33 UNION ALL
select 'Vini', 17, 28 UNION ALL
select 'Matt', 10 ,28 UNION ALL
select 'Ben', 9 ,29 UNION ALL
select 'Brandon', 15, 14


select Name,[Time],Age,row_number() over(Order By [Time]) as RowNum from #Temp