Merge Join Transformation Task in SSIS 2008

Tuesday, August 3, 2010 |

SSIS is the very crucial tool for Data Warehouse and collecting data from different sources and merge it into single master source is the main task in Data Warehouse.  Merge Join Transformation task is one of the useful tool in SSIS to achieve this. Let us see how it works.

Well, before we start developing package, let us create two different data sources where data come from in single source. We are also going to create one blank table which will hold the data come from two different sources.

Our first source will be SQL Server table.

Our second source will be Flat file which is comma separated.

And the destination of the both sources would be in again SQL Server single table.

Script for First Source:

Create Table OrderMaster
(
      OrderID Varchar(5),
      ClientCode Varchar(10),
      ProjectName varchar(25),
      OrderDate DateTime
)
INSERT INTO OrderMaster
SELECT 'A1001','CHEM02','NY WaterField',GETDATE() UNION ALL
SELECT 'A1002','ACCU01','Plainfield Soil',GETDATE()
GO

Create table in SQL Server database and insert data in that table via above given script.

Second Source:

Create one text file in D drive with name OrderDetails.txt with following data.

OrderID,SampleNumber,Matrix
A1001,A1001-01,Water
A1001,A1001-02,Soil
A1002,A1002-01,Water
A1002,A1002-02,Water

First row in the above data is column name.

Now, let us create destination table.

Create Table OrderMasterAndDetail
(
OrderID Varchar(5),
ClientCode Varchar(10),
SampleNumber Varchar(10),
Matrix Varchar(10)
)
GO

Ok, now we are ready to start developing SSIS package. Open new project in BIDS (Business Intelligence Development Studio). Drag “Data Flow” task from tool box to your “Control Flow” tab and double click on it to configure. As soon as you will double click on that, you will be redirected to the “Data Flow” tab. Now, drag  “ADO Net Source” and “Flat File Source” task to your “Data Flow”. Let us now configure them.
Double click on “Ado Net Source” to configure it and select your database and table to configure it. Look at the below given Image for more details.

Now, double click on “Flat File Source” to configure it. From the “Flat File Source Editor” please click on “New” button to create “Flat File Connection Manager”. Now from “Flat File Connection Manager Editor” give name “OrderDetails” in “Connection Manger Name” property. By clicking on “Browse” button, select your “OrderDetails.txt” file from D drive. Don’t forget to check on the CheckBox “Column name in first data row” as our first row in text file is our column header.  For more details, look at image below.


Now drag “Sort Transformation” task from tool box to your data flow tab and repeat this step to have two “Sort Transformation” task. One is for “Ado Net Source” and second one is for “Flat File Source”. Join both sort task with appropriate green arrow and double click on it one by one to configure it and make ascending sort on “OrderID” field in both the sort task.

Now, drag the merge join task from tool box and drop it below both sort tasks. Connect green arrow from both sort task to “Merge Join” task and double click on “Merge Join Transformation” task to configure it. “OrderID” column in both the source should be work as JOIN key.  Select “OrderID”, “ClientCode” from first sort and “SampleNumber” and “Matrix” from second sort as shown in screen capture below.

Now, you are having data from both the source to your “Merge Join”. You have to send that data to specified location, in our case it is one SQL Server table we have already created with the name “OrderMasterAndDetail”. Since we want to send our merged data to SQL Server, let us now drag “Ado Net Destination” task to our dataflow and connect it to our merge join. Now, this is the time to configure “Ado Net Destination” so double click on it and select your database connection and table named “OrderMasterAndDetail”, look at the screen shot below for more information.

So, finally you are ready to run the package, hit F5 to run that and if every task seems green as below screen capture than you are all set and you have already received data in your destination table.

Check your “OrderMasterAndDetail” table in SQL Server whether you have actually received any data there or not.

Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com

0 comments: