Using SSIS Pivot Transform When Pivot Column is Not Available

Pivoting is typically used in DW solutions to load multi-valued dimensions. In an earlier post, I demonstrated how SSIS Pivot component works and the advantages of using SSIS Pivot component over its equivalent T-SQL Pivot statement. In this post, I explain a technique to Pivot data when Pivot Column is not explicitly available. Continue Reading

SSIS Pivot Transform vs T-SQL Pivot

Pivot transformation converts multiple rows of data into a single record by using an input column that has a finite set of distinct values. Pivoting is typically used in DW solutions to handle multi-valued dimensions. In SQL Server, Pivoting is done using either      

  1.  T-SQL Pivot statement or
  2.  SSIS Pivot Transformation

  
There are two significant benefits that stand-out in favour of using SSIS Pivot Transformation for ETL scenarios. The remainder of this post explains just that. Continue Reading

Alternatives to SSIS SCD Wizard Component

SSIS comes with an out-of-box SCD Wizard to handle Type 1 and Type 2 Slowly Changing Dimensions (SCD) which is a fundamental ETL requirement. However the SCD wizard component has some serious drawbacks – both from operational and functional perspectives that make it unusable for practical purposes. A good summary on the shortcomings of SCD Wizard component can be found here

Several workarounds have evolved over time and in this post I would like explore the different alternative options to handle Type 1 and Type 2 SCD without using the out-of-box SCD Wizard Component.  

Continue Reading

SSIS Package Deployment Method – File System or Server ?

After many implementations, I have a strong inclination and preference to deploy SSIS packages to File System rather than to the Server. File System method of deployment is clean and elegant. It encourages package reusability and in some instances it’s the only feasible option. Here are two common ETL scenarios to support my view: Continue Reading

Row Insert from SSIS package Vs Transact-SQL Statements

SQL Server writes transaction logs sequentially. The size of transaction logs when using sql INSERT statement depends on the Recovery Model of the database and the type Transact-SQL statements used. Database recovery model can be – Simple (min), Bulk-Logged (intermediate) or Full (max). While this is true for Transact-SQL INSERT statements, records inserted through an SSIS Data Flow Destination (like SQL Server Destination) exhibit a different logging behaviour. This is because the INSERT statements issues from Integration Services engine is always minimally logged and it bypasses the recovery model of the database engine. This results in significant performance improvement by reducing the sequential disk writes, especially when dealing with huge recordsets which is typical in a data warehouse ETL scenario. That’s another reason why SSIS packages perform better over Transact-SQL Stored Procedures for ETL purposes. Continue Reading

How to Create Raw File for Use as SSIS Data Source?

The use of raw files as SSIS data source is highly recommended for huge volume of data due to its superior read/write performance. Raw files store data in a very basic format that requires almost zero translation and parsing. This enables faster data reads/writes when compared to Flat Files and even faster than Database tables. Continue Reading

SSIS: Credit Card Number Validator (CCNV)

CodePlex
CodePlex_hstd

Credit Card Number Validator (CCNV)  is a Custom SSIS Data Flow Transformation Component that determines whether the given input is a valid credit card number or not. CCNV takes String/Numeric input and performs a Synchronous, Non-Blocking transformation using Luhn Mod 10 algorithm and validates whether the input is a valid credit card number or not. The output is true if the input is a valid credit card number and false otherwise. CCNV08 is compatible with SQL Server 2008 while CCNV05 is compatible with SQL Server 2005. Contnue Reading

Debugging Custom SSIS Components

 CodePlex

Custom SSIS Components consists of Design-time methods and Run-time methods. Debugging a Design-time method is different from a Run-time method. This post explains the techniques to debug Design-time and Run-time methods using the Custom Data Flow Component, Credit Card Number Validator as an example.

Continue Reading…

Custom SSIS Data Flow Component – Credit Card Number Validator (CCNV)

CodePlex

SQL Server Integration Services (SSIS) provides a wide range of out-of-box components to perform almost any ETL task. In addition to out-of-box components, the object model of Integration Services allows you to create re-usable custom components.  Different types of custom components can be developed, including:

  • Custom tasks.
  • Custom connection managers.   Connect to external data sources that are not currently supported.
  • Custom log providers.    Log package events in formats that are not currently supported.
  • Custom enumerators.   Support iteration over a set of objects or values formats that are not currently supported.
  • Custom Data Flow Components.   Can be configured as sources, transformations, or destinations.

In this post, step-by-step instruction is provided to create a Custom SSIS Data Flow Component called Credit Card Number Validator (CCNV). This component validates Credit Card Numbers using Luhn Mod 10 Algorithm. There are two versions of this component available for download from CodePlex:

  1. CCNV08, which is SQL Server 2008 version of Credit Card Number Validator – download
  2. CCNV05, which is SQL Server 2005 version of Credit Card Number Validator – download

Continue Reading…