Posted by on July 3, 2016

As a Microsoft Certified Solutions Expert in Business Intelligence, I work a lot with SQL Server Integration Services (SSIS).  Most of the SSIS code that I encounter performs tasks for which SSIS is not well suited.

Microsoft says that SSIS is a niche tool.  When SSIS was first envisioned, Fourth Generation Programming Languages (4GL) were a prominent paradigm.  The idea behind 4GLs was that programs could be developed more efficiently in specialised, graphical, domain-specific languages.  But 4GLs failed to deliver on that core promise, largely because they are opaque, which makes them difficult to white-box test and peer-review. It’s also very difficult to innovate in a 4GL.

Many of my peers regard SSIS as the tool of last resort.  I’ve won work by being willing to consider that SSIS might be the right tools when others took a harder line.  The problem with SSIS is that it is harder write and test than stored procedures or C#.  Visual Studio database projects can run static code analysis (see: TSQL smells).  tSQLt is great for unit testing stored procedures.  There are even unit test coverage tools.  SSIS doesn’t have any of these things.  I literally have to discover unused variables and untested features by staring at the code.  Not to mention the defects that come from concepts that don’t even exist in ordinary programming languages (disabled components come to mind).

SSIS should not be used when:

  1. The source and destination are on the same SQL Server Instance.
  2. A BULK INSERT works (which usually true, especially now that JSON support has been added in SQL 2016).
  3. Data is streaming rather than batch.
  4. the source is a database to which Change Data Capture can be applied.

I’m not saying that SSIS can’t be used for these jobs.  I am saying that there are better ways to do these jobs.

Which leaves the use cases for which SSIS is well suited:

  1. Data that isn’t in a flat-file, XML, or JSON format.
  2. Databases that don’t support Change Data Capture.

Most warehouses spend most of their effort in ETL.  Most warehouses can be more efficient with their ETL development effort by not using ETL tools where they aren’t necessary.

Posted in: Theory


Be the first to comment.

Leave a Reply