Wednesday, April 18, 2012

SSIS: Excel Source - Specifying a Worksheet & Range

Recently I've been spending some of my time working on a data warehousing project. We're building the data warehouse using the complete Microsoft stack: SQL Server 2008 R2 database engine, analysis services, reporting services and integration services for ETL (also throw in some SharePoint 2010 for presentation and BizTalk 2010 for some of the trickier to interact with data sources).

Although a few years back I spent the majority of my time in the BI space I've been concentrating on systems integration for the last few years, so it's been an interesting refresher on BI concepts and the tools available today.

One of the data sources we needed to load data from was an Excel spreadsheet consisting of several worksheets, each worksheet containing similar data for a different year. Unfortunately the top few rows of each worksheet had "header" data that we're not interested in, but which the default Excel data source in SSIS chokes on.

Fortunately, you can specify a worksheet and range using the OpenRowset property: Just specify it in the format [Sheet]$[Start-Cell]:[End-Cell]. So for instance if the worksheet was named "2006" and the data we're interested in was in the cell range A9 through to U300, we would specify a value for the OpenRowset property of "2006$A9:U300". Combine this with the capability to specify a variable-based expression for the OpenRowset property and a ForEach container, and we can loop over each worksheet in the spreadsheet and import the data. How cool is that?!

1 comment:

  1. Dave, Thanks for the Awesome post.

    I have the sheet name as "ESC PRISM Document Approval Rou" on the excel, i need to skip first 4 rows, so my select is like "select * from [ESC PRISM Document Approval Rou$A4:B] " . But it gets me an error . Any help is much appreciated.