Skip to content

David Zhang's BI Blog

Problems and solutions identified in my BI practise

Archive

Category: SSIS

You can use an Excel connection in 64bit environment. Go to the package configuration properties.

Debugging -> Debugging Options -> Run64BtRuntime -> change to False In addition if you use SQL Agent go to the job step properties and then check the 32 bit runtime.

Problem:

I created a package that extract data from an Excel file into table in a database with BIDS 2008. When I am trying to preview the the data from Excel Source in Data Flow Task, I get the following error:

===================================

There was an error displaying the preview. (Microsoft Visual Studio)

===================================

Index and length must refer to a location within the string.
Parameter name: length (mscorlib)

Why:

Space in Excel worksheet name

Solution: 

deleting Spaces in Excel worksheet name

  1. Creat two variables with string data type  in the package,set the default values as xml file path and xml schema path
  2. In  CONTROL FLOW tab, select the data flow step that contains the XML Source step (Do not go into the dataflow, it must be done in control flow).
  3. Bring up its properties (right-click > Properties)
  4. Click the elipsis button next to EXPRESSIONS
  5. Select the [XML Source].[XMLSchemaDefinition] property then hit the elipsis button to the right of the Expression textbox.
  6. Drag and Drop your variable that holds your XSD path into the expression box, you should be able to evaluate expression here.
  7. Do similar to XML source file
  8. Click OK
  9. Click OK again
  10. Done

Surrogate Int keys   are used as identifiers for dimensional members and enable us to manage slowly changing dimensions.

SSIS does not contain a built in component for generating surrogate keys but there is still a mechanism for doing it – the Script Component. The Script Component allows us to modify the data in a data flow path using managed code and we can use it to generate surrogate keys.

1. Creating the transformation

When you drag a script component from the toolbox to the design surface you will be prompted as to whether the component is to be a source adapter, a destination adapter or a transformation. Select Transformation.

Select Script Component Type

2. Configuring the transformation metadata

Your script component should contain 1 input and 1 output. This is to be a synchronous transformation so the SynchronousInputID property of the output must be the same as the ID property of the input. The output should have 1 column which in this instance I have called SK.

In the case of a synchronous transformation no output buffer is required; the input buffer is used by the output. Therefore, any columns that we create on our output actually appear as if they are part of the input buffer as we will see later in the script

A good rule of thumb for synchronous script transformations is that you should only add columns to the output which do not already exist in your input because the input columns will flow through the script transformation in any event

Inputs and Outputs

Note in the screenshot above the SynchronousInputID property of the output is set to the ID of the Input thus indicating that this is a synchronous transformation.

3. Building the script

We are going to use managed code to populate SK with an integer that acts as a surrogate key.

The code looks like this:

mports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute> _
<CLSCompliant(False)> _
Public Class ScriptMain
Inherits UserComponent
Dim counter As Integer

Public Sub New()
counter = 0
End Sub
Public Overrides Sub PreExecute()
MyBase.PreExecute()
End Sub

Public Overrides Sub PostExecute()
MyBase.PostExecute()
End Sub

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

‘ Add your code here

counter += 1
Row.SK = counter

End Sub

End Class

There are four things to note about what is going on here:

  • We have declared a variable called counter that will contain a value to be outputted as the surrogate key.
  • We have initialized counter within the New() method which gets called just once on each execution
  • We increment counter within the Input_ProcessInputRow() method which gets called for each row in the input buffer
  • We are outputting the contents of the input buffer to the component output along with our surrogate key value from counter