Skip to content

David Zhang's BI Blog

Problems and solutions identified in my BI practise

That is what i find out ,

if you extract time  ( say, start, end)  stored as float from sql server 2000  database ,

date(end)-date(start)   returns  the time difference measured by number of days in float format   (say.  0.0625  = 90m)

(date(end)-date(start)) * 24  returns the hour difference in float format

(date(end)-date(start))*24*60 returns the minute difference in float format

Both Qlikvew and SQl Server can store datetime ( datetime of sql server queals to timestamp of Qlikview) in float format. when change the datatype from float to datetime , there will be 2 days’ difference .

for example , in sql server , select CAST ( 40358 as datetime)   returns 2010-07-01 00:00:000,

but in Qlikvew,   expression timestamp( 40358)  = 2010-06-29 12:00:000 am.

Qlikview  gets he same result as what Excel gets

Scenario:   we are using Qlikview version 9 and we need to extract data from a .mdb file of MS 2003 database which is protected by a database level password.

 

Problem:  if we are using 64bit Qlikview, you may not be able to  find any   64-bit compatible ODBC driver for Microsoft Access  in Qlikeview’s ‘Data Link Properties’ window. There is  a Microsoft Access Database Engine 2010 Redistributable, but it is for  reading  data from and writing  data to Office 2010 system files such as Microsoft Access 2010 (mdb and accdb) files and Microsoft Excel 2010 (xls, xlsx, and xlsb) files. Further more,.Alought QlikView works with both 32-bit and 64-bit ODBC drivers. the 32-bit version of QlikView will only work with the 32-bit ODBC

drivers and 64-bit QlikView will only work with the 64-bit ODBC drivers. Therefore we have to use the 32-bit version of QlikView now.

 

Solution:

 

 To check what 32-bit ODBC drivers you have installed, Run C:\Windows\SysWOW64\odbcad32.exe

 

If you don’t have Microsoft SQL ODBC Driver  and Microsoft Jet ODBC Driver,

download it from here.

 

 If the  .mdb file has a database level password , then you have to use Microsoft Jet ODBC Driver. Following the following steps

 

  1. Start  32 bit Qlikview,  and go to  Edit Script, on the data tap of  Edit Script Window, make sure  OLDB in the database section is checked , then Click Connect, The ‘Data Link Properties’ window will open.
  2. On the provider tap, choose Microsoft Jet ODBC Driver.
  3. On the Connection tap, select or enter a database name and path
  4. On the All tap,  input the database password
  5. Click button to till Create Select Statement windows open.

 Qlik1

 

 

 

 reference links

  1. http://social.msdn.microsoft.com/forums/en-US/netfx64bit/thread/dfbd0406-bc69-436e-998b-7fc454012f91/

http://msdn.microsoft.com/en-us/library/cc280478.aspx

You can use an Excel connection in 64bit environment. Go to the package configuration properties.

Debugging -> Debugging Options -> Run64BtRuntime -> change to False In addition if you use SQL Agent go to the job step properties and then check the 32 bit runtime.

Problem:

I created a package that extract data from an Excel file into table in a database with BIDS 2008. When I am trying to preview the the data from Excel Source in Data Flow Task, I get the following error:

===================================

There was an error displaying the preview. (Microsoft Visual Studio)

===================================

Index and length must refer to a location within the string.
Parameter name: length (mscorlib)

Why:

Space in Excel worksheet name

Solution: 

deleting Spaces in Excel worksheet name

http://www.esnips.com/doc/3bb26a48-0608-4ae9-ac85-efe1d8d545cf/Microsoft-Press-Inside-Microsoft-Sql-Server-2005-T-Sql-Querying-Apr-2006

  1. Creat two variables with string data type  in the package,set the default values as xml file path and xml schema path
  2. In  CONTROL FLOW tab, select the data flow step that contains the XML Source step (Do not go into the dataflow, it must be done in control flow).
  3. Bring up its properties (right-click > Properties)
  4. Click the elipsis button next to EXPRESSIONS
  5. Select the [XML Source].[XMLSchemaDefinition] property then hit the elipsis button to the right of the Expression textbox.
  6. Drag and Drop your variable that holds your XSD path into the expression box, you should be able to evaluate expression here.
  7. Do similar to XML source file
  8. Click OK
  9. Click OK again
  10. Done

Surrogate Int keys   are used as identifiers for dimensional members and enable us to manage slowly changing dimensions.

SSIS does not contain a built in component for generating surrogate keys but there is still a mechanism for doing it – the Script Component. The Script Component allows us to modify the data in a data flow path using managed code and we can use it to generate surrogate keys.

1. Creating the transformation

When you drag a script component from the toolbox to the design surface you will be prompted as to whether the component is to be a source adapter, a destination adapter or a transformation. Select Transformation.

Select Script Component Type

2. Configuring the transformation metadata

Your script component should contain 1 input and 1 output. This is to be a synchronous transformation so the SynchronousInputID property of the output must be the same as the ID property of the input. The output should have 1 column which in this instance I have called SK.

In the case of a synchronous transformation no output buffer is required; the input buffer is used by the output. Therefore, any columns that we create on our output actually appear as if they are part of the input buffer as we will see later in the script

A good rule of thumb for synchronous script transformations is that you should only add columns to the output which do not already exist in your input because the input columns will flow through the script transformation in any event

Inputs and Outputs

Note in the screenshot above the SynchronousInputID property of the output is set to the ID of the Input thus indicating that this is a synchronous transformation.

3. Building the script

We are going to use managed code to populate SK with an integer that acts as a surrogate key.

The code looks like this:

mports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute> _
<CLSCompliant(False)> _
Public Class ScriptMain
Inherits UserComponent
Dim counter As Integer

Public Sub New()
counter = 0
End Sub
Public Overrides Sub PreExecute()
MyBase.PreExecute()
End Sub

Public Overrides Sub PostExecute()
MyBase.PostExecute()
End Sub

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

‘ Add your code here

counter += 1
Row.SK = counter

End Sub

End Class

There are four things to note about what is going on here:

  • We have declared a variable called counter that will contain a value to be outputted as the surrogate key.
  • We have initialized counter within the New() method which gets called just once on each execution
  • We increment counter within the Input_ProcessInputRow() method which gets called for each row in the input buffer
  • We are outputting the contents of the input buffer to the component output along with our surrogate key value from counter

When use sharepoint desinger to  back up , restore  or migrate sharepoint site collections,  contents may be lost . The following is a good solution.

 ”C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN\stsadm.exe” -o export -url http://mydomain1/sites/mysite1  -filename C:\mypath\all.dat -overwrite -includeusersecurity -versions 4

 ”C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN\stsadm.exe”  -o deletesite -url http://mydomain2/sites/mysite2

“C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN\stsadm.exe” -o createsite -url http://mydomain2/sites/mysite2   -ownerlogin administrator -owneremail yongru.zhang@utoronto.ca

“C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\BIN\stsadm.exe” -o import -url http://mydomain2/sites/mysite2  -filename C:\david\sharepoint\all.dat -includeusersecurity

The following is a step by step demo, which i delivered to teach other ETL people in my company, 

Create a ETL package, which move data in one table (Service_Provider_Master) to another table (temp_Sevice_Proivider_Master) in the local database

 

Step.1 , On Local Server (WKYYZ23-ARMS20),  Create a new  ETL project, save the solution in C:\David\David_Deploy_Example ,  rename the default  package as table_to_table_data_move.dtsx.

 

Step.2 Create a new connection manager (WKYYZ23-ARMS20.airportBIDM_ODS) to connect to database airportBIDM_ODS

 

Step.3 Create a dataflow task which move data in one table (Service_Provider_Master) to another table (temp_Sevice_Proivider_Master) in  the  local database (airportBIDM_ODS_15jun09-Full) .

 

Create the configuration file for the package which you want to deploy to another server.

Step.4   In the “Control Flow” tab, right click on the design surface and select “Package Configurations…” from the pop-up menu.

Step.5  . Tick the check box “Enable package configurations”

Step 6. Click the “Add…” button, to open the “Package Configuration Wizard”

Step 7 . In the “Configuration Type” combo box select “XML Configuration File”

Step 8 . Click “Browse…” to select a location for the XML configuration file

Step 9. Specify a file name in the “Select Configuration File Location” dialog box and click Save. Here I selected C:\David\David_Deploy_Example\table_table_data_move_config.dtsConfig

Step 10 . Click “Next” in the package configuration wizard

Step 11 . In the “Objects” tree structure, expend folder “connection managers”…… “WKYYZ23-ARMS20” ……properties, tick the objects “connection string”  for which you want to generate the configuration file, and click “Next”,

Step 12 . Provide a name for this configuration,       table_to_table_data_move_config_file and click “Finish”

Step 13 In Windows explore, go to C:\David\David_Deploy_Example and open file “table_table_data_move_config.dtsConfig” with any XML editorL:

 

<?xml version=”1.0″?>

<DTSConfiguration>

       <DTSConfigurationHeading>

              <DTSConfigurationFileInfo GeneratedBy=”WKYYZ23-ARMS20\Arms” GeneratedFromPackageName=”table_to_table_data_move” GeneratedFromPackageID=”{1EB78EDA-E5D7-4843-90CE-4203730E9288}” GeneratedDate=”11/5/2009 1:40:22 PM”/>

       </DTSConfigurationHeading>

       <Configuration ConfiguredType=”Property” Path=”\Package.Connections[WKYYZ23-ARMS20.airportBIDM_ODS].Properties[ConnectionString]” ValueType=”String”>

              <ConfiguredValue>Data Source=WKYYZ23-ARMS20;Initial Catalog=airportBIDM_ODS_15jun09-Full;Provider=SQLNCLI10.1;Integrated Security=SSPI;Application-table_to_table_data_move-{77EF74ED-52CF-4B5A-B785-5453E3462CE9}WKYYZ23-ARMS20.airportBIDM_ODS_15jun09-Full;Auto Translate=False;</ConfiguredValue>

       </Configuration>

</DTSConfiguration>
 
 

Step 14  Copy the parckage and configuration file to antoehr machine (AC-6F689A61BAA3) on which SQL Server 2008 is installed and the database we want to use  is airportBIDM_ODS_backup

 

Step 15, On the target machina   Open the configuration file and change the ConfiguredValue as following
<ConfiguredValue>Data Source= AC-6F689A61BAA3;Initial Catalog= airportBIDM_ODS_backup;Provider=SQLNCLI10.1;Integrated Security=SSPI;Application-table_to_table_data_move-{77EF74ED-52CF-4B5A-B785-5453E3462CE9}WKYYZ23-ARMS20.airportBIDM_ODS_15jun09-Full;Auto Translate=False;</ConfiguredValue

Appendix

Guidelines on using XML configuration files in SSIS:

Connection and XML configuration file naming conventions: When creating a new OLE DB database connection, use the following naming convention for your connections:

SERVERNAMEdatabasename

Note: You will have to rename your connection in the connection manager, after the connection is created.

For example, a connection to a server named LDNSrv1 and MyDB1 database should look like:

LDNSRV1mydb1

Notice that the server name is in CAPS and the database name is in lower case. However, this is just a guideline, and feel free to go with the naming convention that you are comfortable with.

Remember to follow the same casing standard everywhere, as SSIS connection names are case sensitive.

Use the same naming convention for the XML configuration files. Make sure your XML configuration files have the .dtsConfig extension. By default SSIS looks for this extension. But any other file extension will work too.

For example, the XML configuration file for Nyk1 server and Stocks database should be named:

NYK1stocks.dtsConfig

Managing XML configuration files: Store all the XML configuration files in a standard location on all servers. For example, at F:\UDL folderCreate a separate XML configuration file for each server and database combination. This approach is better than storing multiple server configurations in a single XML file. The reason being, if you need to edit connectivity information for a single server, you will directly edit the XML file for that server only. This reduces the chances of accidentally changing information for other servers.

 

If you are connecting to more than one server in your SSIS package, add the relevant XML configuration file for each server and database combination. You can add multiple configuration files from the “Package Configurations” screen, as shown in the below image:

For example, if your package is connecting to Srv1.db1 and Srv2.db2, then you would add the following XML configuration files to your package:

SRV1db1.dtsConfig
SRV2db2.dtsConfig

Always reuse existing XML configuration files, if one exists for your server and database combination. This is the sole reason behind configuration files, as you will have to update connectivity information in only one place. If there isn’t a configuration file already created for your server and database combination, create a new configuration file using the below templates and follow the described naming convention.

For Windows authenticated connection to SQL Server:

 
<DTSConfiguration>
               <Configuration ConfiguredType="Property" Path="\Package.Connections[SERVERNAMEdbname].Properties[ConnectionString]" ValueType="String">
                              <ConfiguredValue>Data Source=[SRVNAME];Initial Catalog=[DBNAME];Provider=SQLNCLI.1;Integrated Security=SSPI;Auto Translate=False;</ConfiguredValue>
               </Configuration>
</DTSConfiguration>
 

For SQL Server authenticated connection to SQL Server:

 
<DTSConfiguration>
               <Configuration ConfiguredType="Property" Path="\Package.Connections[SERVERNAMEdbname].Properties[ConnectionString]" ValueType="String">
                              <ConfiguredValue>
                                              Data Source=[SRVNAME];Initial Catalog=[DBNAME];User ID=[LOGIN];password=[PWD];Provider=SQLNCLI.1;Persist Security Info=True;Auto Translate=False
                               </ConfiguredValue>
               </Configuration>
</DTSConfiguration>
 

Note: In the above templates, square brackets [] are not required except around the name of the connection. Do not surround server name, database name, user name and password with square brackets

Populating global variables using XML configuration files: When dealing with Active directory shares or file paths (for connections to source and target text files or data files), do not hardcode the domain names or machine names. Instead, use variables. These variables can be populated from the XML configuration files. Once a variable is created in the SSIS package, its value can be populated from the XML configuration file. For example, here’s what you need in the configuration file, to populate the variable called Domain with a value of “NykLive”:

 
<DTSConfiguration>
               <Configuration ConfiguredType="Property" Path="\Package.Variables[User::Domain].Properties[Value]" ValueType="String">
                               <ConfiguredValue>NykLive</ConfiguredValue>
               </Configuration>
</DTSConfiguration>

Note: Variable names are case sensitive too. So make sure you follow the same case when referring to the variables in the configuration files, script tasks etc.