Data transformation services: Data sources and deployment of packages

I notice that when deploying a package that uses data sources behind its connection managers, the deployed package has the connection string built within the package that gets deployed.

From a previous post from Kirk - "What I'd suggest is that you create a Data source per machine and have all the connections reference it. You'll still need to create a connection manager in each package, but they will all reference the same data source. Moving to different machines is no problem because you can have a "machine resident" data source file that properly points to your "ETL Server"."

How are data sources defined for a machine? Is this simply via one of the configuaration options (i.e. an .xml file) that are loaded upon runtime by a package that is built to reference that config, which then trumps the connection string defined within the package?

ThanksDatasources are a design time feature. The connection manager references the DS and caches the connection string so that even if the datasource is no longer available, the connection will still be viable.

For cases where you need to deploy to a location where you will not be opening the package in the designer, and you need the connection string to change, you'll need to configure the connection manager in the package. The data source will only modify the connection in the designer.

The point of the earlier post was that if you're designing multiple packages and you wish to have all the connections in those packages pointing to the same server, you can create datasource and reference it by the DataSourceID in all the packages. If the server changes, you can modify one data source and it will modify all the connections for you. Next time you open the packages, they will pick up the new connection string. Moving the package to a different machine is also no problem so long as there is a data source available on the target machine with the same DataSourceID.

Configurations are your first line of defense for making packages location independent for easier deployment and your best option for packages that won't be opened in the designer. The DataSource approach is just a one off thing you can do in some cases.

K|||

>> Moving the package to a different machine is also no problem so long as there is a data source available on the target machine with the same DataSourceID.
>> Configurations are your first line of defense for making packages location independent for easier deployment and your best option for packages that won't be opened in the designer.

When I'm talking about deployment, I'm thinking test/prod etc where I'll never need to open it in Designer. That's what the dev env is for. Isn't it?

Given that, package configurations look like the "only option" and not just a "best option", and "only line of defence", not "first line of defense". Isn't that true? Basically, if I don't plan on opening my packages in Designer, then I have to go through the painful process of configuring each package separately and entering values for each package at deployment time. Is there an alternative?

Thanks,

RV.

|||

We are required to do a silent install in production at client sites, ie. no DBA or designer access. In this case, we develop in our environment and may would have to deploy each package with its connection strings. This may impact our design to minimize the number of packages.

Any thoughts?

|||

Use configurations. For each site you will need to adjust the configuration data store to hold the correct values for that site, and how you that depends on the store you use. You could write a wizard where the person running the install simply supplies the values and these are written to the correct location. Your packages will already be configured to read their information from configurations so everything should be sweet. Indirect configurations may be easier, and the values set by the install wizard.

Some links-

http://blogs.conchango.com/jamiethomson/archive/2005/11/02/2342.aspx

http://blogs.conchango.com/jamiethomson/archive/2005/10/31/2336.aspx

http://msdn2.microsoft.com/en-us/library/ms141682.aspx

http://www.sqljunkies.com/WebLog/knight_reign/archive/2004/12/07/5445.aspx

|||

I went through all this frustration myself and eventually settled on a solution that works very well. In a nutshell, I store my connection strings for all my connection managers in the database.

1) All my connection managers are of type OLEDB

2) Every package that has a connection string has package configurations enabled with a Configuration entry of type SQL Server that points to a Configuration table in the database. This entry is the way the connection manager gets the connection string at runtime. In the Configuration entry, it is the Configuration filter setting which will differentiate the different entries in the database pertaining to different connection strings.

3) My deployment scripts insert the correct database connection info into the configuration database for each connection. The connection strings are dependent on the environment I'm deploying to. The connection managers in the solution will at runtime grab the value from the database via the package configurations on the package the connection managers live on.

4) I deploy using dtsutil and make it silent. No need to not have it silent.

Thursday, March 29, 2012

Data sources and deployment of packages

No comments:

Post a Comment

Data transformation services

Blog Archive

About Me