Tech in the 603, The Granite State Hacker

Using Client Certs to Pull Data from WCF in SSIS Data Flow Transform Script

I’ve recently had the opportunity to brush off my SSIS skills and revisit this toolset.   In my most recent usage, I had a requirement to use SSIS to pull data from a WCF web service that was a) using the net.tcp protocol, and b) used transport security with a client X.509 certificate for authentication.

This was fun enough by itself.  Configuring WCF tend typcially to be non-trival even when you don’t have to tweak app.config files for SQL SSIS services.  One of my goals, in fact, was to avoid having to update that, meaning I had to put code in my SSIS Script block in the data flow to configure my channel & security & such.

Luckily, I was able to find examples of doing this with wsHttpBinding’s, so it wasn’t a stretch to tweak it for netTcpBinding with the required changes to support certificate authenticated transport security.

Here’s the code…

using System;
usingSystem.Data;
usingMicrosoft.SqlServer.Dts.Pipeline.Wrapper;
usingMicrosoft.SqlServer.Dts.Runtime.Wrapper;
usingSystem.ServiceModel;
usingSC_13defb16ae45414dbac17137434aeca0.csproj.PaymentSrv;


[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    ChannelFactory<IProfile> channelFactory;
    IProfileclient;
    public override voidPreExecute()
    {
        base.PreExecute();

        boolfireAgain = false;
        this.ComponentMetaData.FireInformation(0, “Pull From Profile Service.PreExecute”, “Service URI: ‘” + this.Variables.varProfileServiceUrl + “‘”, null, 0, ref fireAgain);
        this.ComponentMetaData.FireInformation(0, “Pull From Profile Service.PreExecute”, “Cert Fingerprint: ‘” + this.Variables.varClientCertFingerprint + “‘”, null, 0, ref fireAgain);

        //create the binding
        NetTcpBindingbinding = new NetTcpBinding();
        binding.Security.Mode = SecurityMode.Transport;
        binding.Security.Transport.ClientCredentialType = TcpClientCredentialType.Certificate;
        binding.Security.Transport.ProtectionLevel = System.Net.Security.ProtectionLevel.EncryptAndSign;

       
        EndpointAddressendpointAddress = new EndpointAddress(this.Variables.varPaymentServiceUrl);
        channelFactory = new ChannelFactory<IProfile>(binding, endpointAddress);

        channelFactory.Credentials.ClientCertificate.SetCertificate(
            System.Security.Cryptography.X509Certificates.StoreLocation.LocalMachine,
            System.Security.Cryptography.X509Certificates.StoreName.My,
            System.Security.Cryptography.X509Certificates.X509FindType.FindByThumbprint,
            this.Variables.varClientCertFingerprint);
            //” x8 60 66 09 t6 10 60 2d 99 d6 51 f7 5c 3b 25 bt 2e 62 32 79″);

        channelFactory.Credentials.ServiceCertificate.Authentication.CertificateValidationMode =
            System.ServiceModel.Security.X509CertificateValidationMode.PeerTrust;
       
        //create the channel
        client = channelFactory.CreateChannel();

       
        IClientChannel channel = (IClientChannel)client;

       
        channel.Open();
        this.ComponentMetaData.FireInformation(0, “Pull From Profile Service.PreExecute”, “Open Succeeded.”, null, 0, reffireAgain);


    }

    public override voidPostExecute()
    {
        base.PostExecute();

        //close the channel
        IClientChannelchannel = (IClientChannel)client;
        channel.Close();

        //close the ChannelFactory
        channelFactory.Close();

    }

    public override voidInput0_ProcessInputRow(Input0Buffer Row)
    {
        GuidtxGuid = Guid.NewGuid();
        Profileprofile = null;
        try
        {
            profile = client.getProfile(txGuid, Row.ProfileId);
            Row.PSProfileType = GetProfileType(profile);
           
        }
        catch (Exception ex)
        {
            stringmessage = ex.Message();
            Log(message, 0, null);
        }
       
       
    }
    private string GetProfileType(Profileprofile)
    {
        return “x”;
    }
}

So one of the challenges I encountered while using this method had to do with the client certificate.  This error drove me nuts:

The credentials supplied to the package were not recognized.
Server stack trace:
   at System.Net.SSPIWrapper.AcquireCredentialsHandle(SSPIInterface SecModule, String package, CredentialUse intent, SecureCredential scc)
   at System.Net.Security.SecureChannel.AcquireCredentialsHandle(CredentialUse credUsage, SecureCredential& secureCredential)
   at System.Net.Security.SecureChannel.AcquireClientCredentials(Byte[]& thumbPrint)
   at System.Net.Security.SecureChannel.GenerateToken(Byte[] input, Int32 offset, Int32 count, Byte[]& output)
   at System.Net.Security.SecureChannel.NextMessage(Byte[] incoming, Int32 offset, Int32 count)
   at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.StartReadFrame(Byte[] buffer, Int32 readBytes, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.CheckCompletionBeforeNextReceive(ProtocolToken message, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ForceAuthentication(Boolean receiveFirst, Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ProcessAuthentication(LazyAsyncResult lazyResult)
   at System.Net.Security.SslStream.AuthenticateAsClient(String targetHost, X509CertificateCollection clientCertificates, SslProtocols enabledSslProtocols, Boolean checkCertificateRevocation)
   at System.ServiceModel.Channels.SslStreamSecurityUpgradeInitiator.OnInitiateUpgrade(Stream stream, SecurityMessageProperty& remoteSecurity)
   at System.ServiceModel.Channels.StreamSecurityUpgradeInitiatorBase.InitiateUpgrade(Stream stream)
   at System.ServiceModel.Channels.ConnectionUpgradeHelper.InitiateUpgrade(StreamUpgradeInitiator upgradeInitiator, IConnection& connection, ClientFramingDecoder decoder, IDefaultCommunicationTimeouts defaultTimeouts, TimeoutHelper& timeoutHelper)
   at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.SendPreamble(IConnection connection, ArraySegment`1 preamble, TimeoutHelper& timeoutHelper)
   at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.DuplexConnectionPoolHelper.AcceptPooledConnection(IConnection connection, TimeoutHelper& timeoutHelper)
   at System.ServiceModel.Channels.ConnectionPoolHelper.EstablishConnection(TimeSpan timeout)
   at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.OnOpen(TimeSpan timeout)
   at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
   at System.ServiceModel.Channels.ServiceChannel.OnOpen(TimeSpan timeout)
   at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
   at System.ServiceModel.Channels.CommunicationObject.Open()
Exception rethrown at [0]:
   at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
   at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
   at System.ServiceModel.ICommunicationObject.Open()
   at ScriptMain.PreExecute()
   at Microsoft.SqlServer.Dts.Pipeline.ScriptComponentHost.PreExecute()

If you look at it, this is an authentication error.  Tracing the code, it happens AFTER the code successfully retrieves the client certificate from the certificate store.  The call to SetServerCertificate succeeds without incident.

The error hits  when the code opens the channel, and tries to use the private key attached to the client certificate to prove to the server that “I’m a valid client.”

I went nuts because I was an administrator on the machine, and had installed the client certificate to the certificate store myself.  It initially worked, and there was no indication that there was a problem getting the certificate from the cert store.

It turns out that when you use the machine store under these circumstances, I needed to give myself explicit permission to the client certificate in order for the SetServerCertificate to get the private key along with the client certificate.  This was counter-intuitive for two *additional* reasons:  1)  I was an administrator on the box, and already should have had this permission by the fact that my login account belonged to the administrators group (which you can see from the pic below, also had access.)  2)   It worked the day before.  When I imported the private key originally to the key store, it appears somewhere in the depths of Windows 7 (and this applied on Server 2008 R2 as well) I still had permission in my active session context.  When I logged out, that login context died, and, coming back the next day, I logged in again, not realizing I wouldn’t be able to access the key.  Giving myself explicit permission as shown below allowed me to run my SSIS package within Visual Studio and from SSMS.

(Sorry, Blogger’s not letting me include this bigger… click it for full size view.)
Tech in the 603, The Granite State Hacker

“ETL”ing Source Code

The past couple weeks, I’ve been between projects, which has gotten me involved in a number of “odd jobs”. An interesting pattern that I’m seeing in them is querying and joining, and updating data from very traditionally “unlikely” sources… especially code.

SQL databases are very involved, but I find myself querying system views of the schema itself, rather than its contents. In fact, I’m doing so much of this, that I’m finding myself building skeleton databases… no data, just schema, stored procs, and supporting structures.

I’m also pulling and updating metadata from the likes of SharePoint sites, SSRS RDL files, SSIS packages… and most recently, CLR objects that were serialized and persisted to a file. Rather than outputting in the form of reports, in some cases, I’m outputting in the form of more source code.

I’ve already blogged a bit about pulling SharePoint lists into ADO.NET DataSet’s. I’ll post about some of the other fun stuff I’ve been hacking at soon.

I think the interesting part is how relatively easy it’s becoming to write code to “ETL” source code.

Tech in the 603, The Granite State Hacker

SSIS Bug / Workaround

I ran across a “feature” (bug) in SSIS today that I figured I’d record in case I ran across it again…

Error: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E14. An OLE DB record is available. Source: “Microsoft SQL Native Client” Hresult: 0x80040E14 Description: “Cannot fetch a row from OLE DB provider “BULK” for linked server “(null)”.”. An OLE DB record is available. Source: “Microsoft SQL Native Client” Hresult: 0x80040E14 Description: “The OLE DB provider “BULK” for linked server “(null)” reported an error. The provider did not give any information about the error.”. An OLE DB record is available. Source: “Microsoft SQL Native Client” Hresult: 0x80040E14 Description: “Reading from DTS buffer timed out.”.

According to this MSDN forum thread, it’s a bona-fide SSIS bug (I’m running SQL Server SP2 with the latest known patches as of this date.)

So the problem is in the SQL Server Destination. Changing it out for an OLEDB Destination seems to do the trick.

Tech in the 603, The Granite State Hacker

SSIS: Unit Testing

I’ve spent the past couple days putting together unit tests for SSIS packages. It’s not as easy to do as it is to write unit & integration tests for, say, typical C# projects.

SSIS Data flows can be really complex. Worse, you really can’t execute portions of a single data flow separately and get meaninful results.

Further, one of the key features of SSIS is the fact that the built-in data flow toolbox items can be equated to framework functionality. There’s not so much value in unit testing the framework.

Excuses come easy, but really, unit testing in SSIS is not impossible…

So meaningful unit testing of SSIS packages really comes down to testing of Executables in a control flow, and particularly executables with a high degree of programability. The two most significant control flow executable types are Script Task executables and Data Flow executables.

Ultimately, the solution to SSIS unit testing becomes package execution automation.

There are a certain number of things you have to do before you can start writing C# to test your scripts and data flows, though. I’ll go through my experience with it, so far.

In order to automate SSIS package execution for unit testing, you must have Visual Studio 2005 (or greater) with the language of your choice installed (I chose C#).

Interestingly, while you can develop and debug SSIS in the Business Intelligence Development System (BIDS, a subset of Visual Studio), you cannot execute SSIS packages from C# without SQL Server 2005 Developer or Enterprise edition installed (“go Microsoft!”).

Another important caveat… you CAN have your unit test project in the same solution as your SSIS project. Due to over-excessive design time validation of SSIS packages, you can’t effectively execute the SSIS packages from your unit test code if you have the SSIS project loaded at the same time. I’ve found that the only way I can safely run my unit tests is to “Unload Project” on the SSIS project before attempting to execute the unit test host app. Even then, Visual Studio occassionally holds locks on files that force me to close and re-open Visual Studio in order to release them.

Anyway, I chose to use a console application as the host app. There’s some info out there on the ‘net about how to configure a .config file borrowing from dtexec.exe.config, the SSIS command line utility, but I didn’t see anything special in there that I had to include.

The only reference you need to add to your project is a ref to Microsoft.SqlServer.ManagedDTS. The core namespace you’ll need is

using Microsoft.SqlServer.Dts.Runtime;

In my first case, most of my unit testing is variations on a single input file. The package validates the input and produces three outputs: a table that contains source records which have passed validation, a flat output file that contains source records that failed validation, and a target table that contains transformed results.

What I ended up doing was creating a very small framework that allowed me to declare a test and some metadata about it. The metadata associates a group of resources that include a test input, and the three baseline outputs by a common URN. Once I have my input and baselines established, I can circumvent downloading the “real” source file, inject my test source into the process, and compare the results with my baselines.

Here’s an example Unit test of a Validation executable within my SSIS package:

[TestInfo(Name = "Unit: Validate Source, duplicated line in source", TestURN = "Dupes")]
public void ValidationUnitDupeLineTest()
{
using (Package thePackage = _dtsApp.LoadPackage(packageFilePath, this))
{
thePackage.DelayValidation = true;
DisableAllExecutables(thePackage);
EnableValidationExecutable(thePackage);
InjectBaselineSource(GetBaselineResource("Stage_1_Source_" + TestURN), thePackage.Variables["SourceFilePath"]);
thePackage.Execute(null, null, this, null, null);
string errorFilePath = thePackage.Variables["ErrorLogFilePath"].Value as string;
//throw new AbortTestingException();
AssertPackageExecutionResult(thePackage, DTSExecResult.Failure);
AssertBaselineAdjustSource(TestURN);
AssertBaselineFile(GetBaselineResourceString("Baseline_Stage1_" + TestURN), errorFilePath);
}
}

Here’s the code that does some of the SSIS Package manipulation referenced above:


#region Utilities
protected virtual void DisableAllExecutables(Package thePackage)
{
Sequence aContainer = thePackage.Executables["Adjustments, Stage 1"] as Sequence;
(aContainer.Executables["Download Source From SharePoint"] as TaskHost).Disable = true;
(aContainer.Executables["Prep Target Tables"] as TaskHost).Disable = true;
(aContainer.Executables["Validate Source Data"] as TaskHost).Disable = true;
(aContainer.Executables["Process Source Data"] as TaskHost).Disable = true;
(aContainer.Executables["Source Validation Failure Sequence"] as Sequence).Disable = true;
(aContainer.Executables["Execute Report Subscription"] as TaskHost).Disable = true;
(thePackage.Executables["Package Success Sequence"] as Sequence).Disable = true;
(thePackage.Executables["Package Failure Sequence"] as Sequence).Disable = true;
}


protected virtual void DisableDownloadExecutable(Package thePackage)
{
Sequence aContainer = thePackage.Executables["Adjustments, Stage 1"] as Sequence;
TaskHost dLScriptTask = aContainer.Executables["Download Source From SharePoint"] as TaskHost;
dLScriptTask.Disable = true;
}


protected virtual void EnableValidationExecutable(Package thePackage)
{
Sequence aContainer = thePackage.Executables["Adjustments, Stage 1"] as Sequence;
TaskHost validationFlow = aContainer.Executables["Validate Source Data"] as TaskHost;
validationFlow.Disable = false;
}

protected virtual void EnableValidationExecutable(Package thePackage)
{
Sequence aContainer = thePackage.Executables["Adjustments, Stage 1"] as Sequence;
TaskHost validationFlow = aContainer.Executables["Validate Source Data"] as TaskHost;
validationFlow.Disable = false;
}

Another really handy thing to be aware of…

IDTSEvents

I highly recommend you implement this interface and pass it into your packages. Of course, in each event handler in the interface, implement code to send reasonable information to an output stream. Notice the call to thePackage.Execute, way up in the first code snippet… the class that contains that method implements that interface, so I can manipulate (when necessary) how to handle certain events.

Interestingly, I haven’t needed to do anything fancy with that so far, but I can imagine that functionality being very important in future unit tests that I write.

Here’s a visual on all the resources… the image shows SSMS over VS, with both database tables and project resources with common URNs to relate them.

I won’t get into the details of the framework functionality, but I found it useful to be able to do things like set a flag to rebuild baseline resources from current outputs, and such.

I modeled some of my framework (very loosely) functionality on the Visual Studio Team System Edition for Testers, which we used on the TWM ISIS project.

Another interesting lesson learned: I can see that the folks who built SSIS were not avid unit testers themselves. SSIS Executables have a “Validate()” method. I encountered lots of problems when I tried to use it. Hangs, intermittent errors, all that stuff that testing should have ironed out.