Thursday, July 28, 2011

BizTalk ESB Toolkit 2.1 Exception Handling bits


A few bits & pieces gleaned from using the Microsoft BizTalk ESB Toolkit 2.1 to provide a standard exception handling framework for BizTalk 2010.

ESB Fault Message Infinite Loop (CPU 100%)

There are certain conditions under which creating an instance of the ESB Fault message using a call similar to the following will cause your orchestration to enter an infinite loop, and your server's CPU to hit 100%:

faultMsg = Microsoft.Practices.ESB.ExceptionHandling.ExceptionMgmt.CreateFaultMessage();

While the call is perfectly valid, there is a bug in the ESB framework that under certain conditions will cause the CreateFaultMessage method to enter an infinite loop. The conditions are either:
  1. You call CreateFaultMessage from anywhere other than an exception handling block of a Scope shape.
  2. You call CreateFaultMessage after catching an exception that derives from Microsoft.XLANGs.BaseTypes.XLANGsException.

(1) means that you can't use CreateFaultMessage to create an instance of the ESB fault message schema outside an exception handling block. You can work around this by defining and throwing your own custom exception at the point where you would otherwise have called CreateFaultMessage, then letting an exception handling block that catches your custom exception call CreateFaultMessage and perform your exception handling pattern... I think this is probably a pretty good pattern anyway.

(2) means that you have to be careful with what you catch and handle in your exception handling block, and if it derives from Microsoft.XLANGs.BaseTypes.XLANGsException, don't call CreateFaultMessage.

The following post has some suggestions for how to rectify this bug in the source: http://www.bizbert.com/bizbert/2011/05/06/Improving+The+ESB+Toolkit+Fixing+The+Endless+Loop+Bug+When+Creating+Fault+Messages.aspx

Creating Custom Exceptions for use with ESB Toolkit

If you decide to head down the path of defining and throwing your own custom exceptions for use with the ESB exception management framework, you need to follow certain rules in the custom exceptions:
  1. Decorate your class with SerializableAttribute.
  2. Inherit from System.Exception.
  3. Define a protected deserialization constructor.
For example:

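using System;
using System.Runtime.Serialization;

[Serializable]
public class MyException : System.Exception
{
    internal MyException() : base() { }
    internal MyException(string message) : base(message) { }

    // Deserialization constructor: required so the exception can be rehydrated.
    protected MyException(SerializationInfo info, StreamingContext context) : base(info, context) { }
}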

Also note that if you define any custom properties in your custom exception, they need to be handled in the deserialization constructor and in an override of the GetObjectData method, for example:

// Requires: using System.Security.Permissions;
[SecurityPermissionAttribute(SecurityAction.Demand, SerializationFormatter = true)]
protected MyException(SerializationInfo info, StreamingContext context) : base(info, context)
{
  // Restore the custom property from the serialized data.
  this.mScope = info.GetString("Scope");
}

[SecurityPermissionAttribute(SecurityAction.Demand, SerializationFormatter = true)]
public override void GetObjectData(SerializationInfo info, StreamingContext context)
{
  if (info == null)
  {
    throw new ArgumentNullException("info");
  }
  // Store the custom property under the same key the constructor reads.
  info.AddValue("Scope", this.Scope);
  base.GetObjectData(info, context);
}


Beware Null Values in Fault Message Properties

After you've created your instance of the ESB fault message using CreateFaultMessage, you would normally set properties of the message using its distinguished fields. Just beware of setting any of these to a null value, as this causes the serialization of the message to fail. I usually use some sort of helper function that checks whether the value to be populated into the property is null and uses a default value if it is, for example:

faultMessage.Body.FailureCategory = MyExceptionManager.EsbPropertyProvider.GetFailureCategory(caughtException, MyExceptionManager.FailureCategories.Default);
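
As a rough illustration of the kind of helper described above, here is a minimal sketch. The FaultPropertyDefaults class and its ValueOrDefault method are hypothetical names invented for this example; they are not part of the ESB Toolkit, and this is not the MyExceptionManager code referred to in the post.

```csharp
using System;

// Hypothetical helper (not part of the ESB Toolkit): substitutes a default
// whenever the value destined for a fault message property is null.
public static class FaultPropertyDefaults
{
    public static string ValueOrDefault(string candidate, string fallback)
    {
        if (fallback == null)
        {
            // A null fallback would defeat the purpose of the guard.
            throw new ArgumentNullException("fallback");
        }
        return candidate ?? fallback;
    }
}

public static class Program
{
    public static void Main()
    {
        Exception caught = new InvalidOperationException("Order lookup failed");
        string neverPopulated = null; // stands in for a value that was never set

        // A non-null value passes through unchanged.
        Console.WriteLine(FaultPropertyDefaults.ValueOrDefault(caught.Message, "Unknown error"));
        // A null value is replaced by the fallback.
        Console.WriteLine(FaultPropertyDefaults.ValueOrDefault(neverPopulated, "Unknown source"));
    }
}
```

The idea is simply that every assignment to a distinguished field goes through a guard like this, so a null never reaches the fault message.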

Writing Exception Details to the Windows Event Log

Lastly, a rather obscure one, but the ESB framework provides a helper function for writing exception details to the Windows Application event log. You need to add a reference to Microsoft.Practices.ESB.Exception.Management.dll, then in an expression shape you can use:

Microsoft.Practices.ESB.Exception.Management.EventLogger.LogMessage(
  exceptionToHandle.Message,
  System.Diagnostics.EventLogEntryType.Error,
  (System.Int32)Microsoft.Practices.ESB.Exception.Management.EventLogger.EventId.Default);


HTH!

Thursday, July 21, 2011

Issue opening orchestrations in Visual Studio 2010

This was something that had cropped up now and then when designing BizTalk orchestrations in Visual Studio 2010...

Once the orchestration had been opened in the "source" view (to edit the raw XML), from that point onwards Visual Studio 2010 would open the orchestration in text view all the time... The workaround was to use "Open With" and choose the designer...

It wasn't until I came across this post by Randal van Splunteren that I discovered a permanent way to fix the issue: Edit the .btproj file in Notepad and remove the <SubType>Designer</SubType> tags associated with each orchestration that suffers from the issue.

Thanks Randal!

Wednesday, July 20, 2011

WCF, Enterprise Library & Cruise Control

[Note: This post is based upon an old blog post that I'm migrating for reference purposes, so some of the content might be a bit out of date. Still, hopefully it might help someone sometime...]

I recently had an interesting experience with Cruise Control automated builds. The scenario was this:
  • A set of web services implemented in WCF, with the constituent parts separated out into distinct projects in the Visual Studio solution: Common, Contracts, Implementation, Host.
  • The Exception Handling Application Block (EHAB) from the Microsoft Enterprise Library 4.1 was used for its great configurable exception handling & logging and exception shielding features.
  • We had automated builds set up on a build server using Cruise Control. Anytime you checked in changes to the solution, the build server would rebuild a new version and make it available for deployment to the "official" dev, test, and production environments. [This has since been changed, because it was kind of overkill and led to about a million (overstating) builds per day]

I was working on a local dev machine, modifying the web services, running them locally, and performing my unit tests successfully. I checked in my changes, asked for the latest build to be deployed to the "dev" web server, and ran the same unit tests. They ran fine until I executed a test that was intended to produce an error that needed to be handled by the EHAB configuration. Instead of the logging and custom Fault I was expecting, I got the default "shielded" Fault and exception message produced by the EHAB: "An error has occurred while consuming this service. Please contact your administrator for more information. Error ID: {handlingInstanceID}".

Hmm... I double-checked that I was indeed sending in exactly the same "bad" request, that it should be generating the exception I was expecting, and that the EHAB configuration should be handling it. Yes indeed.

To cut a long story of investigation and frustration short, it came down to a "clash" between the automated build in Cruise Control and my EHAB configuration.

My EHAB configuration was attempting to transform from a particular Exception type to a particular (custom) Fault Contract type using the Fault Contract Exception Handler. The EHAB configuration was referring to the Fault Contract type using the fully qualified strong name of the assembly, including the version number.

Now here's where Cruise Control was coming into the picture. The "standard" Cruise Control build script was, prior to build, performing a substitution within any AssemblyInfo files it was building, to replace the text "1.0.0.0" with the current Cruise Control build number. In my case, as this was a new solution, I hadn't changed the AssemblyInfo version numbers from their defaults of 1.0.0.0, and hence when my solution was built by Cruise Control, the assemblies ended up with the Cruise Control-generated version numbers. Of course, this led to EHAB looking for the assembly within which my Fault Contracts were located with a particular version (1.0.0.0), and the actual assembly that was deployed had a version number nothing like this.

Although I kind of objected to the rather agricultural textual "find-and-replace" in the Cruise Control build script, I wasn't in a position to be able to change it. The solution ended up being to modify my EHAB configuration to include the "short" version of the Fault Contract & Assembly name, rather than the "long" version (I'm sure there are official names for these). So, instead of something that looked like this:

XYZ.Service.Contract.ServiceOperationFault, XYZ.Service.Contract, Version=1.0.0.0, Culture=neutral, PublicKeyToken=...

I replaced it with something like this:

XYZ.Service.Contract.ServiceOperationFault, XYZ.Service.Contract

This works fine in my case because the assembly is deployed alongside everything else, in the bin folder for the web services.
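
For reference, .NET calls the "long" form the assembly-qualified name (Type.AssemblyQualifiedName); the "short" form is the same thing with only a partial assembly name. The sketch below shows the difference using the demo class itself as a stand-in type; TypeNameDemo and ToPartialName are made-up names for illustration and have nothing to do with EHAB.

```csharp
using System;

public static class TypeNameDemo
{
    // Builds the "TypeName, AssemblyName" form, leaving out Version,
    // Culture and PublicKeyToken. Hypothetical helper for illustration only.
    public static string ToPartialName(Type type)
    {
        return type.FullName + ", " + type.Assembly.GetName().Name;
    }

    public static void Main()
    {
        Type type = typeof(TypeNameDemo);

        // Long form: includes Version=..., Culture=..., PublicKeyToken=...
        Console.WriteLine(type.AssemblyQualifiedName);

        // Short form: no version, so a build-stamped version number cannot break it.
        string partialName = ToPartialName(type);
        Console.WriteLine(partialName);

        // Resolve the type from the short form and compare with the original.
        Type resolved = Type.GetType(partialName);
        Console.WriteLine(resolved == type);
    }
}
```

The trade-off is that the short form no longer pins a specific version, which is exactly what was needed here but may not suit every deployment.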

Concatenate values in a SELECT statement

I was digging through some old (very old) notes I had on SQL Server 7.0 and came across this one and thought I'd post it for my own reference... (updated a bit).

To concatenate the values of a particular column in a SELECT statement, do something like this:

-- Accumulator for the comma-separated list
DECLARE @technicianNames VARCHAR(max)
SET @technicianNames = ''

-- Append each name, followed by a comma, to the variable
SELECT @technicianNames = @technicianNames + t.TechnicianName + ','
  FROM dbo.Technician t
 ORDER BY t.TechnicianName ASC

-- Strip the trailing comma
IF Len(@technicianNames) > 0
BEGIN
  SET @technicianNames = Left(@technicianNames, Len(@technicianNames) - 1)
END

Say you had the following values in the Technician table:

TechnicianName
--------------
Dave
Trevor
Agnes

Because of the ORDER BY, the value of @technicianNames would be "Agnes,Dave,Trevor".

Hopefully it's useful to someone else too... There may well be a better way to do this in the post SQL Server 7.0 world; if there is, please let me know!

Tuesday, July 5, 2011

Not so RelativeSearchPath

[Note: This post is based upon an old blog post that I'm migrating for reference purposes, so some of the content might be a bit out of date. Still, hopefully it might help someone sometime...]

Another .NET version compatibility issue encountered while working with the same third party API as described in my earlier post When is String.Empty != String.Empty?

Under certain conditions when calling this third party .NET 1.1 API from .NET 3.5 we were receiving an exception "Invalid directory on URL". Fortunately the stack trace included enough information for me to whip out my best friend Reflector to reflect inside the API code to see what was going on.

The exception occurred when the API was trying to dynamically load another DLL using Activator.CreateInstanceFrom. In particular, it constructed the path to the DLL using the following:

AppDomain.CurrentDomain.BaseDirectory + AppDomain.CurrentDomain.RelativeSearchPath

This seems to be a fairly common practice where this type of dynamic loading is required and you need to construct the path at run-time. Unfortunately, most of the examples on the web do exactly this: they build the path by string concatenation, rather than constructing it (or checking that it's valid) using System.IO.Path (another of my favourite friends).

When I checked out what the result of the AppDomain.CurrentDomain.BaseDirectory + AppDomain.CurrentDomain.RelativeSearchPath line was, I was somewhat bemused: It was something of the form "c:\projects\webapp\c:\projects\webapp\bin\" (with names changed to protect the innocent).

Huh? Surely that couldn't be right, otherwise it would never have worked!

I whipped up a simple ASP.NET 3.5 web app and examined the values of the two properties used to construct the path, and sure enough they were "c:\projects\webapp\" and "c:\projects\webapp\bin\" respectively. By this stage, I was assuming that the .NET 1.1 API was expecting RelativeSearchPath to be simply "bin\"...

So, next stop, whip up a simple ASP.NET 1.1 web app and check out the values for the two properties... Hmm, interesting: as the API expected, they were "c:\projects\webapp\" and "bin\" respectively...

So, it would seem that under ASP.NET 2.0+, when the AppDomain is initialised for your web app, the RelativeSearchPath is actually evaluated to the complete physical path to the web app's bin folder... Yay... not so "relative"...
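
Had the API built the path with System.IO.Path rather than by concatenation, it would probably have coped with both forms. The following is only a sketch of that idea, not the third-party API's code; SearchPathDemo and ResolveBinPath are hypothetical names, and the hard-coded values are the ones observed above.

```csharp
using System;
using System.IO;

public static class SearchPathDemo
{
    // Path.Combine returns its second argument unchanged when that argument is
    // already rooted, so the same call handles "bin\" and a full physical path.
    public static string ResolveBinPath(string baseDirectory, string relativeSearchPath)
    {
        if (string.IsNullOrEmpty(relativeSearchPath))
        {
            // Assumption for this sketch: no search path means "use the base directory".
            return baseDirectory;
        }
        return Path.Combine(baseDirectory, relativeSearchPath);
    }

    public static void Main()
    {
        // ASP.NET 1.1 style values: a genuinely relative search path.
        Console.WriteLine(ResolveBinPath(@"c:\projects\webapp\", @"bin\"));

        // ASP.NET 2.0+ style values: the search path is already absolute.
        Console.WriteLine(ResolveBinPath(@"c:\projects\webapp\", @"c:\projects\webapp\bin\"));
    }
}
```

On Windows, where drive-letter paths count as rooted, both calls should produce c:\projects\webapp\bin\ instead of the doubled-up path.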

My work-around in this case (as I can't change the third-party API) is to change the AppDomain's RelativeSearchPath just before the call to the API to be "bin\", and just afterwards to be whatever it was before the call... Not pretty, but it works. What I'd really like to understand is why under ASP.NET 2.0+ it's not relative! My suspicion is that it may be initialised to "~/bin" by ASP.NET when the AppDomain starts, and is somehow evaluated to the absolute path as a result of the inclusion of the "~", but I can't be sure...

Anyway, thanks for listening...

BizTalk backups to network share

Over the last few months, as well as doing actual development work, I've been helping a client build and configure a set of BizTalk environments to cater for everything from development, system integration testing, user acceptance testing and training through to pre-production, production and disaster recovery.

One of the final steps in our build for certain environments has been to configure the BizTalk backup SQL job to regularly back up the BizTalk databases to a network share. Not only is this good practice, but it's a mandatory part of setting up BizTalk log shipping as part of a DR capability.

We created a hidden ($) network share, and assigned "Full Control" permissions to a specially-created "BizTalk_Backups" Active Directory group - both at the share level and the filesystem level. We then placed the service account used to execute the SQL job in this group.

However, when it came to executing the BizTalk backup job, we encountered an "Access denied" type error: "BackupDiskFile::CreateMedia: Backup device '...' failed to create. Operating system error 5(failed to retrieve text for this error. Reason: 15105)."

We double-checked the permissions we'd configured, re-created the share not hidden, even checked whether the same service account could write to a different share on the same server... Nothing succeeded.

What did work however was adding the "Everyone" group with "Full Control" permissions on the share and filesystem... but hang on, the SQL service account was a member of the "BizTalk_Backups" group which already had "Full Control" permissions to the share etc... Hmmm... So, we removed "Everyone", and explicitly added the SQL service account with "Full Control" permissions to the share and filesystem... and it worked!

We're still not sure why exactly, but it seems as though the account needed to be added explicitly, rather than via membership in a group... well, other than the "Everyone" group... So problem solved, but a mystery nonetheless. Interested to hear if anyone else has had a similar experience.

UPDATE: I was speaking recently with a colleague of mine who suggested the issue may have been a result of not having restarted the SQL Server and SQL Agent services after we'd added the SQL service account to the "BizTalk_Backups" group. These services may have been caching the group membership of the SQL service account - and a restart of the service may have caused this to be refreshed. I haven't had a chance to check this out, but it sounds plausible.