Sunday, August 15, 2010

My Love / Hate Relationship with Microsoft ESB Guidance 1.0

The ESB Guidance 1.0 for BizTalk Server 2006 R2 has been around for a while (in fact it's now been replaced by version 2.0 for BizTalk 2009). Not only does it offer an Enterprise Service Bus framework on top of BizTalk, but it also includes a very comprehensive framework for managing exceptions encountered by BizTalk orchestrations and messaging. 

I've had a few encounters with version 1.0, some of them good, some of them... not so good.

Encounter 1

My first time was as part of a POC for a client for whom I'd recently implemented a set of BizTalk 2006 R2 environments. They were looking to leverage a standard framework for exception handling. I'd been reading up on the ESB Guidance 1.0 at the time, and thought it could be just the thing to fill the gap.

The results of the POC were very disappointing.

Issues encountered. During the 3 days of the POC, 4 reasonably significant issues were encountered with the ESB Guidance code. These included:
  • Installation on Windows XP was not straightforward, as the install scripts utilised Windows Server features. I know a Windows XP development environment for BizTalk is less than ideal (trust me, I know), but this client had a tight restriction on what OSes could be used for what purposes, and we had to use Windows XP.
  • The pre-built ESB Guidance binaries signed with the Microsoft key incorrectly referenced a component using the source key (issue logged with Microsoft). This effectively meant we couldn't use the pre-built binaries, and had to recompile and deploy based on the source code (something the client wasn't overly keen on doing - maintaining their "own" version of the source).
  • The ESB Management Console assumed UTC time offsets to be integer values (we’re +9.5h).
  • The date/time of exceptions submitted to the database used MM/dd/yyyy format.
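The last two issues are easy to reproduce outside BizTalk. The following is only a minimal sketch in Python (not the ESB Guidance code, which is .NET); the offset, the sample timestamp and the helper names are assumptions chosen for illustration. It shows why a UTC offset should be carried in minutes rather than whole hours, and why an ISO 8601 timestamp avoids the day/month ambiguity of MM/dd/yyyy.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical local zone for illustration: UTC+9:30 (not a whole number of hours).
LOCAL_OFFSET = timezone(timedelta(hours=9, minutes=30))


def offset_minutes(tz: timezone) -> int:
    """Return the UTC offset in whole minutes, so half-hour zones are preserved."""
    return int(tz.utcoffset(None).total_seconds() // 60)


def to_unambiguous(ts: datetime) -> str:
    """Format a timestamp as ISO 8601, which does not depend on day/month order."""
    return ts.isoformat()


# Hypothetical fault timestamp: 3 August 2010, 14:05 local time.
fault_time = datetime(2010, 8, 3, 14, 5, 0, tzinfo=LOCAL_OFFSET)

# Truncating 9.5 hours to an integer silently loses the half hour.
truncated_hours = int(9.5)

print(offset_minutes(LOCAL_OFFSET))      # offset kept in minutes
print(truncated_hours)                   # what an integer-hours assumption keeps
print(to_unambiguous(fault_time))        # ISO 8601, unambiguous
print(fault_time.strftime("%m/%d/%Y"))   # reads as 8 March under dd/MM/yyyy
```

Storing the offset in minutes and the timestamp in ISO 8601 is one common way around both problems; whether that fits the ESB faults database schema is not something this sketch addresses.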
Deprecation by Microsoft. The ESB Guidance 1.0 is based on BizTalk Server 2006 R2. With the release of BizTalk Server 2009, Microsoft released version 2.0 of the Guidance, named ESB Toolkit 2.0. They also announced plans for the rapid deprecation of ESB Guidance 1.0 (in fact it's gone from CodePlex).

Unproven feature: Resubmission. Resubmission through WCF ESB itinerary-based receive locations and HTTP-based receive locations was unsuccessful.

Requirement for Dundas Charts. The ESB Management Console utilises a third-party ASP.NET charting control package called Dundas Charts, which we were able to download a trial version of, but which the client was unwilling to buy licenses for.

Encounter 2

Following my first encounter I was most disheartened. However, as it had occurred many months earlier, on a more recent engagement I decided to revisit the Guidance, again with a focus on using it for exception handling (yeah, I know, why use all that other great stuff in the box?).

This time however I decided to just install the exception handling components from the separate MSI, not the "install everything" MSI. And much to my surprise, this MSI didn't suffer from some of the problems I'd encountered previously.

So, despite its deprecation, the client decided to implement its BizTalk exception handling strategy based on the ESB Guidance components. And it worked great! We had email notifications going to specific mailing lists when key exceptions occurred, we had all exceptions being logged to the ESB faults database... It was just what had been missing. Until...

Encounter 3

This encounter just happened (this week in fact), and it's still ongoing. I was assisting another developer implementing the ESB exception handling components in an existing BizTalk application, and to test that the framework was working, we decided one of the easiest ways would be to simply "turn off" an endpoint that a Send Port was targeting. Our orchestration was suspended, but not, as we expected, by our own Suspend shape in our exception handler (post ESB). Instead it was suspended because of a failure in the ESB exception handling components themselves.

Fortunately BizTalk logged the details to the Windows event log when it suspended the orchestration instance, with an error message along the lines of:

Inner exception: Error 115001: An unexpected error occurred while attempting to create the ESB Fault Message.

Exception type: CreateFaultMessageException
Source: Microsoft.Practices.ESB.ExceptionHandling
Target Site: Microsoft.XLANGs.BaseTypes.XLANGMessage CreateFaultMessage()
Additional error information: An error occurred while parsing EntityName. Line 6, position 106.

The exception coming back from the "missing" endpoint was actually an EndpointNotFoundException, but for some reason the ESB exception handling components were struggling to create the initial FaultMessage in our expression shape in the scope exception handler.

Fortunately the ESB Guidance 1.0 also came with source code (it would have been close to useless without the source for all the bugs that needed to be fixed otherwise), so I was able to dig through the source for the CreateFaultMessage method and see what it was doing. Nothing really leapt out at me, but I could trace the exception to its attempt to create the initial FaultMessage from some template XML it was loading from a resource file. Something that was getting injected into one of the placeholders in this template was causing the exception, which given the "An error occurred while parsing EntityName" part, appeared to be related to XML content.

I reconstructed each of the values that were substituted and slowly built up the XmlDocument until it broke... when I passed in the value for the placeholder in the element, taken from the exception's Message property. For whatever reason, for this particular EndpointNotFoundException (don't know if it's something that BizTalk does or we were just lucky, because I haven't seen this behaviour for "standard" EndpointNotFoundExceptions), the Message property had the full stack trace in it as well as the actual exception message... And of course the stack trace included unescaped "&" characters, which need to be escaped as "&amp;" in XML. The ESB component wasn't doing this, hence the issue loading the XmlDocument object.

That seems pretty dumb, I thought. So I checked out the corresponding class in the ESB Toolkit 2.0 (via Reflector), and sure enough, it comes with a handy "CleanForXml" method that the exception's Message property gets passed through to escape XML reserved characters. So obviously someone noticed at some point and this has been fixed in version 2.0.
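The failure mode itself is not specific to BizTalk. As a rough illustration only, here is a small Python sketch (the ESB components are .NET, and this is not their source): the template, its element names, the sample message and the function names are all hypothetical. It substitutes an exception message into an XML template as a string, once raw and once after escaping the reserved characters.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical template; the real ESB fault template and its element names differ.
TEMPLATE = "<Fault><Description>{description}</Description></Fault>"


def build_fault_unsafe(message: str) -> ET.Element:
    """Substitute the raw message into the template, then parse the result."""
    return ET.fromstring(TEMPLATE.format(description=message))


def build_fault_escaped(message: str) -> ET.Element:
    """Escape &, < and > in the message before substituting it."""
    return ET.fromstring(TEMPLATE.format(description=escape(message)))


# Hypothetical message containing an unescaped ampersand, as in a URL query string.
message = "Endpoint not found at http://example.invalid/svc?a=1&b=2"

try:
    build_fault_unsafe(message)
    unsafe_failed = False
except ET.ParseError:
    # The bare "&" starts an entity reference that is never completed.
    unsafe_failed = True

fault = build_fault_escaped(message)
print(unsafe_failed)
print(fault.find("Description").text)
```

Escaping before substitution is the smallest change. Building the document through a DOM API, so that text is never spliced into markup as a string, would sidestep the problem altogether.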

Anyway, now we're left in a bit of a dilemma... do we (a) fix the ESB 1.0 source code, but have to maintain our "own" version of the source, (b) stop using ESB 1.0 for exception handling and go back to the dark days of, well, suspended service instance mania, or (c) wait until the client upgrades to BizTalk 2009+ so we can use version 2.0 instead... [ignoring (d) cross our fingers and hope that there's never an XML reserved character in an exception message].

I'll let you know what we decide, but this has once again soured my temporarily restored faith...
