The Issue with BizTalk
While working for one of our customers, we were notified by one of their partners that they were experiencing performance issues when calling one of our BizTalk webservices.
For some background: this is a basic webservice where BizTalk exposes an HTTP endpoint through a one-way receive port. When a message is received, an HTTP 202 response is sent back to the client and BizTalk processes the received message.
The issue they were facing was that once they had sent in the request, the HTTP 202 response from BizTalk would take about 40 seconds to get back to them, while in the past this used to be nearly instantaneous.
Why was this happening?
This did not only happen on their production machines. It also occurred on development, test and acceptance, and it happened just about every time the webservice was called. At this point we could rule out heavy load on the BizTalk environment, as this was also happening on environments with no load at all (development, for example), so it had to be something in the BizTalk webservice itself. I also ruled out a problem in the general BizTalk configuration, as we did not have this issue on any of our other webservices, and all other interfaces were running perfectly fine.
Since this webservice was implemented as a one-way receive port, there was no business logic being executed before sending back the response. The only thing BizTalk executes before sending back the response is the pipeline of the receive port, and in this case the pipeline being used was the out-of-the-box XMLReceive pipeline.
At this point I was reproducing the issue on a development machine, with the same message that was causing the issue on the other environments, and noticed something strange when calling the webservice from Fiddler. Directly after BizTalk received the message, there was an HTTP GET call to the URL http://xml.cxml.org/schemas/cXML/1.2.011/cXML.dtd that took some time to complete. Only once this HTTP GET call had completed did BizTalk return the response to my call.
It turned out that the message contained this DOCTYPE header: <!DOCTYPE cXML SYSTEM "http://xml.cxml.org/schemas/cXML/1.2.011/cXML.dtd">, and when BizTalk receives a DOCTYPE header it calls the referenced endpoint to retrieve the DTD file. In our case the HTTP GET call to retrieve the DTD file took a long time, and it caused the performance issue because it prevented BizTalk from returning the response to the client.
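To check whether an incoming message carries such a DOCTYPE header before it ever reaches BizTalk, something like the following could be used. This is only a minimal sketch in Python, not anything BizTalk-specific: the function name external_dtd_url and the sample message are assumptions made up for illustration. It uses the standard library expat parser, which reports the DOCTYPE to a handler and, as configured here, does not download the DTD itself.

```python
from typing import Optional
from xml.parsers import expat


def external_dtd_url(message: bytes) -> Optional[str]:
    """Return the system identifier from the DOCTYPE, or None if there is none.

    Hypothetical helper for illustration; it only inspects the message
    and never fetches the DTD.
    """
    found = {"system_id": None}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # expat calls this once when it encounters <!DOCTYPE ...>.
        found["system_id"] = system_id

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(message, True)  # True marks this as the final chunk
    return found["system_id"]


# Assumed sample message, reduced to the part that matters here.
sample = (
    b'<?xml version="1.0"?>\n'
    b'<!DOCTYPE cXML SYSTEM "http://xml.cxml.org/schemas/cXML/1.2.011/cXML.dtd">\n'
    b'<cXML/>'
)

dtd_url = external_dtd_url(sample)
```

For the sample above, dtd_url holds the URL that showed up as the extra HTTP GET call in Fiddler. For a message without a DOCTYPE header the function returns None.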
Fix & Conclusion
Fortunately for us the fix was very simple: we set the DtdProcessing property of the Xml Disassembler pipeline component to Ignore, and the performance issue was solved. This property was introduced in BizTalk 2016 CU6: https://support.microsoft.com/en-us/help/4458162/improvement-configure-dtd-processing-inside-xml-disassembler-pipeline
So in conclusion, it is best to always set this property to Ignore and only change it when DTD processing is actually needed, because the documentation regarding the DtdProcessing property states: “If no value is specified, DTD will get processed by default”. Always setting it to Ignore prevents unwelcome performance surprises whenever a message includes a DOCTYPE header.
Notes
- The DtdProcessing property on the Xml Disassembler is only available from BizTalk 2016 CU6 and onwards.
- If you submit a message to a BizTalk receive port using the Xml Disassembler, with the DOCTYPE header referring to a very large file (for example a URL to a 1 GB zip file), BizTalk will actually download this file, leading to a lot of network traffic and potentially some dangerous scenarios.
- Out of curiosity I tested the same in Azure using Logic Apps, validating the XML message against a schema hosted in an Integration Account. When you include a DOCTYPE header you actually get an error stating: “For security reasons DTD is prohibited in this XML document”, so in that case there is no need to worry.
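The stricter behaviour from the last note, where a message with a DOCTYPE header is rejected outright, can be illustrated outside of BizTalk and Logic Apps as well. The following is a rough sketch in Python using the standard library expat parser. The names DtdProhibitedError and parse_prohibiting_dtd are invented for this example and do not correspond to any BizTalk or Azure API.

```python
from xml.parsers import expat


class DtdProhibitedError(ValueError):
    """Raised when a message contains a DOCTYPE (hypothetical error type)."""


def parse_prohibiting_dtd(message: bytes) -> None:
    """Parse the message, but fail as soon as a DOCTYPE is encountered.

    Because parsing stops at the DOCTYPE, the referenced DTD is never
    requested, no matter how large or slow the target would be.
    """

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # An exception raised in a handler aborts parsing and
        # propagates out of parser.Parse().
        raise DtdProhibitedError(
            f"DTD is prohibited in this XML document (system id: {system_id})"
        )

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(message, True)


# Assumed example message pointing at a large file instead of a real DTD.
suspicious = (
    b'<!DOCTYPE cXML SYSTEM "http://example.com/very-large-file.zip">'
    b'<cXML/>'
)

try:
    parse_prohibiting_dtd(suspicious)
    rejected = False
except DtdProhibitedError:
    rejected = True
```

Whether to reject such messages or simply ignore the DTD depends on what your partners send. In our case the messages were otherwise valid, so Ignore was the right choice.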