Message correlation between IServiceBehavior data and IEndpointBehavior data

Topics: Technical Questions
Jan 25, 2010 at 4:58 AM

Hi,

I'm trying to implement message tracing end to end. So I can view the messages sent through MSE with the following hierarchy:

Server -> End point -> Operation -> Operation Data (header and body for the request and response)

Implementing IServiceBehavior, I can attach my custom interceptor class through the ApplyDispatchBehavior method. This interceptor class implements the IExtension<ServiceBaseHost> interface, and through the Attach method I subscribe to the messenger events. These events are of the type MessengerEventArgs. Through this class I can this lot of information:

Server -> End point -> Operation

But not the Operation Data. For this I had to implement a IEndpointBehavior with IClientMessageInspector and IDispatchMessageInspector. Through the implementation of these I hook into events such as BeforeSendReply(ref Message reply, object correlationState). The correlation state is of the type of LogMessage, and I can have access to the Operation Data (header and body for the request and response)

My question is, if I want to have a true end to end, I should be able to map all the way from the server down to the operation data for the request or response. But I haven't been able to tie these two objects (MessengerEventArgs data from the IServiceBehavior and LogMessage data from the IEndpointBehavior). I can see that the LogMessage class has an ActivityID, but it's always null. With the MessagengerEventArgs it's not null. How can I get some correlation between these two implementations?

Thanks.

Developer
Jan 25, 2010 at 6:26 PM

I'm not quite sure I understand your whole question.  It sounds like you are using a mix of internal MSE events and part of another logging behavior that ships with the MSE.

Thinking about tracing and correlating traffic going through the MSE, correlating the message flow means observing the incomming message at an MSE endpoint, the outgoing message to the Resource, the response from the Resource, and the Response to the original caller of the MSE endpoint.  From a WCF perspective there are probably a number of ways to do this, but you could peform the correlation with IEndpointBehavior implementations (one at the endpoint, and one at the Resource level.  For simplicity the Resource level implementation could be in a policy associated with a System or System Instance to ensure all Resources have this behavior applied (rather than having to remember to do it for each and every resource)).

The IEndpointBehavior hook gives you access to  message headers as well as the message body.  You could inject a custom message header in the policy at the MSE Endpoint with specific correlation and timestamp information that you could leverage or add to throughout the pipeline.  The MSE automatically injects a custom message header for other tracking purposes that includes an ActivityId you could use for correlation.  There is a behavior that logs messages via the IEndpointBehavior manner if applied in endpoint and resource policies... LoggingBehaviorExtensionElement in Microsoft.MSE.Behaviors.Logging.dll... Though I don't believe it records the ActivityId.... which may be part of the problem you are referring to.

Another approach you could use, is hook into some internal MSE specific events that are fired when certain activities occur within the MSE (as you have partially done with the messenger events).  If you reflect on the BAM behavior implementation (Microsoft.MSE.Behaviors.BAM.dll), you'll see how the hooks can be used together to correlate activity which includes Broker and Channel events to get full message lifecycle correlation.  However, these hooks won't let you see the message body.  The down side is that we don't really publish these hooks, so we may change them.

If you are trying to ensure correlation with message body access, I'd recommend the IEndpointBehavior route.  You can reflect on Microsoft.MSE.Behaviors.Logging for additional inspiration.  But this approach would give you complete control over how you want to correlate activity and/or log messages.

hope that helps.

 

Jan 28, 2010 at 1:19 AM
Edited Jan 28, 2010 at 1:20 AM

Hi Botto,

you were right, I was trying to mix the two behaviours and trying to find a way to join the outputs for reporting. I've gone down the IEndpointBehavior route, I've reflected Microsoft.MSE.Behaviors.Logging and modified the code so that it would track ActivityID, endpointURI. Here's the modified code:

 

        private object LogRequestMessage(ref Message request)
{
LogMessage message = null;
if (this.m_configuration.IsEnabled)
{
message = new LogMessage();
try { if ((this.m_configuration.LogDirection & LogDirection.Request) == LogDirection.Request)
{
string activityID;
string endpointUri;
message.RequestMessageData =
this.GetMessagePartForLog(ref request, out activityID, out endpointUri);
message.ActivityID = activityID;
message.EndpointUri = endpointUri;
message.CreationDateTime = DateTime.Now;
}
}
catch (Exception exception)
{
TraceHelper.TraceError(string.Format("An error ocurred while writing Request message to Log [{0}]",
TraceHelper.GetExceptionDetails(exception)));
}
}
return message;
}

 

Inside GetMessgaePArtForLog I've added the following code:

                    // Start mod
nsmgr.AddNamespace("mse", "http://services.microsoft.com/MSE"); XmlNode node = null;  node = document.DocumentElement.SelectSingleNode("//mse:MSERequestContext/mse:ActivityId", nsmgr); if (node != null) { activityID = node.InnerText; } else { activityID = String.Empty; } node = document.DocumentElement.SelectSingleNode("//mse:MSERequestContext/mse:EndpointURI", nsmgr); if (node != null) { endpointUri = node.InnerText; } else { endpointUri = String.Empty; } // End mod

Now I can relate the outputs from the BAM behavior plus this modded behavior to get to see the data that caused a fault. I've got two questions now:

1. You say the hooks in the BAM behavior are unpublished and are subject to change. I'm not relying on the these hooks now directly, but I'm proposing to use the BAM behavior that internally it does. Is it advisable I use the BAM behavior in a production environment? I'm approaching MS locally and getting the details on the available support arrangements.

2. I'm interested if performance reasons was the reason the code I've added that does the extra querying in the message wasn't included in the published behavior?

Thanks.

Developer
Jan 28, 2010 at 2:04 AM

The logging behavior really should capture the activityid field so that there is better coorelation out of the box (or at least let this be opted into via configuration).  It's great you were able to get that working.

As you point out, there is obviously performance implications by probing into the message and extracting out various pieces.  For really time critical scenarios, you'd probe into the message as few times as possible durring the invocation flow of the service and then do further processing/parsing of the logged information offline.  This is sort of the approach taken by the logging behavior.  If you were to log only the headers, you could then do post processing to read the logged headers and extract the activityid and endpointuri into a database for analysis/reporting.  This would probably have less impact on the service invocation since the logging behavior would only call SelectSingleNode() once.

Since we provided the BAM behavior, if we were to change the underlying hooks in the MSE that it uses, we'd update the BAM behavior to use and benefit from the new hooks.  That being said, the BAM behavior itself could be improved but is a good sample of how to extract metrics for messages flowing through the MSE.  We'd encourage further improvements and tuning of the BAM behavior to suite your individual needs with the knowledge that you may need to revisit those improvements in a future udpate.  I know that wasn't as simple of an answer as you were probably looking for, but I hope it helps.

 

Feb 19, 2010 at 5:16 AM

Hi,

I am investigating solutions for tracing a transaction that daisy chains wcf services. We are manually setting a Transaction ID into the soap header of request/reply message from our first wcf service and hoping it will be passed through to other wcf services in the soap header. The calls to other wcf services from our first wcf services are to a MSE virtualised endpoint and not directly to the wcf service itself.

When we call a wcf service from another wcf service via a MSE virtual endpoint, the Transaction ID set in the soap header of the request is lost. Upon inspecting the request, it appears that MSE is creating a brand new request without the Transaction ID in the soap header.

We have tested a single client call with transaction ID in soap header using the MSE Service Test tool and it appears that to work. But when calling our daisy chain scenario, the Transaction ID is gone from the soap header.

I have the following question.

- Is way to make MSE include our Transaction ID in request soap header for our wcf services daisy chain scenario?

Thanks.