Building a Pub/Sub Message Bus with WCF and MSMQ

In recent years there has been a lot of talk about event-driven architecture as a technique to build more scalable and maintainable systems. I've found this to be a very interesting pattern that makes sense in a number of scenarios, but it's never been very well supported on the Microsoft platform, and many who have attempted it have found it painful. A number of years ago I worked on a system using a pub/sub message bus built on .NET Remoting, MSMQ and HTTP, and it wasn't at all pretty. Everything was difficult and required custom code, from hosting the queue listeners and encoding and decoding messages to dealing with reliability and managing subscriptions.

So it was with some apprehension that I made another attempt to adopt this pattern in my current project. However a lot has changed in the last few years, and I'm pleased to say that my experience was many, many times better than the one I'd been through all those years ago. Before I get on to the solution, I want to make clear that I'm describing just one approach to implementing this pattern, and there are other approaches that may be more appropriate for applications with different requirements. Specifically the application I'm working on is a largely green-field .NET application, so interoperability across platforms was not a consideration (lucky me!).

The solution we ended up with was built with .NET Framework 3.0 and makes extensive use of Windows Communication Foundation (WCF), Microsoft Message Queuing (MSMQ) 4.0 and Internet Information Services (IIS) 7.0, all hosted on Windows Server 2008. Here's what we did.

Defining the Service Contract

The first step was to define the contracts which the publisher would use to notify any subscribers that an interesting event occurred. In our case we had a number of different types of events, but in order to reuse as much code as possible we used a generic service contract:

[ServiceContract]
public interface IEventNotification<TLog>
{
    [OperationContract(IsOneWay = true)]
    void OnEventOccurred(TLog value);
}

Now for any given event type, we can simply define a data contract to carry the payload (not shown here), and provide a derived service contract type as shown below:

[ServiceContract]
public interface IAccountEventNotification : IEventNotification<AccountEventLog>
{
}
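
As a rough sketch of what the omitted data contract might look like, here is one possible shape for AccountEventLog. The member names (EventId, AccountNumber, EventType, OccurredAtUtc) are invented for illustration and are not taken from the real project.

```csharp
using System;
using System.Runtime.Serialization;

namespace MyProject.Contracts
{
    // Hypothetical payload for account events. Every member below is an
    // assumption made for this sketch, not part of the original system.
    [DataContract]
    public class AccountEventLog
    {
        // Unique identifier for the event, useful for tracing and logging.
        [DataMember(Order = 1)]
        public Guid EventId { get; set; }

        // The account the event relates to.
        [DataMember(Order = 2)]
        public string AccountNumber { get; set; }

        // A short description of what happened, e.g. "Opened" or "Closed".
        [DataMember(Order = 3)]
        public string EventType { get; set; }

        // When the event occurred, in UTC.
        [DataMember(Order = 4)]
        public DateTime OccurredAtUtc { get; set; }
    }
}
```

Because the operation is one-way and travels over a queue, the payload should carry everything a subscriber needs; the subscriber cannot call back to the publisher for more detail.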

Implementing the Publisher

One of the key aspects of a publisher/subscriber pattern is that there should be ultra-loose coupling between the publisher and the subscriber. Critically, the publisher should not know anything about the subscribers, including how many there are or where they live. Originally we tried using MSMQ's PGM multicasting feature to accomplish this - essentially this lets you define a single queue address that will stealthily route the same message to multiple destination queues. While this feature does work, it had a couple of limitations that made it inappropriate in our scenario. First, the only way to use multicast queue addressing with WCF is to use the MsmqIntegrationBinding, which is less flexible than the NetMsmqBinding. Second, multicast addressing only works with non-transactional queues, which would have had an unacceptable impact on the reliability of our system.

So we abandoned this option and decided to implement our own lightweight multicasting directly within the publisher code. While technically this breaches the golden rule of the publisher knowing nothing about the subscribers, the information about the subscribers is completely contained in a configuration file. This means we can add, change or remove subscribers before or after deployment with no impact on the application code.

We had already built a component we called the ServiceFactory (no relation to the p&p Web Service Software Factory) which is a simple abstraction for creating local or WCF instances via a configuration lookup. This component isn't publicly available, but you could easily substitute your favourite Dependency Injection framework and achieve similar results. In our case, the web.config for one of our web services may have its dependent services defined as follows:

<serviceFactory>
    <services>
        <add name="EmailUtility" contract="MyProject.IEmailUtility, MyProject" type="MyProject.EmailUtility, MyProject" mode="SameAppDomain" instanceMode="Singleton" enablePolicyInjection="false" />

        <add name="SubscriberXAccountEventNotification" contract="MyProject.Contracts.IAccountEventNotification, MyProject.Contracts" mode="Wcf" endpoint="SubscriberXAccountEventNotification" />
        <add name="SubscriberYAccountEventNotification" contract="MyProject.Contracts.IAccountEventNotification, MyProject.Contracts" mode="Wcf" endpoint="SubscriberYAccountEventNotification" />
    </services>
</serviceFactory>

Previously we had used the ServiceFactory for creating individual instances, with code like this:

IEmailUtility email = ServiceFactory.GetService<IEmailUtility>();

As you can see from the configuration above, this would result in a singleton instance of a local class called EmailUtility being returned, but different configuration could result in a WCF proxy being returned instead. It was a simple matter to reuse this same ServiceFactory component to return all configured services matching a specific contract. We used this capability to build the NotificationPublisher class as follows:

public class NotificationPublisher<TInterface, TLog>
    where TInterface : class, IEventNotification<TLog>
{
    public static void OnEventOccurred(TLog value)
    {
        List<TInterface> subscribers = ServiceFactory.GetAllServices<TInterface>();

        foreach (TInterface subscriber in subscribers)
        {
            subscriber.OnEventOccurred(value);
        }
    }
}

With this class in place, all the publisher needs to do to publish an event is call the static OnEventOccurred method on NotificationPublisher, closed over the appropriate generic parameters. Assuming we are using the IAccountEventNotification interface and the above configuration, this would result in the event being sent over WCF to the services defined by the SubscriberXAccountEventNotification and SubscriberYAccountEventNotification endpoints.
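
With the types above, the publishing call would look like NotificationPublisher<IAccountEventNotification, AccountEventLog>.OnEventOccurred(log). One variation worth considering is wrapping the fan-out loop in a TransactionScope, so that the sends to all subscriber queues commit or roll back together. The sketch below shows the idea using in-memory stand-ins: SubscriberRegistry, TransactionalPublisher and RecordingSubscriber are hypothetical names, and the WCF attributes are left out so the sample compiles on its own. With real NetMsmqBinding proxies and exactlyOnce="true", my understanding is that each send enlists in the ambient transaction, which typically requires MSDTC to be running.

```csharp
using System;
using System.Collections.Generic;
using System.Transactions;

// Same shape as the article's contract, minus the WCF attributes.
public interface IEventNotification<TLog>
{
    void OnEventOccurred(TLog value);
}

// Hypothetical stand-in for ServiceFactory.GetAllServices<TInterface>().
public static class SubscriberRegistry<TInterface>
{
    public static readonly List<TInterface> Subscribers = new List<TInterface>();
}

// Variation on NotificationPublisher that publishes inside one transaction.
public static class TransactionalPublisher<TInterface, TLog>
    where TInterface : class, IEventNotification<TLog>
{
    public static void Publish(TLog value)
    {
        using (var scope = new TransactionScope(TransactionScopeOption.Required))
        {
            foreach (TInterface subscriber in SubscriberRegistry<TInterface>.Subscribers)
            {
                // If any send throws, Complete() is never reached and the
                // transaction rolls back when the scope is disposed.
                subscriber.OnEventOccurred(value);
            }

            scope.Complete();
        }
    }
}

// In-memory subscriber used only to demonstrate the fan-out.
public class RecordingSubscriber<TLog> : IEventNotification<TLog>
{
    public readonly List<TLog> Received = new List<TLog>();
    public bool SawAmbientTransaction;

    public void OnEventOccurred(TLog value)
    {
        Received.Add(value);
        SawAmbientTransaction = Transaction.Current != null;
    }
}
```

The trade-off is that a failure to send to one subscriber's queue now prevents delivery to all of them for that event, which may or may not be what you want.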

Configuring the Publisher

The final missing piece on the publisher side is the WCF configuration. As mentioned previously, we chose to use MSMQ to provide reliable, asynchronous message delivery. Programming with MSMQ used to be quite a painful experience, but with WCF the programming model is no different than for any other transport - all you need to do is configure the right bindings. In our case we chose the NetMsmqBinding, which provides full access to WCF functionality for core MSMQ features (as opposed to the MsmqIntegrationBinding, which provides richer MSMQ support at the cost of more limited WCF functionality).

Here's an example of the client-side WCF configuration.

<system.serviceModel>

    <bindings>
        <netMsmqBinding>
            <binding name="TransactionalMsmqBinding" exactlyOnce="true" deadLetterQueue="System" />
        </netMsmqBinding>
    </bindings>

    <client>
        <endpoint name="SubscriberXAccountEventNotification"
            address="net.msmq://localhost/private/SubscriberX/accounteventnotification.svc"
            binding="netMsmqBinding" bindingConfiguration="TransactionalMsmqBinding"
            contract="MyProject.Contracts.IAccountEventNotification" />

        <endpoint name="SubscriberYAccountEventNotification"
            address="net.msmq://localhost/private/SubscriberY/accounteventnotification.svc"
            binding="netMsmqBinding" bindingConfiguration="TransactionalMsmqBinding"
            contract="MyProject.Contracts.IAccountEventNotification" />
    </client>
</system.serviceModel>

There's nothing too fancy in this - the key thing to note is the exactlyOnce="true" setting which is required for transactional queues. The other thing that may stand out is the unusual net.msmq:// addressing syntax, which is used by the NetMsmqBinding in lieu of the more familiar FormatName addresses. The queues themselves are private queues called "SubscriberX/accounteventnotification.svc" and "SubscriberY/accounteventnotification.svc". Why did I give the queues such silly names? Read on...
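
To make the relationship between the two addressing styles concrete, here is a small helper that converts a net.msmq:// URI into the direct format name it corresponds to under the binding's default (native) transfer protocol. MsmqAddress and ToFormatName are hypothetical names for this sketch; WCF does this mapping internally and does not require you to write it.

```csharp
using System;

public static class MsmqAddress
{
    // Maps net.msmq://host/private/queue-name to DIRECT=OS:host\private$\queue-name.
    // Public queues (no "private/" segment) map to DIRECT=OS:host\queue-name.
    public static string ToFormatName(Uri address)
    {
        if (!string.Equals(address.Scheme, "net.msmq", StringComparison.OrdinalIgnoreCase))
        {
            throw new ArgumentException("Expected a net.msmq:// address.", "address");
        }

        // Drop the leading slash so the path starts with the queue segment.
        string path = address.AbsolutePath.TrimStart('/');

        const string privatePrefix = "private/";
        if (path.StartsWith(privatePrefix, StringComparison.OrdinalIgnoreCase))
        {
            // The rest of the path, including any '/', is the queue name.
            path = @"private$\" + path.Substring(privatePrefix.Length);
        }

        return "DIRECT=OS:" + address.Host + @"\" + path;
    }
}
```

For the SubscriberX endpoint above, ToFormatName returns DIRECT=OS:localhost\private$\SubscriberX/accounteventnotification.svc. Note that the forward slash is part of the queue name rather than a path separator.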

Hosting and Configuring the Subscribers

In the past, if building MSMQ clients was annoying, building MSMQ services was a nightmare. You had to build your own host (typically in an NT Service) or make use of the somewhat inflexible MSMQ Triggers functionality. You then had to do a whole lot of work to ensure your service didn't lose messages, and that it wasn't killed by "poison messages", which are messages that will constantly cause your service to fail due to a malformed payload or problems with the service.

Just like on the client side, WCF takes a lot of the hard work away on the service side - but it doesn't directly help with hosting the service and listening to the queue. Luckily this problem is solved beautifully by IIS 7.0 and the Windows Process Activation Service (WAS), which is available on Windows Vista and Windows Server 2008. In a nutshell this enables IIS to listen to MSMQ, TCP and Named Pipes and activate your WCF service, just as IIS 6.0 does for HTTP. If this all sounds great, it is - but be warned that it can be a bit fiddly to set up.

First, you need to set up an "application" in IIS that points to your service, including the .svc file and the web.config file. This is no different to what you'd normally do for an IIS-hosted service exposed over HTTP.

Next, you need to create the message queue - you can do this with the Computer Management console in Vista or Server Manager in Windows Server 2008. The name of the queue must match the application name plus the .svc file name, for example "SubscriberX/accounteventnotification.svc" (this fact is unfortunately not well documented). Make sure you mark the queue as transactional when you create it, as you can't change this later. You'll also need to set permissions on the queue so that the account running the "Net.Msmq Listener Adapter" service (NETWORK SERVICE by default) can receive messages, and whatever account is running the client/publisher can send messages (NETWORK SERVICE by default, too).
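
If you would rather script the queue creation than click through the console, something along these lines should work using the System.Messaging API. This is a sketch rather than what we actually ran: QueueSetup and its method names are hypothetical, and the exact account names you pass in will depend on your environment.

```csharp
using System;
using System.Messaging;

public static class QueueSetup
{
    // Builds the local private queue path: application name + "/" + .svc file name.
    public static string PrivateQueuePath(string application, string svcFile)
    {
        return @".\private$\" + application + "/" + svcFile;
    }

    // Creates the queue as transactional if it is missing, then grants
    // receive rights to the listener account and send rights to the publisher.
    public static void EnsureTransactionalQueue(
        string path, string listenerAccount, string publisherAccount)
    {
        MessageQueue queue = MessageQueue.Exists(path)
            ? new MessageQueue(path)
            : MessageQueue.Create(path, true); // true = transactional

        using (queue)
        {
            // An existing queue cannot be switched to transactional later.
            if (!queue.Transactional)
            {
                throw new InvalidOperationException(
                    "Queue exists but is not transactional: " + path);
            }

            queue.SetPermissions(
                listenerAccount,
                MessageQueueAccessRights.ReceiveMessage | MessageQueueAccessRights.PeekMessage,
                AccessControlEntryType.Allow);

            queue.SetPermissions(
                publisherAccount,
                MessageQueueAccessRights.WriteMessage,
                AccessControlEntryType.Allow);
        }
    }
}
```

A call for the first subscriber might be QueueSetup.EnsureTransactionalQueue(QueueSetup.PrivateQueuePath("SubscriberX", "accounteventnotification.svc"), @"NT AUTHORITY\NETWORK SERVICE", @"NT AUTHORITY\NETWORK SERVICE"), where the fully qualified account name format is an assumption on my part.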

Finally you'll need to configure IIS and WAS to enable the Net.Msmq listener for the web site, and for the specific application (make sure you've installed the Windows components for WAS and non-HTTP activation before you proceed!). The easiest way to do this is using appcmd.exe, which lives in the %windir%\System32\inetsrv folder:

  • appcmd set site "Default Web Site" -+bindings.[protocol='net.msmq',bindingInformation='localhost']
  • appcmd set app "Default Web Site/SubscriberX" /enabledProtocols:net.msmq

With the IIS configuration in place, it's time to make sure the service's WCF configuration is correct. As you might expect, this looks pretty similar to the client configuration you saw earlier.

<system.serviceModel>
    <bindings>
        <netMsmqBinding>
            <binding name="TransactionalMsmqBinding" exactlyOnce="true" deadLetterQueue="System" receiveErrorHandling="Move"/>
        </netMsmqBinding>
    </bindings>

    <services>
        <service name="SubscriberX.NotificationService">
            <endpoint contract="MyProject.Contracts.IAccountEventNotification"
                bindingConfiguration="TransactionalMsmqBinding"
                binding="netMsmqBinding"
                address="net.msmq://localhost/private/SubscriberX/accounteventnotification.svc"/>
        </service>
    </services>
</system.serviceModel>

One thing worth calling out here is the receiveErrorHandling="Move" setting. This innocent-looking attribute probably saved us a month of work, as it tells WCF to move any messages that have repeatedly failed to be processed onto an MSMQ subqueue called "poison" and continue processing the next message, rather than faulting the service. Note that subqueues, as well as the long-awaited ability to transactionally read from a remote queue, are some more new features in MSMQ 4.0 in Vista and Windows Server 2008.
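
How many times is "repeatedly"? That is controlled by a few other binding settings that sit alongside receiveErrorHandling. The sketch below builds an equivalent binding in code to show them together; the class name SubscriberBinding and the specific retry values are assumptions chosen for illustration, not the values we used. The same settings are available in configuration as the receiveRetryCount, maxRetryCycles and retryCycleDelay attributes.

```csharp
using System;
using System.ServiceModel;

public static class SubscriberBinding
{
    // Builds a NetMsmqBinding equivalent to the service-side configuration,
    // with the retry settings made explicit. The numbers are illustrative.
    public static NetMsmqBinding Create()
    {
        return new NetMsmqBinding
        {
            ExactlyOnce = true,                               // transactional queue
            DeadLetterQueue = DeadLetterQueue.System,
            ReceiveErrorHandling = ReceiveErrorHandling.Move, // use the "poison" subqueue
            ReceiveRetryCount = 3,                            // immediate retries per cycle
            MaxRetryCycles = 2,                               // additional cycles after the first
            RetryCycleDelay = TimeSpan.FromMinutes(5)         // wait between cycles
        };
    }
}
```

The WCF poison message documentation describes the total number of delivery attempts as (ReceiveRetryCount + 1) * (MaxRetryCycles + 1), so these values would allow 12 attempts before the message is moved. Longer delays give transient problems, such as a database restart, more time to clear.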

Implementing the Subscribers

The only thing remaining is to implement the subscriber. Most of the code will of course be specific to the business requirements, so I'll only spend time describing the implementation of the service interface. In our system it is very important to make sure that no messages are accidentally lost. Since MSMQ can provide guaranteed delivery it may not be obvious how a message could just vanish. In fact most messages are lost after MSMQ has successfully delivered the message to the service. This can happen if the service receives the message and then fails before the message is successfully processed (possibly due to a bug or configuration problem). The best way of avoiding this problem is to use a transaction that spans both receiving the message from the queue and any business logic that processes it. If anything fails, the transaction will be rolled back - including receiving the message from the queue! If the problem was a temporary glitch, the message may be processed successfully on a retry. If the problem is permanent or caused by a malformed message, the message will be considered to be "poison" after several retries, and as mentioned earlier will be moved to a special "poison" subqueue where it can be dealt with manually by an administrator.

Making all of this work is surprisingly simple, since all of these capabilities are supported by MSMQ (provided you're using transactional queues) and WCF. All that you need to do is decorate your service implementation methods with an OperationBehavior attribute stating that your business logic should enlist in the transaction started when the message was pulled off the queue.

public class NotificationService : IAccountEventNotification
{
    [OperationBehavior(TransactionScopeRequired = true, TransactionAutoComplete = true)]
    public void OnEventOccurred(AccountEventLog value)
    {
        // Business-specific logic
    }
}

Conclusion

While this has been one of the longer blog posts I've done in a while, the solution is extremely powerful and surprisingly simple to implement thanks to some great advances in WCF, MSMQ and IIS. In the past, many people (including myself) have spent months trying to implement pub/sub patterns, often with less-than-spectacular results. But using these new technologies eliminates huge amounts of custom code - in fact the few code and configuration snippets in this post are really all that it takes to make this work.

Comments

  • Anonymous
    May 16, 2008
    Tom, Great post. Very thorough. I was curious if you were using the same kind of API's for regular inter-service communication (one way over MSMQ). There's an open source project (mine, I admit it) that provides a framework for the things you described above. It's called nServiceBus. It supports subscribers joining and leaving dynamically without any changes to the publisher (no config changes either), and has a bunch of other functionality as well. I'd be happy to hear any feedback you might have given the experiences you've outlined in your post. The site is www.nServiceBus.com. Thanks.

  • Anonymous
    May 17, 2008
    Hi Udi - Thanks for sending the link - I'll take a look at nServiceBus - looks very interesting from a quick glance. The pub/sub framework I worked on a few years ago actually included dynamic subscription management. However for both that system and my current project I believe this feature is overkill - new subscribers are only going to appear after careful consideration and testing, and when this happens it's not at all taxing to update a couple of config files. That said I realise some environments are more dynamic, so dynamic subscription management does have its place. On my current project we're using the MSMQ pub/sub for some calls, and synchronous calls (currently using HTTP, but likely to be TCP by the time we deploy) for others. The key consideration is whether the service's response needs to be relayed to the end user immediately or not - we're trying to make as much processing asynchronous as possible but in many cases we found that it just isn't practical. Tom

  • Anonymous
    May 18, 2008
    Hi Udi - On the server side, we're not using any custom APIs - it's just a standard WCF service interface so the programming model is no different for "message bus" services or synchronous services. I'm not sure I understand how you're using one-way services across the board. In situations where the client requires an immediate response (say, to get details of a particular order to show on the UI), synchronous processing seems way simpler. While you can always pair one-way messages into a duplex communication, this creates the need for live endpoints and message correlation on the client side. Am I missing something, or are you talking about a different scenario? Tom

  • Anonymous
    May 20, 2008
    Very interesting. There is another interesting pub/sub implementation on http:/www.distributethis.com that is also WCF based.

  • Anonymous
    June 13, 2008
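    This article is a translation of Tom Hollander's blog post "Building a Pub/Sub Message Bus with WCF and MSMQ". Readers interested in the English original can follow the link below the article. Translator: the EntLib.com open source forum team.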

  • Anonymous
    June 13, 2008
    If you want to get a Chinese version of this post, please access the following link. http://forum.entlib.com/Default.aspx?g=posts&t=57

  • Anonymous
    June 14, 2008
    This is a well written article, clear and informative. Today there are many services being deployed over the internet. The new age of services takes the shape of "Software as a Service". In the message queue arena you can find Amazon Web Services or OnlineMQ.com. In the integration arena you can find CapeClear, and many more.

  • Anonymous
    June 24, 2008
    @Tom, regarding http:/www.distributethis.com Where exactly is that pub/sub implementation? I could not find it on the web page - am I blind? Thanks.

  • Anonymous
    July 11, 2008
    A few weeks ago I posted an article describing how my current team built a publish/subscribe message

  • Anonymous
    December 02, 2008
    .NET/C#/Functional Programming The very useful CR_Documentor 2.0 has been released with Sandcastle Preview and is now open source code Mathew has a boatload of links, resources, slide deck and code in Aspects of Functional Programming in C# Presentation