Multitenant STS and Token Validation
Writing this blog post or downloading Adera Episode 3? Why do I even ask, I already know the answer :-)
Today I want to tell you about one of the patterns behind Windows Azure Active Directory. I call it the multitenant STS pattern, which is a pretty descriptive name IMO and seems to have caught on internally, but as usual (and like everything discussed here) that’s not an official denomination. Also, I want to introduce you to the ValidatingIssuerNameRegistry, a new WIF component we introduced to (quietly!) help you take better advantage of this new construct.
Ceci N’est Pas un Federation Provider
If you followed the Windows Azure Active Directory developer preview epopee so far, you already know that among its many great features is the ability to support multitenant applications. As shown in the tutorial here, you can easily offer access to the same SaaS application to multiple directory tenants. That comes with the standard laundry list of things you need to take care of when developing multitenant applications: home realm discovery, tenant isolation enforcement, provisioning, and similar (remember FabrikamShipping? I am still super-proud of it :-)).
For many already versed in the claims-based arts, that might look like a sure telltale sign that Windows Azure Active Directory is, or at least includes, an STS in the federation provider role.
You’ll be surprised to read that in fact that is not the case. To be more precise: yes, a federation provider does come into play, but that is NOT what allows your application to leverage Windows Azure Active Directory to handle authentication from multiple organizations. What allows that is the multitenant nature of Windows Azure AD itself.
Federation Provider
By now a lot of you will be wondering: what is this “federation provider” thingy he is blabbering about? Without rewriting the many good explanations that abound on the internet, here’s a quick primer (with some oversimplification for the sake of readability). If you already know what an FP is, please jump to the next section (without collecting the $200).
You are already familiar with the notion of identity provider, or IP, or IdP: it is an entity that knows about a certain user population, has the ability to authenticate users, and does so via well-known protocols. The main tool used by the IP to exercise its prerogative is the Security Token Service, or STS: in this specific case, an STS is an endpoint that is capable of receiving user credentials, validating them, looking up user attributes and packaging them in a security token (SAML, JWT… remember those?).
Say that a developer wants to restrict access to a certain app to the user population controlled by the IP. The developer will configure the application to trust the IP, which means that the app will rely (hence ‘RP’) on the IP to authenticate users and faithfully represent the outcome in a security token. Wow, I can’t believe how long it has been since I explained this :-) I guess I didn’t realize how ubiquitous claims-based identity became in the last few years. Amazing. Anyway: picture.
Now, imagine that your app wants to work with many different user populations: that means establishing trust with many IPs. That can be onerous, given that IPs might all be very different from each other and require different logic for integration; furthermore, onboarding new ones would likely disrupt operations and require a major overhaul. In other words, in the general case that’s something you’d often be happy to offload somewhere else. Well, as it turns out, that’s easy.
The trust establishment process described for the IP can be iterated. An STS A can be considered an app in itself; and as such, it can trust another IP (or, more precisely, its STS B). So now you’d have an application which offloads authentication to STS A, and STS A in turn would offload authentication to STS B. But that’s great! Now you can have STS A also trust STS C, D and E without really having to touch the application’s code. STS A is an intermediary which handles trust relationships: when an entity plays that role, we say that it is a federation provider (FP). That’s largely because originally the various A, B, C represented different organizations, and a trust between organizations is a federation (though once we have an STS that can act as an intermediary we can apply its capabilities also in situations where there are no organizational boundaries).
Examples of STSes with FP capabilities abound. ACS2.0 and ADFS v2 are the first that come to mind. For example, by trusting an ACS2.0 namespace your app just needs to deal with ACS integration: only one endpoint to connect to, only one key for verifying incoming tokens, and so on; ACS takes care of contacting Facebook, Google, Microsoft Account, Yahoo!, OpenID and ADFS2 instances, bearing the brunt of speaking the different protocols they require.
The custom of using an FP façade when there are multiple STSes is so common that, out of the box, applications using WIF expect to trust only one STS at a time.
Multitenant STS
As it exists today, Windows Azure AD is not really an FP for your applications. If anything, it is an IP; or rather, it defines a space of IPs. Too abstract? Keep reading.
Windows Azure AD offers organizations the infrastructure for running their own IP within it. Tenants are modeled after a template which represents a generic identity provider, with parameterized endpoints for its various protocol heads. The STS endpoint corresponding to a certain organization is obtained by instantiating its tenant ID in the parameter of the endpoint template. The issuing infrastructure is shared, as you would expect from a multitenant system, but from the protocol perspective there are as many IPs as there are tenants in Windows Azure AD.
For example: say that the template for the WS-Federation endpoint is https://accounts.accesscontrol.windows.net/[[TENANTID]]/v2/wsfederation. Say that the tenant ID of TreyResearch is 929bfe53-8d2d-4d9e-a94d-dd3c121183b4. An application that wants to authenticate users from TreyResearch will have to send signin messages to https://accounts.accesscontrol.windows.net/929bfe53-8d2d-4d9e-a94d-dd3c121183b4/v2/wsfederation.
Want to obtain the STS endpoint of Fabrikam? Find the tenant ID, instantiate it in the template, and you’ve got it. Ah, and finding the tenant ID is not hard: you can simply look in the entityID of the metadata document, whose address also follows an endpoint template but accepts the more human-friendly domain name (e.g. https://accounts.accesscontrol.windows.net/treyresearch1.onmicrosoft.com/FederationMetadata/2007-06/FederationMetadata.xml).
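To make the "instantiate the template" step concrete, here is a minimal sketch in Python. The WS-Federation template and the TreyResearch values are the ones quoted above; the metadata template is only inferred from the example URL, and the function names are made up for illustration, so treat this as a sketch rather than an official API.

```python
# Endpoint template quoted in the text; [[TENANTID]] is the placeholder.
WSFED_TEMPLATE = "https://accounts.accesscontrol.windows.net/[[TENANTID]]/v2/wsfederation"

# Assumption: metadata template inferred from the example URL in the text.
METADATA_TEMPLATE = (
    "https://accounts.accesscontrol.windows.net/[[TENANTID]]"
    "/FederationMetadata/2007-06/FederationMetadata.xml"
)


def wsfed_endpoint(tenant_id: str) -> str:
    """Hypothetical helper: build the WS-Federation endpoint for one tenant."""
    return WSFED_TEMPLATE.replace("[[TENANTID]]", tenant_id)


def metadata_url(tenant: str) -> str:
    """Hypothetical helper: build the metadata address from an ID or a domain name."""
    return METADATA_TEMPLATE.replace("[[TENANTID]]", tenant)


if __name__ == "__main__":
    # TreyResearch, using the tenant ID and domain name from the text.
    print(wsfed_endpoint("929bfe53-8d2d-4d9e-a94d-dd3c121183b4"))
    print(metadata_url("treyresearch1.onmicrosoft.com"))
```

The only thing that changes from one tenant to the next is the value substituted for the placeholder, which is exactly what makes the endpoints "as many IPs as there are tenants" from the protocol perspective.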
Validating Tokens from a Multitenant STS
The multitenant STS pattern is a clever way of offering organizations their very own IP instance, to manage at their whim, while still leveraging a shared, consistent infrastructure.
One aspect of the Windows Azure AD STS infrastructure is that it uses the same certificate for every tenant. Tokens from Windows Azure AD are all signed with the same certificate, a bit like the checks from a WoodGrove Bank booklet are all printed on the same hard-to-falsify patterned sheets.
Per the same analogy: when you receive a bank check as payment I am sure you’ll find it reassuring to confirm that it is not printed on a crumpled post-it. However, that’s not the main factor that makes you decide that the check is good: you normally associate the validity of the check with the entity that is writing it for you. Somebody giving you the impression of being a con artist would probably not convince you to accept a check, even if it’s written on a mint condition official booklet page.
What does it mean in terms of tokens? It means that just because a token is coming from Windows Azure AD does not mean that your app should accept it. You should accept it if it comes from Windows Azure AD AND it has been issued by the tenant you are in business with.
And how do you verify it? Simple. You need to check that the thumbprint of the certificate used to sign the token corresponds to the one you saved in your config, AND you need to verify that the <Issuer> element (for SAML; substitute the element with equivalent semantics for other formats) corresponds to the value you saved at trust establishment time, which typically means the entityID read from the metadata and containing the ID of the specific tenant you want.
There’s more! Say that you are selling access to your application to multiple Windows Azure AD tenants. You still want to validate the same thumbprint, but now you want to be able to look up the incoming <Issuer> value in a list of multiple candidates, based on the IDs of all the tenants you are in business with.
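The two checks can be sketched in a few lines of Python. This is only an illustration of the logic, not WIF code: the IncomingToken type and the function name are hypothetical, the thumbprint and issuer values are borrowed from the config samples in this post, and the sketch assumes the token signature has already been verified cryptographically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncomingToken:
    """Hypothetical stand-in for an already signature-checked token."""
    signing_thumbprint: str  # thumbprint of the certificate that signed the token
    issuer: str              # value of the <Issuer> element (or its equivalent)


# Thumbprint saved at trust establishment time (shared by all tenants).
TRUSTED_THUMBPRINT = "3464C5BDD2BE7F2B6112E2F08E9C0024E33D9FE0"

# One entry per tenant you are in business with.
TRUSTED_ISSUERS = {
    "https://sts.windows.net/26115327-57c9-4bc0-bcc4-3c12bf507377/",
}


def is_token_trusted(token: IncomingToken,
                     thumbprint: str = TRUSTED_THUMBPRINT,
                     issuers: frozenset = frozenset(TRUSTED_ISSUERS)) -> bool:
    """Accept only if BOTH the signing certificate and the issuer are known."""
    thumbprint_ok = token.signing_thumbprint.upper() == thumbprint.upper()
    issuer_ok = token.issuer in issuers
    return thumbprint_ok and issuer_ok


if __name__ == "__main__":
    good = IncomingToken(TRUSTED_THUMBPRINT,
                         "https://sts.windows.net/26115327-57c9-4bc0-bcc4-3c12bf507377/")
    # Same certificate, but a tenant we never established trust with.
    stranger = IncomingToken(TRUSTED_THUMBPRINT,
                             "https://sts.windows.net/929bfe53-8d2d-4d9e-a94d-dd3c121183b4/")
    print(is_token_trusted(good), is_token_trusted(stranger))
```

Supporting one more tenant means adding one more entry to the issuer set; the thumbprint check stays exactly the same.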
In the WIF classes you find out of the box in .NET 4.5, verifications of that kind are normally performed by the IssuerNameRegistry; and specifically, by its concrete implementation ConfigurationBasedIssuerNameRegistry. However, ConfigurationBasedIssuerNameRegistry does not actually allow you to specify multiple issuer names for a single thumbprint. Furthermore, ConfigurationBasedIssuerNameRegistry does not validate the incoming <Issuer> element; the name attribute in the config element is just an alias for the IP using the corresponding certificate, as established at trust establishment time. Its only effect is to determine the value of the Issuer property of the claims in the ClaimsPrincipal.
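To see why that is a problem with a shared certificate, here is a toy Python model of the behavior just described. It is not the actual WIF implementation, and the function name is invented for the example; it only mimics the "thumbprint in, alias out" lookup.

```python
from typing import Optional

# Toy model of a thumbprint-to-name table: one alias per thumbprint.
NAME_BY_THUMBPRINT = {
    "3464C5BDD2BE7F2B6112E2F08E9C0024E33D9FE0":
        "https://sts.windows.net/26115327-57c9-4bc0-bcc4-3c12bf507377/",
}


def lookup_issuer_name(signing_thumbprint: str, token_issuer: str) -> Optional[str]:
    """Return the configured alias for the signing certificate.

    token_issuer is deliberately ignored, mirroring the behavior described
    in the text: only the certificate decides whether the token is accepted.
    """
    return NAME_BY_THUMBPRINT.get(signing_thumbprint.upper())


if __name__ == "__main__":
    # Token issued by a different tenant, but signed with the shared certificate.
    other_tenant = "https://sts.windows.net/929bfe53-8d2d-4d9e-a94d-dd3c121183b4/"
    name = lookup_issuer_name("3464C5BDD2BE7F2B6112E2F08E9C0024E33D9FE0", other_tenant)
    # The lookup succeeds and reports the configured alias, not the real issuer.
    print(name)
```

In this model a token from any tenant passes, and its claims end up labeled with the alias from the config rather than with the tenant that actually issued it.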
Introducing ValidatingIssuerNameRegistry
Well, well, well. Today we released a new NuGet containing an assembly with the class ValidatingIssuerNameRegistry, a new IssuerNameRegistry implementation which does validate incoming <Issuer>s and offers a new configuration element schema to accommodate multiple issuer names. ValidatingIssuerNameRegistry also comes with a nice extensibility model, which allows you to plug in your own validation logic if, for example, your list of trusted tenants grows and shrinks dynamically and is hence better stored outside of web.config.
If you studied the multitenant application tutorial, you know that there we solved the issue by providing a custom handler for SAML tokens; however, the shortcoming of that solution is that it is scoped to only that token type, which is one of the main reasons why we opted for an IssuerNameRegistry implementation as a more strategic solution.
Here’s how a basic ConfigurationBasedIssuerNameRegistry config entry maps to the new ValidatingIssuerNameRegistry format.
The old one:
<issuerNameRegistry
    type="System.IdentityModel.Tokens.ConfigurationBasedIssuerNameRegistry, System.IdentityModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089">
  <trustedIssuers>
    <add thumbprint="3464C5BDD2BE7F2B6112E2F08E9C0024E33D9FE0"
         name="https://sts.windows.net/26115327-57c9-4bc0-bcc4-3c12bf507377/" />
  </trustedIssuers>
</issuerNameRegistry>
And the new one:
<issuerNameRegistry
    type='System.IdentityModel.Tokens.ValidatingIssuerNameRegistry, System.IdentityModel.Tokens.ValidatingIssuerNameRegistry'>
  <authority name='aad'>
    <keys>
      <add thumbprint='3464C5BDD2BE7F2B6112E2F08E9C0024E33D9FE0'/>
    </keys>
    <validIssuers>
      <add name='https://sts.windows.net/26115327-57c9-4bc0-bcc4-3c12bf507377/'/>
    </validIssuers>
  </authority>
</issuerNameRegistry>
We now have a new construct, authority. An authority can have multiple issuer values (the entries under validIssuers), to meet the requirements described earlier. Although the structure suggests it can also have multiple keys, in the case of Web SSO there will always be just one. Easy, right? In some future post (or at the next revamp of the multitenant app tutorial) I’ll discuss how to write your custom extension for reading issuers from external storage.
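To show how the authority construct carries several tenants at once, here is a small Python sketch that reads a config fragment shaped like the new format and applies both checks. It is an illustration of the data model only, not how the NuGet assembly works internally: the function names are hypothetical, and the second validIssuers entry is an assumed example built from the TreyResearch tenant ID used earlier.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional, Set, Tuple

# Same shape as the new config element; the second issuer is an assumed example.
CONFIG = """
<issuerNameRegistry type='System.IdentityModel.Tokens.ValidatingIssuerNameRegistry, System.IdentityModel.Tokens.ValidatingIssuerNameRegistry'>
  <authority name='aad'>
    <keys>
      <add thumbprint='3464C5BDD2BE7F2B6112E2F08E9C0024E33D9FE0'/>
    </keys>
    <validIssuers>
      <add name='https://sts.windows.net/26115327-57c9-4bc0-bcc4-3c12bf507377/'/>
      <add name='https://sts.windows.net/929bfe53-8d2d-4d9e-a94d-dd3c121183b4/'/>
    </validIssuers>
  </authority>
</issuerNameRegistry>
"""

Authorities = Dict[str, Tuple[Set[str], Set[str]]]


def load_authorities(xml_text: str) -> Authorities:
    """Map each authority name to its (thumbprints, valid issuers) sets."""
    root = ET.fromstring(xml_text.strip())
    authorities: Authorities = {}
    for authority in root.findall("authority"):
        keys = {add.get("thumbprint", "").upper()
                for add in authority.findall("keys/add")}
        issuers = {add.get("name", "")
                   for add in authority.findall("validIssuers/add")}
        authorities[authority.get("name", "")] = (keys, issuers)
    return authorities


def find_authority(authorities: Authorities,
                   thumbprint: str, issuer: str) -> Optional[str]:
    """Return the authority that vouches for this (thumbprint, issuer) pair."""
    for name, (keys, issuers) in authorities.items():
        if thumbprint.upper() in keys and issuer in issuers:
            return name
    return None


if __name__ == "__main__":
    authorities = load_authorities(CONFIG)
    print(find_authority(authorities,
                         "3464C5BDD2BE7F2B6112E2F08E9C0024E33D9FE0",
                         "https://sts.windows.net/929bfe53-8d2d-4d9e-a94d-dd3c121183b4/"))
```

Because the lookup only needs a set of thumbprints and a set of issuers, the same check would work if those sets were loaded from a database instead of a config file, which is the scenario the extensibility model is meant for.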
As we update our tools, we will include logic to automatically generate ValidatingIssuerNameRegistry config elements and add a reference to the corresponding NuGet in your projects. But even before that, if you work with Windows Azure AD it is important that you use the ValidatingIssuerNameRegistry instead of the in-box ConfigurationBasedIssuerNameRegistry. As you can see, that’s actually pretty straightforward! :-)