Gaining experience with encryption and key rotation
This year I’ve had the privilege of expanding my skill set in a field that I find fascinating, but in which I am woefully under-qualified to work: encryption.
I like encryption. I’ve liked it ever since my 4th year in university, when I learned all about the mathematical properties behind it, how symmetric and asymmetric ciphers work, and how digital signatures work.
I feel pretty good about my abilities in the basics of encryption.
What I don’t feel good about is key management.
I find that a lot of discussions about encryption do a good job of explaining how encryption works and about the algorithms, and that’s exactly the stuff I already know. What I find they don’t talk about is key management: what does a key lifecycle look like? How do you rotate keys? How do you store keys securely? How do you deploy keys securely?
For example, when I was practicing my own encryption for an app I was working on, all I did to encrypt stuff was store the key in a file on my computer, and then store the same key in the remote server in another file and set the permissions properly. I trust myself; if I want a new key, I just create one and copy/paste it.
That’s fine for myself, but what about real life? Large companies need ways to deal with encryption keys beyond copying and pasting them to files. Keys are supposed to be kept away from developers so that only Operations can access them. So how do developers create and deploy keys to begin with?
I’m not saying that there aren’t books that explain this. I’m sure there are, but most books are light on the details of the key lifecycle.
Fortunately, this year I have been working on three separate features that use encryption and require a key lifecycle, so I have been gaining a little bit of experience. Plus, I was involved in a fourth separate key rotation which I’ll explain below.
First, the features I am working on.
1. Boomerang
This is a feature that inserts a cryptographic tag into an outbound email message. This requires a secret key on the mail servers that must be periodically updated.
Boomerang does not perform encryption. Instead, it generates a one-way hash and recalculates it when required to verify data integrity.
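To make the "generate a hash, then recalculate it later" idea concrete, here is a minimal sketch. It assumes the tag is a keyed hash (HMAC-SHA256) over a message ID and recipient; the post does not say which algorithm or which message fields Boomerang actually uses, and the function names `make_tag` and `verify_tag` are made up for illustration.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message_id: str, recipient: str) -> str:
    """Return a hex tag that binds the given fields to the secret key.

    The choice of fields (message ID and recipient) is an assumption.
    """
    data = f"{message_id}\n{recipient}".encode("utf-8")
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_tag(key: bytes, message_id: str, recipient: str, tag: str) -> bool:
    """Recalculate the tag and compare it in constant time."""
    expected = make_tag(key, message_id, recipient)
    return hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8"))


# Hypothetical usage: the key would live on the mail servers, not in code.
demo_key = secrets.token_bytes(32)
demo_tag = make_tag(demo_key, "<1234@example.com>", "user@example.com")
print(verify_tag(demo_key, "<1234@example.com>", "user@example.com", demo_tag))
```

Nothing is encrypted here: anyone can read the fields, but only a holder of the key can produce a tag that verifies, which is why the key still has to be protected and rotated.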
2. Time-of-Click URL protection
Time-of-Click URL protection uses a symmetric encryption key. The feature is not yet available but the way it works is that important components of the message are encrypted into a rewritten URL. This is done so an administrator can later search logs to see if someone clicked on a malicious URL.
This feature requires encryption keys deployed in two places – on the mail servers (the same as Boomerang) and also on the web servers (to decrypt and validate data).
The encryption keys similarly require a key lifecycle and must be periodically rotated.
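One way to picture periodic rotation is a small key ring: new data always uses the newest key, every protected value carries a key ID, and a few older keys are kept for a while so that data created before the rotation can still be read. The sketch below is only an illustration of that idea under those assumptions. The `KeyRing` class, the `keep_retired` policy, and the key ID format are all hypothetical, and it does not show how the real features store or distribute keys.

```python
import secrets
from typing import Dict, Optional, Tuple


class KeyRing:
    """One active key for new data plus a few retired keys kept for reading."""

    def __init__(self, keep_retired: int = 1) -> None:
        self.keep_retired = keep_retired      # assumed retention policy
        self._keys: Dict[str, bytes] = {}     # insertion order: oldest first
        self._next_id = 1

    def rotate(self) -> str:
        """Create a new active key and drop keys older than the retention window."""
        key_id = f"k{self._next_id}"
        self._next_id += 1
        self._keys[key_id] = secrets.token_bytes(32)
        while len(self._keys) > self.keep_retired + 1:
            oldest = next(iter(self._keys))
            del self._keys[oldest]
        return key_id

    def active(self) -> Tuple[str, bytes]:
        """Return (key_id, key) to use for protecting new data."""
        if not self._keys:
            raise LookupError("no key has been created yet")
        key_id = next(reversed(self._keys))
        return key_id, self._keys[key_id]

    def lookup(self, key_id: str) -> Optional[bytes]:
        """Return the key for an ID found in old data, or None if it has expired."""
        return self._keys.get(key_id)
```

With `keep_retired=1`, a URL rewritten under key `k1` can still be handled after the first rotation to `k2`, but not after the rotation to `k3`. How long old keys should be retained depends on how long the protected data needs to stay readable.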
Both of these features use a single key across all customers. The keys are not stored in clear text; instead, we use the Windows Data Protection API (DPAPI) to store the keys securely in a central location, then roll them out securely to all the mail servers, where they are again protected using DPAPI.
In other words, there is little resemblance to my copy/pasting a secure key on my own machine where I can view it whenever I want in clear-text. That is not possible here.
3. Outbound DKIM
Outbound DKIM is very different from both Boomerang and Time-of-Click protection.
DKIM requires customers to upload their own private DKIM keys to Office 365. During mail flow, we need the DKIM keys in clear text in order to affix a digital signature to the message.
Rather than storing all of the keys on the mail servers, DKIM keys are stored in a key vault. Office 365 does not store the clear-text keys; it only stores the key IDs and encrypted keys (in fact, the clear-text keys are never stored anywhere). During mail flow, a secure call (using an authentication token) is made to the key vault to decrypt the private DKIM key so that the message can be signed. After signing finishes, the key is discarded.
Outbound DKIM is both a little simpler and a little more complex than either of the previous two features. Key rotation is required to protect the private DKIM keys, rather than to sign new data. It also requires management of the authentication token.
Finally, I had the opportunity to perform a real-life key rotation earlier this year.
4. DKIM key rotation for Microsoft
Microsoft uses ExactTarget for some of its email campaigns. Earlier this year, I discovered that one of the DKIM keys was old and could be rotated to a longer key.
I worked with a contact at ExactTarget to create a new DNS record for a subdomain within microsoft.com and publish the public key, while they updated the corresponding private key. We tested it, verified it worked, and we now have the key working in production.
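For readers who have not published a DKIM key before, the DNS side looks roughly like the sketch below. A DKIM public key lives in a TXT record at `<selector>._domainkey.<domain>`, and a longer key (a 2048-bit RSA key is typically around 400 base64 characters) has to be split into strings of at most 255 characters inside that record. The selector, domain, and helper names here are hypothetical; this is not the actual record or tooling we used.

```python
from typing import List


def dkim_record_name(selector: str, domain: str) -> str:
    """DNS name where receivers look up the public key for this selector."""
    return f"{selector}._domainkey.{domain}"


def dkim_record_value(public_key_b64: str) -> str:
    """TXT record content for an RSA public key (base64, no PEM header/footer)."""
    return f"v=DKIM1; k=rsa; p={public_key_b64}"


def split_txt(value: str, limit: int = 255) -> List[str]:
    """Split a long TXT value into chunks that fit one DNS character-string each."""
    return [value[i:i + limit] for i in range(0, len(value), limit)]


# Hypothetical example with a placeholder key of realistic length.
name = dkim_record_name("selector2", "mail.example.com")
chunks = split_txt(dkim_record_value("A" * 392))
print(name)
print(" ".join(f'"{chunk}"' for chunk in chunks))
```

Publishing the new key under a new selector lets the old and new records coexist, so mail signed with the old key still verifies while the sender switches over.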
Whew. That’s a lot of work with keys.
I still don’t feel very confident with key management. I feel like I am missing something (actually, many things) but I don’t know what. I also feel like I don’t understand the existing process, which I helped define, as well as I should.
But at least I am learning.
Comments
Anonymous
February 15, 2016
Thanks for this - a nice overview of part of the key management problem space. I'm curious - for your DKIM problem you say you used a key vault. Was that something hand-rolled? A 3rd party tool?
Anonymous
February 19, 2016
Azure Key Vault (or Azure Key Management System) is something that one of the teams here at Microsoft built to do key management in a cloud hosted service. As far as I know, we were the first team to make use of it in this way (storing private keys in the cloud).