New hotfix available for intermittent OCSP revocation check failures on domain controllers
A new hotfix for Cryptnet.dll on Windows Server 2008 R2 has been released. It addresses a scenario in which a Domain Controller (or any service that performs frequent certificate revocation checks, such as NPS or ISA Server) could get into a state where revocation checks start failing.
The revocation check failures on the DC would in turn lead to smartcard logon failures for end users.
The trigger was an OCSP location stamped on a certificate in the chain becoming temporarily (or permanently) unavailable. Revocation checks would then start failing for that certificate, even if multiple CRLs stamped on the certificate were reachable and current.
The technical details are that, under the right conditions, the DC would be using a back-off value for the CRL, and this back-off value was being refreshed every time a similar request for the same CRL was received. The back-off value is normally used to avoid overloading the server hosting the CRL: during the back-off time, no further attempts are made to download the CRL from that server. In this case, because the value kept being refreshed, the back-off time never expired, and as a result the DC never attempted to download the CRL while the error conditions were present.
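To make the timer behaviour concrete, here is a minimal Python sketch of a back-off that is restarted by every incoming request, compared with one that is left to expire. It is only a toy model and not the actual Cryptnet.dll logic: the class, the function, the URL, and the 60-second back-off and 10-second request interval are all hypothetical values chosen for illustration.

```python
# Toy model only: names, URL and timings are assumptions, not Cryptnet.dll internals.
URL = "http://crl.example.invalid/ca.crl"  # hypothetical CRL location


class BackoffCache:
    def __init__(self, backoff_seconds, refresh_on_request):
        self.backoff_seconds = backoff_seconds
        self.refresh_on_request = refresh_on_request
        self.blocked_until = {}  # url -> time at which the back-off expires

    def record_failure(self, url, now):
        # A failed retrieval starts the back-off window for this URL.
        self.blocked_until[url] = now + self.backoff_seconds

    def may_attempt(self, url, now):
        expiry = self.blocked_until.get(url)
        if expiry is None or now >= expiry:
            # Back-off has expired (or never existed): a download may be tried.
            self.blocked_until.pop(url, None)
            return True
        if self.refresh_on_request:
            # Problem behaviour: each request during the back-off restarts the timer.
            self.blocked_until[url] = now + self.backoff_seconds
        return False


def first_retry_time(refresh_on_request, request_interval=10,
                     backoff_seconds=60, duration=600):
    """Return the first time a download attempt is allowed, or None if never."""
    cache = BackoffCache(backoff_seconds, refresh_on_request)
    cache.record_failure(URL, now=0)
    for now in range(request_interval, duration + 1, request_interval):
        if cache.may_attempt(URL, now):
            return now
    return None


print(first_retry_time(refresh_on_request=True))   # None: the timer never expires
print(first_retry_time(refresh_on_request=False))  # 60: retry once the back-off ends
```

In this model, requests that arrive more often than the back-off interval keep pushing the expiry forward, which matches why a busy DC would be affected: frequent revocation checks for the same CRL are exactly what keeps the timer alive.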
Rebooting the DC would resolve the problem until the next time the OCSP location became unavailable.
However, the issue would reproduce only sporadically, because it was masked whenever a valid cached CRL was available locally.
The update is available at http://support.microsoft.com/kb/2666300 and is recommended for any DCs servicing smartcard logons where a certificate in the chain contains an OCSP path.