Office 2010 File Validation

Howdy, I’m David B Heise and I work on the Office Security team responsible for testing Office File Validation (codename: Gatekeeper). There have been some misconceptions about the new file validation feature in Microsoft Office 2010 and I hope to clear these up and explain the why and what.

Why Validate Binary Files?

Over the years the Office binary formats have necessarily evolved and grown in scale and complexity. The reasons for that complexity have been discussed sufficiently elsewhere (see Joel Spolsky's article here), so we won’t go into that discussion; note, however, that these binary formats are very well documented here. We have found that malicious attackers use binary files as an attack vector to infect a targeted user, so we wanted to come up with a way to stop this from happening. Whenever a new Office file format attack is reported to Microsoft, our team checks it against our validation to see how well we’re doing. So far, so very good!

What is The Gatekeeper?

Office File Validation is a feature that was originally introduced in Publisher 2007 to validate Publisher’s PUB files. It verifies that a particular binary file conforms to the application’s expectations. In Office 2010 we’ve expanded this feature significantly to include binary formats for Word, Excel, and PowerPoint. Please note that this feature is for binary formats ONLY (e.g. PUB, DOC, XLS, PPT, etc.). It does not validate the XML-based documents (e.g. DOCX, XLSX, PPTX, etc.), nor does it validate macros or other custom items. What it does validate is the structure of the file. For example, if you have an XLS file that has a FONTINDEX structure with the ifnt value set to 4 (which is an invalid value for that particular item), then it fails validation.

How Does It Work?

Whenever an un-trusted binary file (i.e. not in a trusted location and not a trusted document) is loaded by Word, PowerPoint, or Excel it goes through a check to see if it is a valid file. This check looks at the specific bits of the file that the application is about to parse, in other words the relevant OLESS Streams. If it is determined to be valid, it opens as normal, nothing to see…move along…move along. However if it is found to be invalid, it is sent (by default) to the Protected View.
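The default flow described above can be sketched as a small decision function. This is only an illustration in Python of the behavior described in this post, not the actual Office code. The names OpenMode and decide_open_mode are made up for the example, and it models only file validation, not the other reasons a file may land in Protected View.

```python
from enum import Enum


class OpenMode(Enum):
    NORMAL = "normal"
    PROTECTED_VIEW = "protected view"


def decide_open_mode(in_trusted_location: bool,
                     is_trusted_document: bool,
                     passes_validation: bool) -> OpenMode:
    """Hypothetical model of the default file validation outcome."""
    # Trusted files are not validated at all.
    if in_trusted_location or is_trusted_document:
        return OpenMode.NORMAL
    # Untrusted files that pass validation open as usual.
    if passes_validation:
        return OpenMode.NORMAL
    # Untrusted files that fail validation go to Protected View by default.
    return OpenMode.PROTECTED_VIEW
```

Note that the trust check comes first, which is why a file you have chosen to trust is not validated again.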

image

If you click on that text you will be taken to the Backstage view where you will have the option to open the file in the full application experience. Please note that this is a trust decision that will mark this particular file as a trusted file, and as such, will NOT be validated the next time you open this file.

After you’re done with the file and close the application you may see a prompt like this:

image

This prompt appears at most once every two weeks (per application) and gives you the option to send the failing file (or files) to us via Windows Error Reporting. Of course you can remove a file or two if you don’t want to share that information, but by sending us the file you let us analyze it further to improve Office File Validation.

How do I control this?

Via Policy

We realize that many administrators (or security-conscious users) may not like the idea of opening a file that fails validation, so there is a group policy to control the default action when a file fails validation. The policy is located under the application’s “Options\Security\Trust Center\Protected View” in the group policy templates, and it is a per-application setting.

image

Via the Registry

There are several registry keys that control various aspects of Office File Validation.

Common Keys

HKCU\Software\Microsoft\Office\14.0\Common\Security\FileValidation\ReportingInterval - This is a DWORD that controls the number of days between showings of the dialog to send files to Windows Error Reporting.

HKCU\Software\Microsoft\Office\14.0\Common\Security\FileValidation\DisableReporting - This is a DWORD that, if set to 1, will disable showing the dialog (and thus the sending of files) to Windows Error Reporting.

Application Specific Keys

For these examples I’m going to use “Excel”, but they also work for “PowerPoint” and “Word”.

HKCU\Software\Microsoft\Office\14.0\Excel\Security\FileValidation\EnableOnLoad – This is a DWORD that, if set to 0, tells Office not to validate files.

HKCU\Software\Microsoft\Office\14.0\Excel\Security\FileValidation\DisableEditFromPV – This is a DWORD that, if set to 1, will not allow files that fail validation to be edited.
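If you need to roll these values out to several machines, one option is to generate a .reg file. The following is a minimal Python sketch that only builds the file text from the key and value names listed above. The helper names (reg_section, build_reg_file) and the sample values (30 days, editing disabled) are arbitrary choices for illustration, so review the output before importing it anywhere.

```python
BASE = r"HKEY_CURRENT_USER\Software\Microsoft\Office\14.0"


def reg_section(subkey: str, values: dict[str, int]) -> str:
    """Render one registry key with DWORD values in .reg file syntax."""
    lines = [f"[{BASE}\\{subkey}]"]
    for name, value in values.items():
        # .reg files write DWORDs as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines)


def build_reg_file(app: str, reporting_interval_days: int,
                   disable_edit_from_pv: bool) -> str:
    """Build .reg text for one common and one application-specific value."""
    sections = [
        reg_section(r"Common\Security\FileValidation",
                    {"ReportingInterval": reporting_interval_days}),
        reg_section(rf"{app}\Security\FileValidation",
                    {"DisableEditFromPV": int(disable_edit_from_pv)}),
    ]
    header = "Windows Registry Editor Version 5.00"
    return header + "\n\n" + "\n\n".join(sections) + "\n"


# Example values only: 30 days between prompts, editing disabled for Excel.
print(build_reg_file("Excel", 30, True))
```

Passing “Word” or “PowerPoint” as the application name produces the equivalent section for those applications.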

Excel Specific Keys

HKCU\Software\Microsoft\Office\14.0\Excel\Security\FileValidation\PivotOptions – This is a DWORD that controls specific options around validating pivot caches (for performance reasons) in files that have them.

0 = Never validate any pivot cache
1 = Validate the pivot cache in the following cases:
    (1) The file is opened from the internet, and the platform marks the file locally as having come from the internet.
    (2) The file is a Microsoft Outlook email attachment.
    (3) The user specifically opened the file in Protected View.
    (4) The file is opened from a known "unsafe location" locally where internet content is cached, or from any special user-defined untrusted location, unless Protected View unsafe locations are disabled via (a different) registry key.
    (5) The file is opened and the pivot cache is parsed on load.
2 = Always validate all pivot caches
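To make the three PivotOptions values concrete, here is a rough Python model of the rules above. It is a sketch of the documented behavior, not Excel’s implementation: the OpenContext fields are my own paraphrase of the five cases, and the “unless unsafe locations are disabled” exception is assumed to be folded into from_unsafe_location by the caller.

```python
from dataclasses import dataclass


@dataclass
class OpenContext:
    """Hypothetical flags, one per case listed for PivotOptions = 1."""
    from_internet: bool = False
    outlook_attachment: bool = False
    opened_in_protected_view: bool = False
    from_unsafe_location: bool = False
    pivot_cache_parsed_on_load: bool = False


def should_validate_pivot_cache(pivot_options: int, ctx: OpenContext) -> bool:
    if pivot_options == 0:
        # Never validate any pivot cache.
        return False
    if pivot_options == 2:
        # Always validate all pivot caches.
        return True
    if pivot_options == 1:
        # Validate only when at least one of the listed cases applies.
        return any((ctx.from_internet,
                    ctx.outlook_attachment,
                    ctx.opened_in_protected_view,
                    ctx.from_unsafe_location,
                    ctx.pivot_cache_parsed_on_load))
    raise ValueError(f"unknown PivotOptions value: {pivot_options}")
```

For example, with PivotOptions set to 1, a workbook opened from an ordinary local folder with none of the flags set would skip pivot cache validation, while the same workbook arriving as an Outlook attachment would not.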

Via Scripting

For custom solutions built on top of Office, a few properties have been added to the Application objects that will disable file validation for that session. There is also an extra option for Excel to control the validation of pivot caches (i.e. the cached data stored in the file for pivot tables and charts). Here’s a PowerShell script example showing how to set these two options for Excel (the FileValidation property also applies to Word and PowerPoint):

$excel = New-Object -ComObject Excel.Application
# PowerShell does not know the Office constant names, so use their numeric values.
# Valid values for FileValidation are:
#   msoFileValidationDefault = 0
#   msoFileValidationSkip    = 1
$excel.FileValidation = 1   # msoFileValidationSkip
# Valid values for FileValidationPivot are:
#   xlFileValidationPivotDefault = 0 (do whatever you'd normally do, i.e. follow registry & default settings)
#   xlFileValidationPivotRun     = 1 (validate all pivot caches)
#   xlFileValidationPivotSkip    = 2 (don't validate any pivot caches)
$excel.FileValidationPivot = 2   # xlFileValidationPivotSkip

That’s great, but does it Cook?

We have made specific strides to ensure that file validation is very fast. Yes, it now takes more time to open a file, but we’re generally talking milliseconds more. In fact, you’d be hard pressed to find a normal-sized file that takes more than a second to validate; most files validate in the 1 to 100 millisecond range. Of course if the file is huge and super complex and already takes an hour to open…then yes, it will take more than a second, but you probably aren’t going to notice anyway. In addition, if the file takes more than 5 seconds to validate (so we’re talking very complex files here), we give you the option to cancel and go straight to the Protected View. After all, we couldn’t just let you open it normally, because then hackers would just make a file that was really complex…then take over your machine, which is exactly what this feature is trying to stop.

image

In addition, for any file that takes a long time to validate (whether it passes validation, fails validation, or validation is skipped), you will also be shown the same Windows Error Reporting prompt as for a failing file, giving you the option to send us the file for further analysis.

In a Nutshell

In talking with the developers one day we imagined a conversation that went like this:

“So what have you been working on?”

“Office File Validation”

“What’s that?”

“A check on an Office file to make sure it’s ok”

“So, you spent the last two years writing a Boolean function?”

“Well…um…yes, but it’s an important function!”

At the end of the day, Office File Validation is really just a Yes/No function that tells the application whether a file is valid or not, but that’s a really important function! In fact, it is also a really complex function, as anyone who’s ever even peeked into the file format specifications can attest. So there you have it, in a nutshell. Office File Validation will check your binary file to ensure the significant bits of your file are valid, and if you think we’re wrong you can either trust the file or let us know!

Comments

  • Anonymous
    January 01, 2003
    This is going to sound like a weird question, but is there any way to see or create a document that fails validation?  I'm creating some documentation and would really like the ability to take higher quality screen shots as well as being able to verify Group Policy settings are correct.

  • Anonymous
    December 16, 2009
    ok, the book on the table

  • Anonymous
    December 16, 2009
    Great article! :D

  • Anonymous
    December 16, 2009
    I want to use Microsoft Office 2010

  • Anonymous
    December 16, 2009
    RSS

  • Anonymous
    December 16, 2009
    Good evening

  • Anonymous
    December 16, 2009
    What determines whether a document in Protected mode appears with an amber (warning) Trust bar or a red (critical) Trust bar?

  • Anonymous
    December 17, 2009
    I have been receiving "Microsoft Office Activation Wizard" and don't know what I'm supposed to do. Owen

  • Anonymous
    December 17, 2009
    I just read this: http://arstechnica.com/microsoft/news/2009/12/microsoft-why-office-2010-wont-support-windows-xp-64-bit.ars. I am very angry. XP 64 is 4 years newer than 32-bit XP! I read in comments section that Access and Excel have dependency on MSOLAP. Please then support it in 32-bit mode. THIS IS XP!! And THIS IS OFFICE we're talking about. At least make it install and run but unsupported from technical support scenario. I really want Word's document map improvement and PowerPoint transitions.

  • Anonymous
    December 18, 2009
    CB asked why you get the amber warning versus the red warning bar. You get the amber warning if the file is being opened in protected view simply because it came from a risky location - an email attachment, an internet site, a file type blocked by your file block settings. We don't know for sure anything is wrong with the file, but we're opening it in the sandbox just in case. You get the red warning if file validation detects that the file might be malicious. In this case we've scanned the file and found that there is content in the file that looks like it should not be there in a valid file - you should be very sure that the file is safe before you let it out of the sandbox. Hope that helps...

  • Anonymous
    December 18, 2009
    Ben, thank you for the clarification. Another thing. The post refers to validation of older binary file formats. Are .xlsb files excluded from validation because they cannot be infected like .xls files?

  • Anonymous
    December 18, 2009
    I can't validate Microsoft Office Professional 2010 and there are 19 days left. What do I do?

  • Anonymous
    December 19, 2009
    CB - short answer, yes, xlsb is a binary version of the xml format and so isn't quite as vulnerable as the other older file formats. In 2010 we've focused on the formats that we've seen be the most vulnerable to attack. In the future we expect to add more validation to more formats. Also, I wouldn't go so far as to say that xlsb or any of the new formats cannot be infected. They can, we just haven't seen very many exploits against them yet, and we think it is harder to infect them than the older formats.

  • Anonymous
    December 24, 2009
    But you have now made it so hard for the end user that I gave up after 6 hours trying to save or even print a document and I am a Tech. How do you think the average end user would fare?  Why do IT people fail to understand what the end user wants? Oh, by the way, they want it to work. End of story.

  • Anonymous
    December 24, 2009
    It is nice to see Microsoft institute these features as they are a step in the right direction for data security at the application level.  Over time I would like to see the scope increase to include all of the application files associated with Office 2010, higher confidence checks, and improved automated handling when a file is determined to be suspect.  A separate OS-based application which can show a quick snapshot to users which files fail confidence checks on their system, would be a nice addition to empower security savvy customers to proactively be notified and manage such outliers.  

  • Anonymous
    December 27, 2009
    I would second Matthew's suggestion of extending this to an O/S based validation scheme to cover all files which could have the potential to carry third-party attacks.

  • Anonymous
    December 29, 2009
    We use third party excel generators such as nativeexcel.net to create our excel exports during the day.  These third party generators create reports that fail your tests.  We have literally thousands of excel workbooks (set of 100+ generated every day and stored for audit purposes). Bringing up this error message would only infuriate our users.  Especially since people use this data to get their jobs done. We are interested in a way to completely disable this feature on all of our workstations.  Is there a way to completely disable this "feature", rather than hitting the registry on every machine?  Possibly a switch on the install or a group policy entry.  We just want to open the workbooks with no warnings and have no banner across the top.

  • Anonymous
    January 18, 2010
    Brendon: Could you provide some additional details on why it took you so long to print. Were you in Protected View and trying to print? Also may I ask whether you are running Beta1 or Beta2. In Beta2 we did make some improvements around making it easy to print and save out of PV.

  • Anonymous
    February 28, 2010
    great article... :D

  • Anonymous
    April 11, 2010
    Yea, I think a way to get detailed output from this feature, which developers whose code writes these file formats could use, would be a great idea. Provide enough info so that it can easily be looked up in the now-public Office binary format specs, so developers can see what went wrong and fix the code. Because face it, only recently have these specs been available to the public. For years many developers had to take pains to reverse-engineer the Office binary formats and had to, for example, guess what values are valid in fields.

  • Anonymous
    April 26, 2010
    "codename: Gatekeeper" Oh that's cute, the GateKeeper was a rogue security suite in the 1995 movie "The Net"