How to retain the alternate data stream associated with a Word document when converting it to the newer file format
Recently I encountered a not so common [;-)] scenario while working with Word 2007:
If you have a Word 97-2003 format document which also has an alternate data stream associated with it, converting this document to the newer file format (DOCX file format) in Word 2007 may result in the loss of the associated alternate data stream (ADS).
After researching this, I found that this behavior is by design in the Office 2007 Convert operation. The reason becomes clearer once you look at what Word 2007 actually does when a user selects this option.
When you open a Word 97-2003 file format document in Word 2007, you’ll see a new menu item, “Convert”, in the Office Pearl menu.
When you select this menu item, Word shows a warning that converting the document to the newest format might change the document layout. After clicking OK in that message box, you would assume that the file has now been converted to the desired format, but wait, that’s not entirely true [:-O].
Instead of converting the document immediately, Word just marks the current document for conversion to the newest format on the next save/close operation. You can either choose to keep working with the document or close it immediately. In either case, you get a message box asking “Do you want to save the changes”, and clicking “Yes” actually converts the document before closing it.
Here, converting the document to the newest format means that Word actually “creates” a new document (DOCX format) and “deletes” the older one. As you might have guessed by now, the alternate data stream belongs to the original file, so it is lost when that file is deleted.
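To see why the stream disappears, it helps to remember that on NTFS a named stream is addressed as `path:streamname` and belongs to one specific file. The following is a minimal Python sketch, not something taken from Word or from the sample project. The function names and the stream name `meta` are hypothetical, and it assumes the file sits on an NTFS volume on Windows, where `open()` accepts the `path:stream` form.

```python
def stream_path(file_path: str, stream_name: str) -> str:
    """Build the NTFS 'file:stream' path that addresses a named stream."""
    return f"{file_path}:{stream_name}"


def read_stream(file_path: str, stream_name: str):
    """Return the bytes of a named stream, or None if it cannot be opened."""
    try:
        with open(stream_path(file_path, stream_name), "rb") as f:
            return f.read()
    except OSError:
        # Either the file or the named stream does not exist.
        return None
```

Calling `read_stream("report.doc", "meta")` before the conversion and `read_stream("report.docx", "meta")` after it would be one way to confirm the loss: the second call is expected to return `None`, because the new file was created without any named streams.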
Now, how do we retain the alternate data stream associated with the older file? Here are two options; you can choose either of them based on your scenario:
- Repurpose the Convert button implementation using CustomUI XML
- Provide a custom button to the user for performing this conversion
Option #1
Here, we need to repurpose the Office Ribbon command whose ID is “UpgradeDocument”. I will not go into the details of creating a project with custom ribbon XML here; you can refer to the following articles for in-depth documentation:
- Customizing the 2007 Office Fluent Ribbon for Developers (See parts 1, 2 and 3)
- Temporarily Repurpose Commands on the Office Fluent Ribbon
- Extend Office 2007 With Your Own Ribbon Tabs And Controls
Here is my CustomUI XML for repurposing the Convert button in the Office menu:
<?xml version="1.0" encoding="UTF-8"?>
<customUI xmlns="http://schemas.microsoft.com/office/2006/01/customui" onLoad="Ribbon_Load">
  <commands>
    <!-- Route the built-in Convert command to our own callback -->
    <command idMso="UpgradeDocument" onAction="MyConvertHandler"/>
  </commands>
</customUI>
Option #2
To implement this, we follow the same path as in Option #1 and add a custom UI XML part, but instead of adding a <commands> node, we add a custom button that the user clicks to convert the document.
For more information on how to add custom buttons to the ribbon, you can refer to the following articles:
- Customizing the 2007 Office Fluent Ribbon for Developers (Part 1 of 3)
- Customizing the 2007 Office Fluent Ribbon for Developers (Part 2 of 3)
- Customizing the 2007 Office Fluent Ribbon for Developers (Part 3 of 3)
You may also ask: why not just intercept the Convert document event and add custom logic to retain the ADS (pretty obvious, huh!)? Unfortunately, there is no such event available in Word for this operation. We can, however, intercept other events such as DocumentBeforeSave/DocumentBeforeClose and build logic around them to retain the alternate data stream of the original file.
Implementation
OK, now that we have a way to intercept the Convert operation, all we need is the logic to retain the ADS for our file. We will implement this as follows:
- Read the alternate data stream associated with the original file
- Convert the document programmatically
  - Call Document.Convert()
  - Call Document.Save() to force Word to perform the conversion
- Create a new alternate data stream and associate it with the new file
- Copy the content of the old stream into the new one
Please see the attached sample project for an example of this implementation using Option #1.
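Independent of the attached sample project, the order of the steps can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the stream is addressed with the NTFS `path:streamname` syntax, the stream name is known in advance, and `convert` is a hypothetical callback standing in for whatever drives Document.Convert() and Document.Save() and reports the path of the new file. None of these names come from Word or from the sample.

```python
import os
from typing import Callable


def stream_path(file_path: str, stream_name: str) -> str:
    """Build the NTFS 'file:stream' path that addresses a named stream."""
    return f"{file_path}:{stream_name}"


def convert_keeping_stream(old_path: str, stream_name: str,
                           convert: Callable[[str], str]) -> str:
    """Convert a document and carry one named stream over to the new file.

    `convert` is a hypothetical callback: it receives the old path, performs
    the conversion, and returns the path of the newly created file.
    """
    # Step 1: read the stream while the original file still exists.
    with open(stream_path(old_path, stream_name), "rb") as src:
        payload = src.read()

    # Step 2: run the conversion; the original file may be deleted by it.
    new_path = convert(old_path)

    # Steps 3 and 4: create the stream on the new file and copy the content.
    with open(stream_path(new_path, stream_name), "wb") as dst:
        dst.write(payload)
    return new_path
```

The important design point is the ordering: the stream has to be read into memory before the conversion runs, because afterwards the original file, and the stream with it, may no longer exist. This sketch handles a single stream with a known name; discovering streams whose names are not known in advance needs a Win32 API such as FindFirstStreamW and is outside its scope.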
Things to try
Excel 2007 and PowerPoint 2007 exhibit the same behavior as Word 2007, i.e., the associated ADS is also lost when an open workbook/presentation is converted to the latest format. The good part is that you can implement a similar approach for them too, to retain the ADS.