1 + 2 = 1?

Does 1 plus 2 equal 3?   After last week’s sometimes acrimonious discussion about formulas in ODF, you may be glad to hear that IBM and Microsoft appear to agree on that answer to this simple question.  But OpenOffice.org is not so certain – maybe the answer is just 1 sometimes – and the question itself turns out not to be so simple after all.  Let me explain.

The State of ODF Formula Interoperability Today

What is the current reality of ODF formula interoperability?  Understanding the status of the ODF ecosystem will help clarify the set of issues and options that we faced when making the tough decisions we had to make about how to best support formulas in ODF spreadsheets.

For this example, I’ll use the latest released versions of two well-known ODF implementations: IBM Lotus Symphony (version 1.2, download here) and OpenOffice.org (version 3.1, download here).  I want to talk about current reality, so I’m not using any outdated versions of software (the OO build I’m using, for example, was released in the last week).  I also stayed away from unreleased or private beta versions that might become available sometime in the future, and I used the default settings for each application.

First, I fired up Symphony 1.2, and followed these steps:

  • Enter a numeric value of 1 in cell A1.
  • Format cell A2 as text, right-justified, then enter a 2 in that cell.
  • In cell A3, enter the formula =A1+A2.

In Symphony 1.2, here’s what I see:

image

After saving this spreadsheet as an ODS file, I open it in OpenOffice.org 3.1 and see this:

image

Clearly this is a problem.  The exact same data, in the exact same spreadsheet, when operated on with the exact same formula, provides different results.

Some might be tempted to say that formatting a cell as text and then using it in a calculation is dumb.  And I’d agree that there are few people who ever do such a thing intentionally.  But in a large complex spreadsheet, with thousands of cells involved in complex calculations, it’s easy to make mistakes like this.  In fact, if you’ve spent any amount of time at all creating complex spreadsheets, I’ll bet that on more than one occasion you’ve wasted a bunch of time trying to debug a problem that turned out to be caused by such mistakes; I know I sure have.

Similar issues arise with boolean values – what does it mean to “sum” a column of cells that includes both numeric values and boolean values?  Not all spreadsheet implementations agree on the answer to that question, either. This can create interactions between formatting and calculating – change the format of some cells, and the totals change in your spreadsheet.  Most users find such behavior very confusing, to say the least.

One of the most interesting things I found in my testing of these two implementations was that although they write different markup for formulas, the exact same interoperability problem occurs regardless of which application is used to create the spreadsheet.

If you create the spreadsheet in Symphony 1.2, as I did, the table:table-cell element has a table:formula attribute with a value of "=[.A1]+[.A2]" .  And this formula will yield a result of 3 in Symphony and 1 in OpenOffice.org, as described above.

If instead you create the same spreadsheet in OpenOffice.org 3.1, when you open it in Symphony 1.2 you'll see of:=A1+A2 in cell A3.  But after you manually correct the formula, this spreadsheet, too, will yield a result of 3 in Symphony and 1 in OpenOffice.org.

So these two ODF implementations do not have predictable formula interoperability, regardless of where you start.  And these are not obscure implementations – they are the latest released versions of the implementations from IBM and Sun, the two companies that together chair the ODF TC.  Even if both companies released fixes tomorrow, there will still be many copies of the current non-interoperable versions of these applications in use for a long time to come.  This is the state of formula interoperability among ODF spreadsheets today.

Fixing the Problem

This difference in behavior is a well-known issue among those who work with spreadsheet formulas.  As Rob Weir said three years ago “Automatic string conversions considered dangerous. They are the GOTO statements of spreadsheets.”  (One of the ODF TC members even has that line in his email auto-signature.)

How to manage string conversions is far from the only problem with spreadsheet interoperability across vendors (and even across versions of the same product in some cases). The current draft OpenFormula specification contains 254 notes (by my count) about other issues similar to this one.

The OpenFormula sub-committee of the ODF TC has worked hard to address this.  Here is an excerpt from the draft OpenFormula specification (emphasis added):

6.2.4 Conversion to Number

If the expected type is Number, then if value is of type:

  • Number, return it.
  • Logical, return 0 if FALSE, 1 if TRUE.
  • Text: The specific conversion is implementation-defined; an application may return 0, an error value, or the results of its attempt to convert the text value to a number (and fall back to 0 or error if it fails to do so). Applications may apply VALUE() or some other function to do this conversion, should they choose to do so. Conversion depends on the actual locale the application runs in, especially if group or decimal separators are involved. Note that portable spreadsheet files cannot depend on any particular conversion, and shall avoid implicit conversions from text to number.

After OpenFormula is approved and published, this approach, with its explicitly defined concept of “portable spreadsheet files,” will allow more predictable and consistent interoperability for ODF spreadsheet users.

But in the current environment, with no standardization of formula markup across major ODF implementations, users who want to avoid interoperability problems need to stick to a very conservative strategy.  As Burton Group analyst Guy Creese said last week:

“… this in-between time (between the OpenOffice.org de facto standard and the wait for the officially approved 1.2 standard) means there isn't one way to handle this problem. The vendors would like you to believe that there is (their way), but in reality there isn't. Ultimately, this will resolve itself over time. ODF 1.2 will be approved, and there will finally be an approved standard that everyone--IBM, Microsoft, Sun (Sun/Oracle)--can follow.

Until then, if an enterprise does want to use ODF, the best strategy is to stick with one productivity suite as a way to avoid these interoperability problems. That way, even if formula support is idiosyncratic, it at least will be consistent within the enterprise.”

 How Excel 2007 SP2 Handles ODF Formulas

The question of how to handle formulas in SP2’s ODF implementation was one of the tough decisions we faced in our ODF implementation.  We had made conformance to the ODF 1.1 specification a top priority, and yet the spec doesn’t specify a formula language. 

It seemed clear to us that we couldn’t simply omit the  namespace, as the current version of Symphony does.  That would be in violation of Section 8.1.3 of the ODF specification, where it says “Every formula should begin with a namespace prefix specifying the syntax and semantics used within the formula.”

What about using the same of: namespace that OpenOffice.org 3.1 uses?  We saw a couple of pretty serious problems with that approach as well:

  • It would not be interoperable with some existing implementations, such as the widely  used current version of IBM Lotus Symphony.
  • It is based on a draft specification that has not been finalized or approved as a standard, and therefore could still change.

What about using the oooc: namespace that OpenOffice.org 3.1 writes when you choose its ODF 1.1 compatibilty mode? That syntax is on its way out for everyone, and we saw no point creating yet another new implementation of something that is clearly going to be deprecated soon.  And it doesn’t solve the problem: OpenOffice.org 3.1 writes the oooc: namespace prefix in its ODF 1.0/1.1 compatibility mode, and those spreadsheets still can yield different results in OpenOffice.org and Symphony.

After a robust internal debate on the topic, it became clear what we needed to do to apply the first two of our five prioritized guiding principles for Office’s ODF implementation:

  • Adhere to the ODF 1.1 standard
  • Be Predictable
  • Preserve User Intent
  • Preserve Editability
  • Preserve Visual Fidelity

As we discussed in several DII workshops starting back in July of 2008 (with multiple ODF implementers and multiple ODF TC members in attendance), these guiding principles are in priority order. When we could not achieve them all, we choose the top ones first.

To adhere to the ODF 1.1 standard, we begin formulas with “a namespace prefix specifying the syntax and semantics used within the formula.”  Excel 2007 SP2 uses an msoxl prefix and write the formula attribute like this:

table:formula="msoxl:=A1+A2"

That fulfills our goal of adhering to the standard since ISO/IEC 29500 defines both the syntax and semantics of this namespace.  Then, to provide a predictable user experience across all spreadsheets, we elected to support this namespace, and only this namespace.

If I move my spreadsheet from one application to another, and then discover I can’t recalculate it any longer, that is certainly disappointing.  But the behavior is predictable: nothing recalculates, and no erroneous results are created.

But what if I move my spreadsheet and everything looks fine at first, and I can recalculate my totals, but only much later do I discover that the results are completely different than the results I got in the first application?

That will most definitely not be a predictable experience.  And in actual fact, the unpredictable consequences of that sort of variation in spreadsheet behavior can be very consequential for some users.  Our customers expect and require accurate, predictable results, and so do we. That’s why we put so much time, money and effort into working through these difficult issues.

What Does Excel 2007 SP2 Do With the Example Above?

The answer is that we agree with IBM: 1 + 2 = 3.

Excel does the same thing Symphony 1.2 does, converting the text “2” to a numeric 2 and using that value in the calculation, so that the total is 3.  Excel does this because this type of automatic conversion – which has been a popular Excel feature for a very long time – is allowed by the semantics of the formula markup language Excel uses.

The formula markup that Excel uses is based on the formula language defined in ECMA-376 and ISO/IEC 29500, and here’s what it says about type conversion in Section 18.17.2.6 (Types and Values) of Part 1 of IS29500:

An implementation is permitted to provide an implicit conversion from string-constant to number. However, the rules by which such conversions take place are implementation-defined. [Example: An implementation might choose to accept "123"+10 by converting the string "123" to the number 123. Such conversions might be locale-specific in that a string-constant such as "10,56" might be converted to 10.56 in some locales, but not in others, depending on the radix point character. end example]

Excel’s approach to formulas in ODF, as well as our approach to other difficult issues, is completely public and fully documented in the implementer notes for SP2.  As the note for this issue explains:

The standard defines the attribute table:formula, contained within the element <able:table-cell>, contained within the parent element <office:spreadsheet table:table-row>

This attribute is supported in core Excel 2007. This attribute is supported in core Excel 1. When saving the Table:Formula attribute, Excel precedes its formula syntax with the "msoxl" namespace. 2. When loading the attribute Table:formula, Excel first looks at the namespace. If the namespace is "msoxl", Excel will load the value of Table:formula as a formula in Excel. 3. When loading the Table:formula attribute, if the namespace is missing or unknown, the Table:formula attribute is not loaded, and the value "Office:value" is used instead. If the result of the formula is an error, the element <text:p> will be loaded and mapped to an Error data type in Excel. Error types not supported by Excel are mapped to #VALUE!

The Question of Syntax

I’d like to also address the issue of cell reference syntax in the ODF 1.1 specification, since that was also a topic of much discussion on several blogs last week.  I’ll start with some quick background for those who don’t wallow in standards documents for a living.

The English language is inherently an ambiguous thing,  and great literature sometimes uses the ambiguity to good effect.  Words can have more than one meaning, and verb phrases might be intended to go with one noun or with another, as in famously ambiguous job references like “You will be very fortunate to get this person to work for you."

Writers of technical standards like to use rules and procedures that are designed to avoid this sort of problem.  These rules, which place requirements on the  use of words like should, shall, must and may,  tend to result in a stilted writing style which gets tedious fast, but reduces the need to agree on what is “obvious” or “implied” when interpreting the meaning of the text later.

A standards document is said to contain both normative language and informative language.   The things you must do to comply with a standard are supposed to be in the normative part,  and things like examples and introductions are informative.   So that everyone can be  sure about which parts are which,  the normative parts use specific phrases like “shall” and “shall not” to clearly label the things the standard actually requires you to do.

So the debate about Excel 2007 SP2’s cell reference syntax comes down to whether the few sentences in the ODF 1.1 spec which cover this were meant to be informative or normative.  The section of ODF 1.1  in question does not use the words shall or must.   It introduces the topic with the phrases “typically” and “can include”.   In our reading of it,  this language makes that part of the specification informative, stating no requirements for implementers.

The ODF 1.1 spec is casual about applying the rules of normative language, and as a result ODF 1.1 has more than its share of ambiguity.  The ODF 1.2 draft, however, is already much improved in this regard, mainly through the great work of ODF editor Patrick Durusau.  The OpenForumla draft specification is extremely careful in its use of normative language, and that will help implementers a great deal when they sit down to write their software.

When Will Office Support OpenFormula?

This question has come up on some blogs, so I’d like to address it here as well.

The real question is “when will Office support ODF 1.2,” since OpenFormula is simply a part of the ODF 1.2 specification.  And the answer is that we don’t know yet, because nobody knows yet when ODF 1.2 will be published as an OASIS or ISO standard.  As I said in the previous post, “we will look closely at Open Formula when it becomes a standard and make a decision then about how to best proceed.”  (It looks like IBM has committed to supporting ODF 1.2 and OpenFormula in late 2010.)

In the meantime, if you want to use Excel 2007 SP2 to edit documents that contain formulas from OpenOffice.org or Symphony, and preserve those formulas through editing sessions, and you understand the risk that the results might not be the same, you have a couple of free options.

The Open XML / ODF Translator Add-Ins for Office can be used with Office 2007 SP2, and as covered on the translator team blog, supports a variety of formula namespaces.

The Sun ODF Plugin provides yet another option, and apparently works with SP2.

Comments

  • Anonymous
    May 09, 2009
    PingBack from http://microsoft-sharepoint.simplynetdev.com/1-2-1/

  • Anonymous
    May 09, 2009
    Nachdem einige bemängelt haben, dass Microsoft nicht standardisierte Dinge so umgesetzt hat, wie es der Standard vorgibt (mit eigenen Xml-Namensräumen), hat Microsoft einen Kurzkommentar zu Formeln und ODF (Das OpenOffice-Format) abgegeben: http://blogs

  • Anonymous
    May 10, 2009
    The comment has been removed

  • Anonymous
    May 10, 2009
    Notes on Document Conformance and Portability #3

  • Anonymous
    May 10, 2009
    Doug, It is the second time on the past days that you talk about discussion with "multiple ODF TC members" on workshops about your formula implementation, as if the ODF TC approves the mess you guys did. As a TC member, I'm offended with this tentative to confuse the audience, and I would like o state that I never accepted any Microsoft invitation to attend interoperability workshops, because I knew you guys could use my participation on those events as if it was my approval of your stupid decisions, and it seems that I was right. To clarify things, and to be fair with othe TC members, I would like to ask you to name all the TC members that attended those workshops, to allow us to verify if they approve (as you're trying to say) Microsoft's decisions. If you can't (or don't want) to do that, please stop talking about the ODF TC this way, and don't involve TC members with Microsoft business decisions (on this episode, yours 'throw mud on everyone' strategy is going far from the acceptable limits). This also raise additional suspicious about your real intention on the ODF TC (at least to me).

  • Anonymous
    May 10, 2009
    The comment has been removed

  • Anonymous
    May 10, 2009
    Citing Alex Brown: "When [OpenFormula] is published vendors will cease to have an excuse to create non-interoperable implementations, at least in this area. " All this Mahugh "essay" with screenshots! ( boy, it is really hard to justify why didn't you put a square bracket in a formula uh? ) can be represented with just one word: Excuse. Keep playing the standards game Microsoft, it is funny , isn't it?

  • Anonymous
    May 10, 2009
    The comment has been removed

  • Anonymous
    May 10, 2009
    Doug, you identified an interoperability problem between Symphony and OpenOffice.org.  OK, it is well known that this is not the only one: this is why OASIS created the "ODF Interoperability and Conformance" TC, in order to identify and solve the interoperability issues. But I really cannot see how an existing interoperability issue justifies that Microsoft creates much more serious interoperability problems. It looks like your goal is in fact to create as big as possible a mess, surely not to achieve any interoperability goal.

  • Anonymous
    May 10, 2009
    Although not an OASIS member at the time, I was at that July 30, 2008 DII meeting in Redmond where the principles were first discussed and there was discussion of how they were looking at handling ODF table:formula values for Excel 2007 SP2.  My personal observations on that are posted at <http://orcmid.com/blog/2008/08/microsoft-odf-interoperability-workshop.asp>.   I'm not sure anyone there flinched about what Excel would be doing, although I see that I was hoping for more that what was described.   I'm not so sure it matters that there were any ODF TC members present at this or the previous Redmond DII, and I don't know that they would have seen a basis for objecting.   Although we can argue all we want about whether or not a cell-reference scheme is mandated, the proposed accomodation that Rob Weir insists on is at the level of practical, partial interoperability, not ODF 1.1 conformance.  And to do that you'd have had to introduce yet-another namespace or else only accept the other .ods ones and not produce any of them, something I somehow doubt Microsoft wants to be seen doing.   As painful as the current solution is, I don't see a mutual way out absent a standard OpenFormula that all can start to converge on for increasing interoperable spreadsheet usage with ODF.  The harmonization of OOXML SpreadsheetML formulas and OpenFormula then becomes a meaningful opportunity as well.  I am impressed by the degree to which OpenFormula work seems poised for that.

  • Anonymous
    May 10, 2009
    The comment has been removed

  • Anonymous
    May 10, 2009
    The comment has been removed

  • Anonymous
    May 10, 2009
    @Doug Mahugh: The crux of the problem can easily be summarized as follows: The whole point of the namespace prefix on a formula, is that a spreadsheet can consist of cells, whose individual formulas are from a heterogenous namespace (or set of namespaces). The whole point is, that an ODF document is flexible enough to encode multiple application-dependent behaviors, so that they are preserved across edits by multiple application. Yes, individual cells whose namespace is foreign to a particular application MIGHT be incorrectly (or inconsistently) calculated. However, native namespace formulas, by definition, will not. Having an ODF spreadsheet with data cells A1, A2, and A3, whose contents are arbitrary values, and cells B1 and B2, whose formulas are of:=A1+A2, and msoxl:=A1+A2, should not only be legal, but should survive unaltered, across being opened, edited (in an effectively trivial fashion), and saved, by ANY conforming ODF application. Yes, the value of B1 might be wrong, or B2, or even both, in some third party application. But back in their native application, the respective values WILL be correct. Having Application X incorrectly calculate the behavior for Namespace Y, is a problem for Application X to address, at least for the most popular values of Y. This can be seen as a differentiator between applications. Ultimately, this should be resolved by OpenFormula, and by extension ODF 1.2, but in the meantime is something that can be gleaned by taking a "primer" spreadsheet of base values/types and formulas, and comparing the expected versus observed values for each formula in each namespace. By encoding the anomalies and backwards engineering the "problems", it is possible to universally compute the behavior of foreign namespaces - at which point perfect encode/decode/calculate is trivially possible for each such namespace. An internal structure consisting of pairs (foreign namespace formula, converted formula) would allow for efficient internal operation of the calculation engine, and preserve formulas on export. If I can write this up in ten minutes, how is it that a small army of developers can't or won't do this at Microsoft? The number of extant namespaces is quite small, and the necessary logic to identify only the "exceptions" cannot be that large. If 124 is the number of footnotes where there is a potential problem, there can be at most N x 124 entries in a mapping table - where N is certainly not ridiculously large. Any comments on this suggestion/observation? Name Withheld

  • Anonymous
    May 10, 2009
    Rob - There is no reliable interoperability for spreadsheet formulas between any of the major implementations of OASIS ODF 1.1.  People like James Clark, Tim Bray and David Wheeler were warning of this outcome years ago, when you and others decided to leave formulas out of the standard.  Is the formula handing in the current releases of Symphony and OpenOffice.org, as demonstrated above, the sort of "compatibility" you're asking us to participate in?  That may be good enough for you and Maya, but in my experience, most users expect the same data and the same formula to return the same result consistently. Fmerletti - I'm surprised to hear you refer to the current ODF formula interoperability between Symphony and OpenOffice.org, which I've demonstrated above, as "state-of-the-art formula handling in ODF implementations."  I take it, then, that you agree with Rob that such inconsistencies meet the "good enough" bar for most users? Anonymous - your thoughts on how to apply reverse-engineering techniques to ODF formulas to infer the behavior of foreign namespaces is a great example of the root problem here: ODF formula interoperability doesn't exist, and requires reverse-engineering of every implementation by every other implementation.  Standards are supposed to prevent that situation, not require it.

  • Anonymous
    May 10, 2009
    As someone working in organisations working with spreadsheet I can easily confirm that is is better to have spreadsheet that show correct values then to have  spreadsheet capable of calculations but without correct values. So MS Office implementation choice might not be ideal but suggested alternatives preferring formulas to be left intact even though this might confront users with incorrect data are totally unacceptable.

  • Anonymous
    May 10, 2009
    The comment has been removed

  • Anonymous
    May 11, 2009
    The comment has been removed

  • Anonymous
    May 11, 2009
    The five prioritized guiding principles are, according to you:    * Adhere to the ODF 1.1 standard    * Be Predictable    * Preserve User Intent    * Preserve Editability    * Preserve Visual Fidelity Doing reverse-engineering would achieve all five. So why did Microsoft choose not to do this? Yes, Standards are supposed to prevent this. However, ODF 1.1 specification does not include formulas. ODF 1.2 is expected to do so. Neither of these precludes doing the backwards-engineering solution. Do you have any explanation for Microsoft's choice in this regard? Name Withheld

  • Anonymous
    May 11, 2009
    Luc, you are describing additional interoperability problems beyond the one I used as an example.  You're correct, bugs in OO create additional challenges, but that doesn't change the fact that the formulas saved by the current versions of Symphony and OpenOffice.org are not interoperable with one another. Anonymous, your arguments in favor of a reverse-engineering approach continue to reinforce the point that spreadsheet formula interoperability does not exist in ODF documents unless such techniques are used.  We agree on that point.

  • Anonymous
    May 11, 2009
    The comment has been removed

  • Anonymous
    May 11, 2009
    Name Withheld asked: "Do you have any explanation for Microsoft's choice in this regard?" The reason is simple: they have found a bug in OpenOffice with 0.00001 of probability in real use cases ( an user has to explicetely format a number as text and expect good results in formulas ).   So, if there are this kind of bugs in ODF implementations , then there is no need to make hard work and put square brackets in formulas to be interoperable. Amusingly sad.

  • Anonymous
    May 11, 2009
    Doug, what I described above is not "additional interoperability problems", it is a simply an implementation-defined feature as specified by OpenFormula, that you distort as an interoperability feature. OpenFormula says about conversion from Text to Number: "2.2 Variances (Implementation-defined, Unspecified, and Behavioral Changes) [...] Some conversions between types are not required to be automatic. In particular, applications may, but need not, perform automatic conversion of text in a cell when it is to be used as a number (see Auto Text to Number). Creators of portable documents shall use conversion functions (such as VALUE, DATE, DATEVALUE, TIME, and TIMEVALUE) when this specification does not require such conversions. Note that some of these functions are locale-dependent, and their use can result in documents not working the same way in different locales." "6.2.4 Conversion to Number If the expected type is Number, then if value is of type: [...] Text: The specific conversion is implementation-defined; an application may return 0, an error value, or the results of its attempt to convert the text value to a number (and fall back to 0 or error if it fails to do so). Applications may apply VALUE() or some other function to do this conversion, should they choose to do so. Conversion depends on the actual locale the application runs in, especially if group or decimal separators are involved. Note that portable spreadsheet files cannot depend on any particular conversion, and shall avoid implicit conversions from text to number." (emphasis is mine) So, in fact OOo is already implementing the recommendations of OpenFormula to ensure portable spreadsheet files !  The issue, from an OpenFormula portable documents point of view, is in fact in Excel and Symphony, not in OOo. In conclusion: nice try, but should find a real interoperability issue between Symphony and OOo to be credible.  Don't despair: I'm sure that there are such interoperability issues, and it would be nice if you can help debugging OOo and Symphony ;-)

  • Anonymous
    May 11, 2009
    @dmahugh: The reason OO.org 3.1 doesn't calculate the formula the same way is that Symphony 1.2 didn't properly output to the OpenOffice.org formula language, not because it's impossible to have interoperable formulas with ODF 1.1. Had Symphony outputted "=[.A1]+VALUE([.A2])", OO.org would have calculated correctly.   The flaw is in the assumption that "=A1+A2" in Symphony and Excel is the same as "oooc:=[.A1]+[.A2]" in the ODF formula attribute, which is wrong. In reality, you have convert from the native formula language used in the program to the document formula language, and that is not accomplished by merely changing the syntax anymore that translating from Spanish to English can be accomplished by a change of punctuation. Interoperability is a formidable challenge and should not be taken lightly, but you place the greatest challenge in the wrong place. You assume that interoperability is difficult because of the format does not permit it, which is in error. For instance, outputting "oooc:=[.A1]+VALUE([.A2])" would have been understood by the majority of ODF 1.1 capable spreadsheet programs and would work as expected. The greatest challenge for interoperability is ensuring that you translate your data in a way that can be understood by the greatest number of applications. In this regard, I do not see Microsoft putting forth a good faith effort. What I see is spec legalism, observations of obscure interoperability bugs, and calls for other vendors to implement Microsoft's formula language so that Microsoft can become interoperable with other vendors. The simple fact is that Microsoft chose ease of implementation and convenience over interoperability. If the level of interoperability we want is unreasonable, just say so, but don't waste our time with petty arguments about how different office suites handle some obscure situation nobody cares about.

  • Anonymous
    May 11, 2009
    Fmerletti, there are hundreds of these sorts of issues in the OpenFormula spec.  If you'd like to learn more about them, you can download the latest draft here: http://www.oasis-open.org/committees/documents.php?wg_abbrev=office-formula Luc, the continued hair-splitting over how to interpret the spec makes my point.  As some practical advice on how to approach such debates, you would be well-advised to listen most closely to the opinions of those who are not selling specific implementations or engaged in lobbying efforts for a particular cause. Matthew, some users do in fact care about the issue of whether spreadsheets will display the same values reliably on different systems.  For those like yourself willing to take risks, the various add-in solutions are great options.  They're free, install in seconds, and give you everything you're asking for. Razvan, I agree that when OpenFormula is an ISO standard we'll enter an era of new possibilities for ODF formula interoperability. Marc, given that you posted your Gnumeric example simultaneously on multiple blogs, and it has already been responded to appropriately on Alex's blog, I'll direct people there for further information: http://adjb.net/post/Notes-on-Document-Conformance-and-Portability-3.aspx

  • Anonymous
    May 11, 2009
    The comment has been removed

  • Anonymous
    May 11, 2009
    The comment has been removed

  • Anonymous
    May 11, 2009
    I have found another interoperable implementation of ODF: EditGrid.com I created a simple spreadsheet and saved as ODF. The formula was stored as: "oooc:=[.B2]+[.C2]+[.D2]" This is coincident with the other main implementations of ODF. The only one ( as far as i know ) non interoperable implementation is Office 2007 SP2 There is a live demo at http://www.editgrid.com/site/try/demo

  • Anonymous
    May 11, 2009
    @JCGA Unfortunately you need a rather deeper understanding of the situation to understand the points I was making.  The usual audience is very astute and well apprised of the facts. Read up a little on it and I'm sure it will all start to make sense to you. Perhaps the English sense of comedy is lost on you as well - sorry about that. If you still have problems, let me know and I'll produce an annotated version, it might take a little while though. Thanks Gareth

  • Anonymous
    May 11, 2009
    JCGA, you miss the point - SP2 produces compliant, confirming to the normative language of ODF 1.1 standard, documents. It does not produce documents that confirm to the informative part of the standard, to Open Formula draft from ODF 1.2 draft and to many current implementations of the above.

  • Anonymous
    May 11, 2009
    @Doug Mahugh: Your comments on normative vs informative are either deliberately misleading, or you and the entire MS team don't understand important normative language that is not just "typical" but both precise and necessary. I'll spell it out for both you, and for your blog readers, so both cases are covered. The term "should" is in fact a term with specific meaning and intention in standards. What it is, is a place-holder for "must", which is intended for new implementers to treat as "must", but which relaxes the "must" aspect so that the standard, when adopted, does not immediately turn previously conforming instances into non-conforming -- a pass a` faire, to use a chess term. Ignoring "should" because it doesn't happen to actually say "must", is considered a particularly egregious and blatant failure to both understand the language of standards, and failure to implement the standard. The section in ODF concerning formulas, may not have any "must" or "shall" terms, but is chock full of the word "should". And the paragraph that includes the sentence that starts "Typically, ..." that has been used as a "straw man", can only be unambiguously and self-consistently be interpreted one way - it is a flawed attempt to use English to describe what would be better said in BNF. It might have better been written as something like, "A Cell-Formula should begin with a namespace prefix, followed by the Formula-Body. The Formula-Body may begin with an =, and consists of the zero or more of the following elements:". However, that it written poorly does not mean that it is written ambiguously. Any other interpretation of the two occurrences of formula, which are the noun phrases "formula" and "formula itself", lead to self-inconsistent results -- if a formula began with a "=", it couldn't start with a namespace prefix, and vice versa. The "should" versus "must" are clearly laid out in http://www.ietf.org/rfc/rfc2119.txt - which is the second "hit" when doing a google search on the terms "standards should must". I'm interested in your comments on these observations. Name Withheld.

  • Anonymous
    May 12, 2009
    The comment has been removed

  • Anonymous
    May 12, 2009
    The comment has been removed

  • Anonymous
    May 12, 2009
    Standardul ODF sau Open Document Format (implementat de OpenOffice, Symphony, Microsoft Office și altele)

  • Anonymous
    May 13, 2009
    @Doug: "It’s interesting to note that we have discussed this very issue at a DII workshop.  Last July, we had a workshop in Redmond [...]" It is also interesting to note that Microsoft is a member of the OASIS ODF Interoperability and Conformance (OIC) TC.  Yourself and Stephen Peront are attending the meetings of this OIC TC. Did you ever submitted this issue (how to have interoperable formulas in ODF 1.1 implementations) to the OIC TC ?  It seems to me a very good place to discuss this subject.  It you never asked the question to the OIC TC, what are the reasons ?

  • Anonymous
    May 13, 2009
    The comment has been removed

  • Anonymous
    May 13, 2009
    Luc – The ODF TC actually is not a very good place to discuss this subject.   The problem that there are no interoperable formulas in ODF 1.1 is already well known to the members of the ODF TC,  and has been known for a long time.  (For examples, follow the links from Stephen McGibbon's post of three years ago: http://notes2self.net/archive/2006/07/12/446.aspx) The ODF TC is working on ODF 1.2 now, and already has plans to address the problem through OpenFormula in ODF 1.2.    So there really was no point in bringing up the question with that group yet again.

  • Anonymous
    May 13, 2009
    Doug, You want to justify a very stupid decision (breaking basic interoperability with all previously written ODF spreadsheets) by pointing that interoperability also fails for other vendors on very a specific, stupid and unuseful scenario ("summing text strings"). Do you really expect to be taken seriously?! Get real, this is worse than the XLS interoperability we have now. On the other hand, how are we surprised? We should expect this. And probably it's not even your decision. Microsoft's stance has clearly changed from "we'll never do that" to "we'll do it and claim it loudly (but sshhh.... late and broken enough so that it doesn't get useful - as was with Mono, Moonlight  and OOXML)" So are you guys shipping a fix for this nonsense? Or will you move forward on spreading non-interoperability FUD over ODF? In your place I'd be embarrassed. But tell us that this was just a bad decision, that you'll fix it quickly and, yes, that we can trust Microsoft's commitment to interoperability.

  • Anonymous
    May 13, 2009
    Ghomem, let me just excerpt myself from my own initial post above: “Some might be tempted to say that formatting a cell as text and then using it in a calculation is dumb.  And I’d agree that there are few people who ever do such a thing intentionally.  But… it’s easy to make mistakes like this.  [This is just one case of many where] ODF implementations do not have predictable formula interoperability.  Our customers expect and require accurate, predictable results, and so do we.”

  • Anonymous
    May 13, 2009
    When I blogged about the release of SP2 with ODF support two weeks ago, I mentioned that I was planning

  • Anonymous
    May 13, 2009
    @Doug: "The ODF TC actually is not a very good place to discuss this subject.   The problem that there are no interoperable formulas in ODF 1.1 is already well known to the members of the ODF TC,  and has been known for a long time." The problem (no formulas in ODF 1.1) is indeed well known, for a long time. So is the solution: use the oooc: prefix and the OOo formula syntax. I maintain that, as Microsoft decided to break the interoperability solution used for a long time by ALL the ODF implementations (including the so called CleverAge add-in), it would have been appropriate to bring the subject to the OIC TC, explain the reason of your choice, and at least reach a common view on the impact of this choice.  This would have been more appropriate than discussing it with Rob Weir during a DII workshop. Microsoft cannot be credible when you participate to a TC to improve future interoperability and at the same time take decisions to break the existing interoperability, without even mentioning it and having a serious discussion about it.

  • Anonymous
    May 14, 2009
    The comment has been removed

  • Anonymous
    May 14, 2009
    Doug Mahugh and a bunch of the standards crew (both in and out of Microsoft) have been having a great

  • Anonymous
    May 16, 2009
    I like the new "hit" you have against OO. I think it's great to see some of these things however Excel isn't perfect.

  • Anonymous
    May 17, 2009
    @dmahugh Are you guys going to fix this problem quickly? Or shall we assume the ODF spreadsheets can't be exchanged with MS Office 2007? Your credibility is at stake.

  • Anonymous
    May 17, 2009
    " Our customers expect and require accurate, predictable results, and so do we.” HA HA HA HA HA HA HA HA HA HA HA HA HA HA HA HA HA HA This is Microsoft.  Creator of the spreadsheet which wouldn't recompute certain lines -- Excel 97, was it?

  • Anonymous
    May 17, 2009
    Here's my take on this as representative of an organization of end users. We need interoperability that does not depend on proprietary formats. This is non-negotiable. ODF is the choice. Do ODF spreadsheets present challenges of the sort you describe? Definitely, and we are eager for the ODF industry to sort this out. As far as we are concerned, Excel 2007 SP2 implements ODF as a proprietary, non-interoperable variant, ie. not ODF-standard in any practical sense. Sure, it would undoubtedly be a lot harder for MS to build in the kind of ODF interoperability that we require and, yes, MS has grounds to complain about the incomplete standard for ODF spreadsheets. When you think about it, it's a rather similar situation to the nightmare that MS imposed on the world with its proprietary pre-OOXML formats for so long, at least from an end user's perspective. Doug, why doesn't MS sell us the product we need rather than making things worse for end users? We need cross-platform interoperability and we can't limit it to one vendor's office suite. With the MS solution, something has to go - there's just no choice. So Excel 2007 SP2 disqualifies itself, even though we'd dearly like for it not to. As end users, we don't care about your implementation problems. You could be working with the ODF industry to expedite an acceptable solution for us, but instead you're making things worse. Get it through your heads - MS-specific ODF completely defeats the purpose of going with ODF and we are definitely going with ODF. Stop waggling your finger at others and become part of the solution instead of the problems.

  • Anonymous
    May 18, 2009
    @End user It is not MS specific ODF. It is actually an ISO standardized spreadsheet language and one that is extremly easy to implement for most spreadsheet applications. Gnumeric and KOffice already can support the MS spreadsheet language in ODF. An application like OOo already contains all the libraries for parsing and interpreting the OOXML spreadsheet language and I bet a Sun hacker already build support for the MS Office spreadsheet language in ODF in about the same time that it took Rob Weir to write about it.

  • Anonymous
    May 18, 2009
    @hAI Yes, Microsoft decides to be incompatible with all previously done work and the world will play catch-up. That's very fair and very honest.

  • Anonymous
    May 18, 2009
    The comment has been removed

  • Anonymous
    May 19, 2009
    ghomem said: "    @dmahugh    Are you guys going to fix this problem quickly? Or shall we assume the ODF spreadsheets can't be exchanged with MS Office 2007?    Your credibility is at stake. " I second ghomem question, when do you plan to fix this? Thanks for your "cooperation"

  • Anonymous
    May 20, 2009
    An excelent take on this issue ( in my POV ) here: http://blogs.zdnet.com/BTL/?p=18317 This is from Jeremy Allison, a man who knows something ;-) about real interoperability ( Samba team )

  • Anonymous
    May 20, 2009
    If the odf-converter plugin which is supported by Microsoft can get it right and be interoperable, why on earth can't Excel? You already have code that does it right - why go out of your way to break compatibility?

  • Anonymous
    May 20, 2009
    The comment has been removed

  • Anonymous
    May 20, 2009
    Maybe we are looking at this wrong. Whether you like or hate Microsoft, MS Office is far more popular than OOo or any of the other variants put together. Microsoft, for whatever reason, has not managed to create a system which allows users (remember us - the guys who actually buy your software) to load, edit and save files from other people using OOo etc. Surely then the solution is for OOo and its variants to produce a system that allows saving files in MS Office formats? I suspect that would throw the problems Microsoft has had right back onto all you critics and you would then be the ones with egg on your faces!

  • Anonymous
    May 21, 2009
    @doug: You state "Our customers expect and require accurate, predictable results, and so do we." But no, you do not. The point you bring up has nothing to do with interoperability problems between OpenOffice and Symphony. It is a general problem that occurs when mixing strings and values, and the very same problem occurs in Microsoft Excel as already pointed out by various people. Please repeat the following simple experiment (I did it using MS Excel 2000 on Windows XP).

  1. Go to the configuration screen of Windows XP, Open "Regional and Language Options", tab "Regional Options", button "Customize". Enter "." as Decimal symbol and "," as digit grouping symbol. Click "Apply".
  2. Load Excel and open a new sheet. Format A1 as text, then enter 1.000 in the cell. Enter 2 in cell A2, and enter =A1+A2 in cell A3. A3 will now show the answer 3, as expected. Save your sheet and close Excel.
  3. Go again to "Regional Options" and set "," as Decimal symbol and "." as digit grouping symbol (switching the two). Click "Apply".
  4. Load Excel and open the sheet again. A3 will still show the value 3. Now enter a different value in A2, for example 0, and leave the cell. Then, enter again the value 2 in the cell (as it was after loading). This will update the formula in A3. As result A3 will show the answer 1002 instead of the previous answer 3. Your issue has nothing to do with ODF itself. It is about parsing strings and values. So please don't blame ODF for it. The issue can possibly be solved by parsing value+string as value.toString()+string (resulting in "1.0002"), instead of value+string.toValue() (resulting in ambiguous results). For example Javascript does this. At least such a solution will make users aware when they accidentally format values as text.
  • Anonymous
    May 21, 2009
    Looks like it may not be getting an answer. Also looks like the odf-converter team didn't have such problems: http://odf-converter.sourceforge.net/newblog/index.php?2009/03/13/28-how-the-openxml-odf-translator-deals-with-formulas

  • Anonymous
    May 21, 2009
    @Magus When the discussion was about OOXML Microsoft and all their friends cared about "preserving the corpus of existing documents". But now you say the critics should shut up, change all their implementations and ignore the existing documents, just because Microsoft did not bother to implement things properly (as they already had on odf-converter). Cool.

  • Anonymous
    May 21, 2009
    Guys, I've looked through the latest batch of comments but don't see any new information or new questions.  From the sheer amount of repetition in recent comments, it's clear that everyone has made their point. Ghomen, the post above answers your questions.  I must say, it's amusing to hear person who has left 12 critical and repetitive comments on this post alone, every one of which I've allowed, make the claim that I'm asking critics to shut up.