Ever since the OpenDocument standard was ratified by OASIS in May 2005, it’s been gathering steam. In May 2006, ISO formally approved a draft of OpenDocument as standard ISO/IEC 26300. KOffice has completely switched to OpenDocument as its native format (OpenOffice.org did this long before), and more implementations are being announced each month. Massachusetts continues to plan to switch to it in spite of some nasty politics, and all evidence suggests increasing use worldwide. Wikipedia’s article on OpenDocument gives more information about it, including OpenDocument adoption.
But is OpenDocument really an open standard, or not? For example, can anyone implement it? Was its development process completely controlled by a single party (which would not be open), or is there evidence that it’s a consensus result by many? It’s generally accepted that OpenDocument is an open standard, but recently I’ve been told that some people are claiming otherwise. So let’s figure out what the criteria are for an open standard, and then see if OpenDocument meets those criteria.
There’s no single definition of the term “open standard”. That’s true for most words and phrases, actually. But lots of documents hint at what it means, for example:
Let’s first look at the two definitions of “open standard” that seem to be the most widely used. The first is by Bruce Perens; the second is by Ken Krechmer (Fellow of the International Center for Standards Research). These two are so widely used that when I did a Google search on “open standards” these were the second and fourth results respectively (the first and third were OASIS and the W3C, two standards bodies that create open standards). We’ll then look at the European Commission’s definition of open standards, which is a formally approved definition of the term (and one that European governments use).
Then, after we’ve looked at these three definitions, we’ll create a merged definition that includes all of their requirements (from all three sources). That way, if the specification meets this merged set of requirements, we can be very confident that we have an open standard; such a specification would meet all three definitions.
A very popular definition of the term “open standards” -- according to Google the most popular -- is Bruce Perens’ “Open Standards: Principles and Practice”. You’re best off reading the actual paper for its full content, of course. Let me summarize it by quoting its list of principles that it states a specification must meet to be an open standard:
Another popular definition is the set of requirements for open standards created by Ken Krechmer, Fellow of the International Center for Standards Research (University of Colorado). He’s published several versions; here I’ll summarize the February 7, 2005 version of “Open Standards Requirements”. He looked at standards from the viewpoint of recognized standards-setting organizations, implementors, and users, and tried to find some middle ground merging their desires. He claims that an open standard must meet the following requirements:
Clearly, these definitions have a lot in common. Ken Krechmer wrote his paper after Perens, and compares his list to Perens’. Krechmer maps each of Perens’ 6 points to his own list of ten as follows:
Perens | Krechmer |
---|---|
Availability | Open Documents |
Maximum end-user choice | Open Access |
No royalty | Open IPR |
No discrimination | Open Meeting, Consensus and Due Process |
Ability to create extension or subset | Open Interface |
Ability to prevent predatory practices | Open Change |
However, Krechmer’s list has a very serious failure: it omits one of the most important factors of all, the ability of anyone to implement the standard. The whole point of open standards is to allow anyone to implement the standard, and to allow any user to have unfettered selection and switching between many implementations. Krechmer’s list notes the importance of patents and copyright (“IPR”), but his definition allows “open standards” to forbid competitors from implementing the standard. This is a fundamental flaw in his definition; defining “open standard” as “a standard that some competitors are forbidden from implementing” is nonsense, and conflicts with most other sources. Perens’ definition explicitly forbids this, as does the Valoris report, which required as a minimum that the format “may be implemented in programs without restrictions, royalty-free, and with no legal bindings.” Krechmer’s definition also conflicts with the European Commission’s definition of open standards, which also required royalty-free use (we’ll get to that definition in a moment).
The most economically obvious example of this conflict is open source software (OSS). Today, in a vast number of software markets, the dominant or #2 program is OSS, including web servers, web browsers, mail servers, and DNS servers. Yet OSS programs are legally forbidden from using royalty-bearing patented works, so specifications requiring royalty-bearing patents, or imposing other legal restrictions that prevent OSS or proprietary implementations, are obviously not open standards. A standard cannot be “open” if it is illegal for the dominant or #2 supplier to implement it.
This conflict between patents and open standards only makes sense when you think about it. The purpose of patents is to prevent competition, while the purpose of open standards is to enable open competition. The purposes of patents and open standards are fundamentally in conflict. Open standards are not the same as open source software; you can choose open standards and use only proprietary software. But selecting open standards lets you choose between implementations, including OSS, and lets you switch to another implementation later.
The European Commission (EC) has defined the term “open standards” as part of the final version 1.0 of the European Interoperability Framework. Newsforge ran a short article about it. The EC declared that “to attain interoperability in the context of pan-European eGovernment services, guidance needs to focus on open standards” -- in other words, the EC views the use of open standards as a significant policy issue. They define the following as “the minimal characteristics that a specification and its attendant documents must have in order to be considered an open standard”, and I quote them here:
The document also suggests that open source software (OSS) complements open standards. It says: “Open Source Software (OSS) tends to use and help define open standards and publicly available specifications. OSS products are, by their nature, publicly available specifications, and the availability of their source code promotes open, democratic debate around the specifications, making them both more robust and interoperable. As such, OSS corresponds to the objectives of this Framework and should be assessed and considered favourably alongside proprietary alternatives.” Both the definition and explanatory text make it clear that the intent was that any open standard must be implementable by both OSS and proprietary programs, especially given the requirements for royalty-free use and lack of constraints on re-use.
So let’s use a definition of “open standard” that merges the best of each, so that a specification must meet all of these definitions to qualify. Comparing Krechmer’s list to Perens’, Perens’ list is shorter, clearer, and doesn’t have the serious defect of forbidding open competition. We’ll then add the two points of “One World” and “Ongoing support” stated by Krechmer as important issues.
Most of the EC’s requirements also map well to Perens’, except that the requirement for a free or nominal-cost specification isn’t explicit:
Perens | European Commission |
---|---|
No discrimination (though doesn’t explicitly require not-for-profit) | The standard is adopted and will be maintained by a not-for-profit organisation, and its ongoing development occurs on the basis of an open decision-making procedure available to all interested parties (consensus or majority decision etc.). |
(No match) | The standard has been published and the standard specification document is available either freely or at a nominal charge. It must be permissible to all to copy, distribute and use it for no fee or at a nominal fee. |
No Royalty | The intellectual property -- i.e. patents possibly present -- of (parts of) the standard is made irrevocably available on a royalty-free basis. |
Availability | There are no constraints on the re-use of the standard. |
The result is a slightly stricter definition of “open standard” than any of the three definitions by themselves. Thus, if any specification meets this merged definition, then it’s clearly an open standard.
Here’s a definition that merges the points of these various widely-cited documents:
An open standard is a specification that enables users to freely choose and switch between suppliers, creating a free and open competition between suppliers. To accomplish this, an open standard must have the following properties:
So, is OpenDocument an open standard? Let’s walk through the list of requirements.
In short, we get “yes” answers for all of these points. All but one of them are trivially answered as “yes”. One point, however, is harder to measure - the “No discrimination” point. This isn’t because there’s a problem with OpenDocument on this point; the challenge is that “no discrimination” is harder to measure for any standard. So let’s drill into the “no discrimination” point to see if OpenDocument meets this requirement -- if it does, then it is clearly and unambiguously an open standard.
Perens requires that open standards have “no discrimination”, that is, that “Open Standards and the organizations that administer them do not favor one implementor over another for any reason other than the technical standards compliance of a vendor’s implementation. Certification organizations must provide a path for low and zero-cost implementations to be validated, but may also provide enhanced certification services.” There’s no certification issue, so we don’t have to deal with that.
The European Commission had an explicit requirement that the administering organization must be a not-for-profit, which is easy to show in this case: OASIS is a non-profit! But there’s more to preventing discrimination than simply creating a specification via a not-for-profit organization.
What about the whole first part of Perens’ requirements?
Krechmer maps this single requirement to three requirements:
We can deal with “due process” easily enough. OpenDocument was developed in OASIS, which has clear balloting and appeals processes, so that’s clearly met.
The “open meeting” requirement that “all may participate in the standards development” is a little more interesting, but it seems to be met, too:
But that leaves us a mixture of requirements: “do not favor one implementor over another,” and “Consensus - all interests are discussed and agreement found, no domination.” If there’s a single vendor who controls all real decisions, then clearly we have a problem. Thankfully, I think we have good evidence that there wasn’t domination in the case of OpenDocument. If there was no domination, then OpenDocument is an open standard without question. So let’s look at the evidence, shall we?
How can you tell if there’s domination by a vendor in a standards body? There are several signs that, if present, give strong evidence of vendor domination. For example, vendor domination is very likely if the rules or processes controlling the standard’s development strongly limit the range of technical changes, forbid changes that would affect only one particular vendor, or give one particular vendor a sole veto power. The OpenDocument development process had no such problem; the rules in place allowed anyone to propose changes, even if they forced any or all vendors to change their implementations. But there can be unstated rules that effectively limit others’ participation, even if there’s no obvious written rule enforcing it.
You could also check to see if other implementors involved in the process are complaining about the process locking them out, though that isn’t always a valid indicator. In this case, there seems to be no such problem. IBM’s Bob Sutor reports that, “IBM and Sun are working together happily and effectively on the OpenDocument Format. I think we’ve made a terrific amount of progress in the last year and that’s because of the broad cooperation by the community.” In fact, Andy Updegrove reports that it’s central to Sun and IBM’s strategy “to have many applications that support ODF. Remember - it’s a good thing, and not a bad thing, for there to be many different implementations - both proprietary and open source - so long as they all support ODF. That’s one of the big reasons why ODF matters - to have multiple choices (and not just one - Office), each with its own independently valuable features.”
What we need is direct evidence that there was no domination in the development of OpenDocument.
One easy way is to see if there’s only one party making essentially all the changes, or if in fact others are making proposals that cause technical changes to the specification that affect implementors. Particularly telling are changes that cause all implementors to make significant changes to their products. If all implementors are changing their product to meet the specification, then clearly no one implementor is calling all the shots.
So let’s start by looking at who proposed the original base document, and who proposed changes that were accepted. The original base document was contributed by Sun and the OpenOffice.org group, so clearly they were involved. Their base document was based on actual experience of using the format as their primary format, which is absolutely perfect... here we have a base document based on actual experience.
But did Sun and OpenOffice.org control everything, or were changes made to the specification by others? Sun and OpenOffice.org did contribute the original document, as well as some later changes, but I asked a number of TC members and they easily showed that many other organizations made substantive changes to the specification. In some cases it’s hard to tell who was the proposer or proponent, but there’s more than enough evidence to show that many others were involved. Even in the cases where I have not identified the proposer or proponent, it’s obvious that changes were made that caused implementors to change their implementations.
Below is a long list of examples of the many changes made to the original base document contributed by Sun/OpenOffice.org on the way to its becoming OpenDocument. You don’t need to read this information in detail; the very point is that it’s a long set. Anyway, here’s a table showing some of the changes:
Change | Proposer/Proponent |
---|---|
Modified to allow multiple metadata fields. The original specification did not allow lists, e.g., there could only be one author for a document. At the first face-to-face meeting, the proposer strongly urged that this be changed, and the first OASIS draft included this change. | Patrick Durusau (Society of Biblical Literature) |
Added SVG to support vector graphics using an already-existing standard, and worked to resolve issues involving SVG use. | Paul Langille (Corel) |
Set of requirements for business charts. | Paul Langille (Corel) |
Embedded support for a subset of the XForms standard was added. XForms add the capabilities typically desired for custom schemas, which were requested by Europe, but without the horrific interoperability problems that uncontrolled custom schemas can cause. Gary Edwards reports, “It turns out that XForms is an elegant solution to the ‘custom defined schema’ problem, able to address both the “collaborate with yourself” ... model, and, the infinitely more important shared business process schema model. The binding model in XForms is extraordinary... we were quite unaware of this potential [at first]... [but once we began to understand it] oh what a treasure we found.” In particular, its approach manages to be portable among many different systems, and not tied to any one. XForms is also a standard in its own right, which is a good thing; it’s generally a very good sign if a standard tries to reuse existing standards instead of recreating its own incompatible components. | European Commission requested support for “custom schemas”; Daniel Vogelheim (Sun) recommended adding XForms. |
Attribute fo:margin was added; this improved margin handling. | David Faure (KDE) |
Numbered paragraphs/headings without number and the text:numbered-paragraph element (an alternative to <text:list>) added more flexibility, for example, numbered-paragraph is better suited for independently numbered paragraphs. | David Faure (KDE) |
Table templates were added; this makes it easier to control and modify the appearance of tables. | David Faure (KDE) |
A “sequence of page styles” and support for copy-frames was added, a new wrapping mode was added for graphics, and a desktop publishing mode was added (using the draw:page element). All of this improved support for desktop-publishing-based layout. | David Faure (KDE) |
Hyphenation became a boolean character property, for more control over hyphenation. | David Faure (KDE) |
Diagonal lines in table cells were added, providing another helpful option. | David Faure (KDE) |
Added a line style for the footnote separator. | David Faure (KDE) |
Added a number:denominator for fractions with a fixed denominator. | David Faure (KDE) |
Added a new date formatting option (number:month number:possessive-form="true"). | David Faure (KDE) |
Added draw:regular-polygon, improving the shape-drawing capabilities. | David Faure (KDE) |
Added more document statistics (sentence-count and syllable-count). | David Faure (KDE) |
The original specification’s ordered-list and unordered-lists were replaced by a single list element, with ordered/unordered information conveyed entirely through styles. This made the specification simpler, and also made it easier to switch a list from one form to another. | - |
A per-paragraph list wrapper element | Phil Boutros (Stellent) |
Support for the SMIL standard was added as well, even though Sun and OpenOffice.org’s implementations didn’t support it. These aren’t the sort of things you add if you just wanted a rubber stamp. | - |
OpenDocument changed the older OpenOffice.org format in how style:properties were handled. Originally there was a single properties element, which contained a mix of all style properties. OpenDocument has properties per style family (paragraph-properties, etc.), which allows for a cleaner separation of properties and better handles dependencies within style families | Proposed by Phil Boutros (Stellent) |
A style:display-name attribute was added and a rule that style names had to be XML tokens was added. | Supporters included Phil Boutros (Stellent) and Paul Grosso (ArborText). |
The structure of documents was changed so that the type of document (word processing, spreadsheet, etc.) could be trivially determined without looking at the filename ending. This wasn’t in the original base document, and required a structural change, but it was added. | Phil Boutros (Stellent) recommended this. |
The original format had only footnotes and endnotes; this was generalized to notes with a general note-class value, which makes it much easier to support additional types of notes. | - |
The original format had several different frame-like elements (e.g. images or text-boxes) that shared certain attributes. This was generalized to support a single draw:frame element which would then contain the actual frame content (textbox, image, etc.) plus an optional replacement image. This eliminated unnecessary complexity (by eliminating duplication) and would make it much simpler to implement consistent user interfaces. | - |
Frames were changed so that they could list a sequence of different format options (e.g., a vector drawing and a bitmapped drawing) so the viewer/processor can choose the ‘best’ format that it can support. | - |
OpenDocument formulas now use a “namespace prefix” to identify the formula language used. This makes it much less likely that the wrong meaning would be applied to a given formula. | - |
Lots of other small changes. The style:default-outline-level attribute for heading styles was added, making it easier for users to use templates. The original style:properties was refined into style:paragraph-properties, style:text-properties, and so on. Also, page-master was renamed to page-layout for clarity (and to avoid mixing with master-page). | - (David Faure thinks that he commented that the name page-master was confusing, and that Michael Brauer (Sun) then offered to rename it.) |
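To make one of the rows above concrete: the table notes that the type of a document can be determined without looking at the filename ending. The table does not say which mechanism provides this, so the following is only a minimal sketch of one mechanism found in published OpenDocument packages, namely a ZIP entry named `mimetype` whose content is the document's media type. Whether this is exactly the change the row describes is an assumption. The helper names `odf_document_type` and `make_demo_package` are hypothetical, and the demo package is not a valid document, just enough to exercise the lookup.

```python
import io
import zipfile

# Media types for OpenDocument files share this prefix, e.g.
# "application/vnd.oasis.opendocument.text".
ODF_PREFIX = "application/vnd.oasis.opendocument."


def odf_document_type(package):
    """Return the ODF document kind (e.g. 'text', 'spreadsheet'), or None.

    `package` is a path or a file-like object holding a ZIP-based package.
    Only the 'mimetype' entry is read; the file name is never consulted.
    """
    with zipfile.ZipFile(package) as zf:
        try:
            mimetype = zf.read("mimetype").decode("ascii").strip()
        except KeyError:
            return None  # the package has no 'mimetype' entry
    if not mimetype.startswith(ODF_PREFIX):
        return None  # some other kind of ZIP-based file
    return mimetype[len(ODF_PREFIX):]


def make_demo_package(mimetype):
    """Build an in-memory ZIP holding only a 'mimetype' entry (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("mimetype", mimetype)
    buf.seek(0)
    return buf


demo = make_demo_package("application/vnd.oasis.opendocument.spreadsheet")
print(odf_document_type(demo))  # prints: spreadsheet
```

A real tool would also want to handle files that are not ZIP archives at all (`zipfile.BadZipFile`) and might cross-check the document content itself; the sketch leaves both out.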
I am sure that there are many other changes not listed above, but I think I have enough listed here to make the point: this specification was a work of many, not a specification controlled by any one vendor.
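The "is one party making essentially all the changes?" check can be sketched as a simple tally. The snippet below is a rough illustration, not a rigorous measure: the attributions are transcribed by hand from the Proposer/Proponent column of the table above, rows marked "-" are left out (including the last row, where the attribution is only a recollection), and a row naming two organizations counts once for each. The variable names are my own.

```python
from collections import Counter

# One inner list per attributed table row, naming the organization(s)
# credited in the Proposer/Proponent column. Unattributed rows are omitted.
attributed_rows = (
    [["Society of Biblical Literature"]]   # multiple metadata fields
    + [["Corel"]] * 2                      # SVG; business chart requirements
    + [["European Commission", "Sun"]]     # custom schemas request; XForms
    + [["KDE"]] * 11                       # fo:margin through document statistics
    + [["Stellent"]] * 2                   # list wrapper; per-family properties
    + [["Stellent", "ArborText"]]          # style:display-name
    + [["Stellent"]]                       # document type without filename ending
)

# Count how many attributed rows credit each organization.
counts = Counter(org for row in attributed_rows for org in row)

for org, n in counts.most_common():
    print(f"{org}: {n}")
print("distinct organizations:", len(counts))
```

The tally finds seven distinct organizations credited with changes, and that is on top of the base document contributed by Sun and OpenOffice.org. Counting rows says nothing about how significant each change was, so this supports the argument only in the coarse sense that attributed changes came from several independent parties.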
Various other changes were made to the document to improve it as a document, too. A big change was that the schema definition language changed to RELAX-NG, a standard specification language which is more powerful than the older DTD language yet is very easy to understand. One advantage of this additional power is that automated validation tests of OpenDocument files using the specification can be much more rigorous. It’s also more likely to result in interoperability; RELAX-NG can be much more specific about the permitted values in constructs, eliminating many potential misunderstandings by implementors.
Clearly, there were many changes made after the initial submission of the base document, due to interaction of all the members. The first draft of the original specification was about 108 pages and its schema was 69K. The final version is about 723 pages and its schema is 529K. This growth to a full standard happened through careful review by a large number of experts from different backgrounds.
A risk of making many changes to a specification is that the result may be too hard to implement. The OpenDocument process avoided this problem by having implementors actually implement the changes as they went, and in particular used multiple open source reference implementations as proving grounds. One participant commented, “it was clear that the proprietary [vendors] had gone into the source code or at the least downloaded [OpenOffice.org], and were studying the implementation and methods. ... Explanations, rationale, and techniques were exchanged in ways that would be impossible if not for a common reference application and code base everyone had access to.”
I stated above that the most important measure was to see if there were changes made from many different sources, and I think I’ve proven that very conclusively. But for some additional evidence, let’s look at the participants themselves -- do we see the multiple implementors, multiple users, and diversity of views that we’d expect to see from a standard not dominated by any one organization?
Once again, I think the answer is “yes”. The OASIS website gives lots of details; Wikipedia summarizes the people involved in creating the OpenDocument standard.
First, a quick background. Version 1.0 of the OpenDocument specification was developed after lengthy development and discussion by multiple organizations. The first official OASIS meeting to discuss the standard was 2002-12-16; OASIS approved OpenDocument as an OASIS standard on 2005-05-01. This is about two and a half years of hard work; it’s not easy to create a good specification.
The standardization process included the developers of many office suites or related document systems, including (in alphabetical order):
That’s a lot of implementors, some of whom are direct competitors, and they represent a very broad range of concerns. That’s a very good sign, when competing implementors work together on a standard.
Another good sign is that there were users with specific issues who were involved:
My personal favorite is the “Society of Biblical Literature”, because they’re so unexpected -- I would never have thought to invite them! Yet this group worries about dealing with large multilingual documents in any living language as well as many long-dead ones, and they worry about long-term retrieval issues in terms of millennia. The National Archives of Australia was represented, too. If their needs are met for internationalization and long-term retrieval, then my needs will be met too.
Another way to show that they really had diversity is to see the different goals of different groups all getting met. Michael Brauer, chair, insists that the only objective of the group was to support “Desktop Office Suite work”. Others, such as Gary Edwards, believe OpenDocument really covers three areas:
Gary Edwards has claimed that Boeing was actually primarily interested in the second area (SOAs), for example. Interestingly enough, there’s no need to declare which set of goals is “right”. Different organizations joined the group because they had different goals, unsurprisingly, yet in the end all believed their (different) goals had been achieved. That is very good news! A standard that can meet the different goals of different organizations tends to be a good standard as long as it’s implementable... and since it’s been implemented, there’s no question there.
Also, the fact that Boeing wanted the specification to be so good that it could be a “universal transformation layer” is excellent evidence of OpenDocument’s capabilities. To fulfill this role, OpenDocument has to be so expressive that it’s able to capture the information from many different document formats, not just Microsoft Office or any other single format. The result is a more general and flexible standard.
Although it’s not part of the definition of “open standard”, I can’t help but be impressed by the expertise of many of the participants. Tom Magliery (Corel/XMetal) was on the original W3C XML 1.0 working group (as well as the original NCSA Mosaic browser team). Phil Boutros (Stellent) is an expert on Microsoft’s file formats. Paul Grosso (founder, ArborText) was chairman of the XSL-FO effort. Doug Alberg (Boeing) understood well the needs of enterprise publication and content management software systems. Patrick Durusau is a well-known SGML expert. And so on. This is why it’s a good idea to create standards through an open process -- by gathering the experts together, in a way that lets them truly fix problems they find, a group of experts can develop a very good result.
Since this original document was developed, the Digital Standards Organization (Digistan.org) has stood up and developed a much simpler definition for open standard (aka "free and open standard"): "a published specification that is immune to vendor capture at all stages in its life-cycle". More specifically, they define an open standard as one meeting the following criteria:
Note that this specifically makes clear that it must be available at no cost, and that it must be implementable irrevocably on a royalty-free basis. Although I didn't specifically analyze OpenDocument against this particular definition, it should be clear that OpenDocument meets it quite handily.
Indeed, the Software Freedom Law Center published an "OpenDocument Opinion Letter", which analyzed the various OASIS and Sun license terms and found that they were fine. In particular, "Under the relevant OASIS patent policy, all Essential Claims held by OASIS Technical Committee Obligated Members are available to all implementors of ODF on terms compatible with free and open source software licenses [and] Sun’s license terms for access to its Essential Claims are fully compatible with free and open source software licensing." This is in marked contrast to its report on OOXML; Software Freedom Law Center's Analysis of Microsoft's Open Specification Promise says that the Microsoft Promise Provides No Assurance for Developers.
Without doubt, OpenDocument is an open standard. It meets every requirement of a rigorous (merged) definition of “open standard,” with lots of evidence. In particular, there is significant evidence that it was not controlled by any single vendor.
And that’s a good thing. When I send a pure text file, nobody asks “did you create that using WordPad, vim, or emacs?” When I send a bitmapped PNG file, nobody asks “did you create that using the GIMP or Paint Shop Pro?” Why? Because it doesn’t matter. Data formats that are fully open standards can be implemented and understood by all. By using them, I’m free to use whatever tool I want.
Governments have started to realize how important open standards are:
In short, open standards create freedom from control by another. That’s a freedom we should all experience.
My thanks to the many people who helped me find the information in this paper, including Daniel Carrera, Patrick Durusau, Gary Edwards, J. David Eisenberg, David Faure, and Daniel Vogelheim.
This article is released under the Creative Commons “Attribution-NonCommercial 2.0” license. You can find this article on Groklaw; this article is also available on David A. Wheeler’s web site.
See also Rob Weir's "The Recipe for Open Standards (and Why ISO Can’t Cook)" (September 9, 2010).