EXI: The last binary standard?
A few days ago, on March 10, the W3C published a new standard for transferring XML data, the Efficient XML Interchange standard, which promises a very significant reduction in bandwidth used and an improvement in performance. In this blog post, I'm going to take a shot at providing an overview of the new EXI standard: its history, current status and applications, and, of course, the actual performance benefits.
But first, a bit of background. If you're a developer, you will know XML. If you're a webdesigner, you will know a subform of XML, XHTML or its close relative HTML. If you're Average Joe, you will most likely have heard of XML. XML is basically a standard for representing information in a hierarchical fashion - a Document containing Elements which, in turn, have Attributes or 'nested' Elements with more attributes or contents.
Typically, this data is represented in an XML document, a plaintext set of characters where elements are represented by bits of text surrounded by angle brackets, < and >, forming opening and closing tags. An example (and very simple) XML document represented as plaintext would be:
<document>
  <element attribute="value">
    <nested-element>text contents</nested-element>
  </element>
</document>
The above document can be read and manipulated by man and machine both: on the one hand it's human-readable, thanks to the readable characters used, and on the other it's structured and based on a standard, making it predictable and, as such, readable by computer programs. There are other means of storing an XML document for more efficient reading and altering by computer programs, such as the Document Object Model, but the above is what most people will know best.
There's a major problem with representing XML data this way, though: the huge amount of overhead in terms of size. It takes ten bytes of data just to indicate that the 'document' element starts (using the <document> tag), and eleven more to represent the end of the 'document' element.
This 'overhead' grows even more significant if you consider frequently repeating tags, such as the 'element' tag, where hundreds or thousands of them might be contained in a single XML document. The above XML fragment is 116 bytes, while only 13 bytes of actual data is contained in there (the 'value' attribute and the 'text contents' text node).
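To make the overhead concrete, here's a quick (and admittedly crude) way to measure it yourself with Python's standard library. What counts as 'actual data' is debatable; here I simply count attribute values and non-whitespace text nodes as payload:

```python
import xml.etree.ElementTree as ET

doc = ('<document><element attribute="value">'
       '<nested-element>text contents</nested-element>'
       '</element></document>')

total = len(doc.encode("utf-8"))

# Count only attribute values and non-whitespace text as "payload";
# everything else is markup overhead.
payload = 0
for el in ET.fromstring(doc).iter():
    payload += sum(len(v) for v in el.attrib.values())
    if el.text and el.text.strip():
        payload += len(el.text.strip())

print(total, payload)
```

Depending on whitespace and on exactly what you count, the numbers shift a bit, but the ratio of markup to payload stays lopsided either way.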
This overhead is hardly a problem in trivial applications, but it becomes highly significant when you're dealing with bigger and/or more frequently accessed XML data. Take the Tweakers.net homepage, for example. At the moment, it's about 80 kilobytes of data (just the HTML), which transfers about 7000 characters worth of readable data to the user (an estimate; I just copy/pasted the whole page into a text document to test).
With millions of pageviews per day, the overhead caused by this method of representing XML data becomes very significant - and Tweakers.net is actually just a small player, if you compare it to the truly massive international websites and other applications where XML is used today.
There are methods to greatly reduce some of the overhead in transferring XML, for example GZip, employed by almost all websites. This reduces the actually transferred size of XML by a significant percentage, usually more than 50% for HTML websites, with 80% or more being possible. This is largely dependent on the application, though - small XML documents, such as the example above, could even be larger in size when compressed with GZip. This is in part because GZip is not specifically tailored for XML - it's a general-purpose compression algorithm. Of course, this is hardly an argument against GZip - it works, and it saves a lot of transferred bytes.
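A small sketch of this behavior using Python's gzip module: the tiny example document barely compresses (and may even grow), while a large document full of repetitive markup shrinks dramatically:

```python
import gzip

# The tiny example document from above, as a single line:
tiny = (b'<document><element attribute="value">'
        b'<nested-element>text contents</nested-element>'
        b'</element></document>')

# A larger, repetitive document, as typical HTML tends to be:
big = b'<ul>' + b'<li class="item">some entry</li>' * 500 + b'</ul>'

# GZip adds a fixed header and trailer, so a tiny input may not shrink
# at all, while the repetitive one compresses to a fraction of its size.
print(len(tiny), len(gzip.compress(tiny)))
print(len(big), len(gzip.compress(big)))
```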
It doesn't, however, save on processing power. On the 'generator' side of the transfer of an XML document, the XML still needs to be generated. On the 'receiver' side, for example your browser, the XML document still needs to be converted from a character stream into a Document Object Model, which in turn is transformed into a visual representation of a website or whichever data you're looking at.
GZip doesn't help with this, as it actually adds a bit of overhead in compressing and decompressing the XML document. To the user, this overhead is hardly noticeable with computers being as fast as they are today, but if you take it all together, it's a very significant process.
It didn't take long for people to find out that XML isn't an ideal format in environments where high transfer speeds and low overhead are required. That's why XML never really replaced the so-called binary formats, existing or new - they're far more efficient and have hardly any overhead. XML does offer other advantages, however, so the one doesn't exclude the other.
In any case, on to the main point of this tl;dr post. A few days ago, the World Wide Web Consortium (W3C) published a new standard called Efficient XML Interchange that promises to alleviate both disadvantages of XML: raw size, and processing power needs.
The newly published EXI standard does so by defining a whole new XML document interchange format, one that isn't based on first converting the document to a character stream to be handled by a consumer, but instead passes the information along as a stream of 'events', encoded in a binary format. Take the example above again:
<document>
  <element attribute="value">
    <nested-element>text contents</nested-element>
  </element>
</document>
Based on my current interpretation of the standard (which is superficial - I've only flipped through it quickly), this document transferred through EXI becomes a set of 'events' that tell the client what the document contains, a set of messages that effectively goes something like:
Element 'document' starts.
Element 'element' starts.
Attribute 'attribute' contains 'value'.
Element 'nested-element' starts.
Text-node contains 'text contents'.
Element 'nested-element' stops.
Element 'element' stops.
Element 'document' stops.
This will actually be familiar to (mainly Java) developers who have ever worked with the Simple API for XML (SAX) to parse an XML document. It processes an XML document by providing the SAX parser with an object that handles events, not unlike the events encoded in an EXI stream. The difference, however, is that SAX and other XML parsers initially process a character stream that contains the XML document encoded as a set of <tag>s.
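For illustration, here's roughly what that looks like with the SAX implementation in Python's standard library (xml.sax); the handler's callbacks produce essentially the event list above:

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Record each parse event, mirroring the EXI-style event list."""
    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(f"Element '{name}' starts")
        for key, value in attrs.items():
            self.events.append(f"Attribute '{key}' contains '{value}'")

    def characters(self, content):
        if content.strip():
            self.events.append(f"Text node contains '{content}'")

    def endElement(self, name):
        self.events.append(f"Element '{name}' ends")

handler = EventLogger()
xml.sax.parseString(
    b'<document><element attribute="value">'
    b'<nested-element>text contents</nested-element>'
    b'</element></document>',
    handler,
)
print("\n".join(handler.events))
```

The parser reads the character stream and translates it into these callbacks; with EXI, the stream itself already consists of such events.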
In any case. The EXI standard specifies how information can be passed as a stream of events, thus largely reducing the overhead caused by representing that data as a stream of plaintext characters. It would, in theory, be possible to convert an EXI stream to a human-readable XML document, as the actual information and hierarchy of it doesn't change - therefore, EXI retains the advantage of XML being human-readable.
(Edit: I should look around a bit more before starting to type. There's an EXI Primer written by the W3C last year that provides an easy-to-read and comprehend overview of what EXI is, how it works, what it looks like, etc.)
Another major advantage of EXI is that it's compatible with the XML Information Set, which is effectively what the character representation of an XML document is based upon. This means that existing XML parsers will be able to handle XML data contained in an EXI stream with no adjustments - all that's required is a new input source that triggers the events in the XML parser provided by a developer. This is especially true for SAX, as it's already based on 'events' indicating when an element starts. Replacing the bit of code in that library that reads an XML document encoded as a character stream with a bit of code that reads an EXI binary stream and simply passes the events over to the user-defined handler should be an almost trivial matter.
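To illustrate the idea of such a new input source - and to be clear, the event codes and byte layout below are entirely made up, not the actual EXI encoding (real EXI adds string tables, grammars and much more) - here's a toy sketch in Python that serializes events to a binary stream and then drives SAX-style handler callbacks straight from it:

```python
from io import BytesIO

# Hypothetical event codes -- NOT the real EXI encoding, just a toy.
SE, CH, EE = 0, 1, 2  # start element, character data, end element

def _wstr(out, s):
    b = s.encode("utf-8")
    out.write(bytes([len(b)]))  # length-prefixed UTF-8 string
    out.write(b)

def _rstr(inp):
    return inp.read(inp.read(1)[0]).decode("utf-8")

def encode(events):
    """Serialize (code, ...) event tuples into a binary stream."""
    out = BytesIO()
    for code, *payload in events:
        out.write(bytes([code]))
        if code == SE:
            name, attrs = payload
            _wstr(out, name)
            out.write(bytes([len(attrs)]))
            for k, v in attrs.items():
                _wstr(out, k)
                _wstr(out, v)
        else:  # CH and EE carry a single string
            _wstr(out, payload[0])
    return out.getvalue()

def decode(data, handler):
    """Fire ContentHandler-style callbacks straight from the bytes."""
    inp = BytesIO(data)
    while (code := inp.read(1)):
        if code[0] == SE:
            name = _rstr(inp)
            attrs = {}
            for _ in range(inp.read(1)[0]):
                k = _rstr(inp)
                attrs[k] = _rstr(inp)
            handler.startElement(name, attrs)
        elif code[0] == CH:
            handler.characters(_rstr(inp))
        else:
            handler.endElement(_rstr(inp))

# The example document as an event stream:
stream = encode([
    (SE, 'document', {}),
    (SE, 'element', {'attribute': 'value'}),
    (SE, 'nested-element', {}),
    (CH, 'text contents'),
    (EE, 'nested-element'),
    (EE, 'element'),
    (EE, 'document'),
])

class Collector:
    """A user-defined handler; it never sees the binary stream."""
    def __init__(self):
        self.seen = []
    def startElement(self, name, attrs):
        self.seen.append(('start', name, dict(attrs)))
    def characters(self, text):
        self.seen.append(('chars', text))
    def endElement(self, name):
        self.seen.append(('end', name))

c = Collector()
decode(stream, c)
```

A user-defined handler doesn't notice the difference: the same callbacks fire whether the events come from a character-stream parser or from a binary stream. (The real size and speed gains of EXI come from the string tables and grammars this toy omits, which make repeated names nearly free.)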
The EXI standard itself doesn't specify how much it'd actually save compared to the classic form of representing XML. However, the W3C did do performance tests, and very thorough ones at that. In the EXI evaluation document, they first compare the 'compactness' of a set of XML documents transferred as plain old XML text with the same XML text compressed with GZip and transferred using EXI.
As with GZip, the number of bytes needed to represent the document is in some cases reduced by as much as 99%, and frequently by more than 80%, with the least compressible document still being reduced by almost 40%. However, as you can see, representing the data as EXI is far more consistent in terms of compactness than GZip. This is in part because GZip works best on large documents with 'repetitive' words - as the document indicates, GZip has trouble with XML documents containing many short messages, such as geolocations and sensor readings.
Compactness: EXI wins.
As EXI reduces the overhead of having to read an XML character stream and having a parser try to make sense of it - often replicating what EXI already does - it's expected that EXI of course improves XML processing speeds too. And this is correct, according to the processing speed benchmarks, comparing plain old XML parsing with parsing using EXI.
In all but one of the test cases, EXI was faster, from 75% faster to 2500% faster in one specific case, but usually about 6.7 times faster.
The extreme speed boost was gained in a case where an XML document contained repeating structures with elements and attributes from several namespaces, which in classic XML handlers causes a lot of overhead - but not in EXI, which largely eliminates namespace overhead.
To my own surprise, the speed boost is less extreme when compared to GZip-compressed XML documents. I'm not sure why this is; it might be that throughput of the bytes is higher when GZip-compressed, and that the actual size of the document has more impact on performance than its actual contents do. But I guess people who know more about GZip and the internals of an XML parser have the answers.
Naturally, EXI should also be faster when encoding a document. Encoding a document effectively means converting some data into an XML document, then converting that into a character stream and sending it. If the character-stream conversion step can be skipped, a speed boost is obviously to be expected.
The differences here are not as extreme, but still significant. In some cases, the EXI encoder they used (Efficient XML by AgileDelta) is actually less efficient than plain XML encoding, but in the vast majority of cases it's faster - about 2.4 times faster. (I use the median speed increase here; the average of 6.0 is raised disproportionately by the 21-times-faster extreme.)
EXI is a bit faster when compared to compressed XML document generation, probably because it doesn't have the overhead of having to compress the result. Another important point here is that GZip compression actually has to wait until the entire XML document is generated (the plain-text, intermediate form), to apply effective compression. This increases memory usage (or, keeps certain data in memory for longer), and from the client's side, increases time it has to wait until it starts to receive data. EXI doesn't have this disadvantage, as it can start sending bytes (and the client can start processing them) right away.
EXI is awesome. It might finally solve the downsides of using XML, those of high (data) overhead and suboptimal processing speed, which is especially important if XML has been chosen as the format to use in high-volume environments. The W3C isn't shy about naming big, important-sounding use cases such as the military and real-time trading systems - but those are exactly the environments that produce high volumes of data that need to be processed quickly. I'm not actually sure whether those environments would have used XML with the downsides it has, but I guess that where they do, the upsides outweighed the downsides.
Another upside for EXI is of course the environment, as the W3C smugly announces as one of the main advantages of EXI over plain old XML transfer systems. Each byte saved, each processing instruction saved, is cheaper on energy, especially in high-volume environments.
Implementing EXI should also be relatively painless, as opposed to converting an XML information sharing architecture to a binary format, since existing parsers can handle EXI with only a few adjustments.
I think this is actually a big step. Attempts were made earlier to convert XML to a more efficiently transferable structure, but this one seems to be the best to date. It should be, of course, after years of development by big stakeholders and smart people. Because it's defined as an official standard, it should be possible to implement it in existing architectures rather quickly.
Applying it to the web would most likely require some changes in webservers. It would be possible to install a webserver modification that converts the HTML code spewed out by the webserver into an EXI datastream, which would already provide a performance boost, but it'd have the same downside as GZip compression - the 'intermediate' format would first have to be generated and processed, i.e. plaintext HTML. The webserver would have to be altered at a more fundamental level to return the data as an EXI stream.
(Web) applications too would have to be changed. Most webapps currently spew out plain old HTML, using templates to put their data into HTML to be output. If EXI is to be applied to the web, templates would either have to be completely abolished in favor of generating the output as an abstract XML document (the DOM comes to mind again), or would need to be changed so they read the template and output it as EXI. Of course, this is running in circles, as a system like that is effectively an XML parser that outputs the same data as EXI. I'm not sure how effectively EXI could be applied to the web.
According to the evaluation summary, however, EXI is transport independent, meaning it can be used "over TCP, UDP, HTTP and various wireless and satellite transports."
In any case. I read the FP article on this and was intrigued by it and the responses, read part of the specification and related pages, and decided I'd just dedicate a post on the subject. As a disclaimer, I'm no expert, I haven't thoroughly read the W3C documents, have never written my own XML parser, and am not very deep into the whole XML story, limiting myself to relatively simple applications of XML.
However, I do know enough to realize that this is a significant development, and might possibly be the most important step in the development of XML since it was conceived. With EXI, XML documents can now be transferred quickly and with low overhead in both transfer size and processing speed, finally making it a worthy adversary to binary formats which, despite their obvious advantage, still have the disadvantage of often being inflexible and difficult to process.
As a 'public' format, binary formats never really got off the ground, as processing them isn't nearly as easy as parsing XML is (or can be). No surprise that XML has long been the format for public (web)services and data. With EXI, these XML providers can now (theoretically) be used with much greater efficiency, making EXI yet another good step in a long line of performance improvements in the IT world.
Actually, EXI isn't the first attempt at creating a binary means of transmitting XML data. Abstract Syntax Notation One (ASN.1) is a standard for describing binary data, and it has a so-called transfer syntax (XER) which can be used to transfer XML data. In the evaluation, the W3C working group on EXI did compare the performance of EXI versus ASN.1. As the graph indicates, ASN.1 didn't achieve much in many cases, or provided inferior size reduction compared to EXI.
A format more specifically aimed at XML transfer, and in its definition largely comparable to EXI, is Fast Infoset, which is built on top of ASN.1. I don't see anything about it in the evaluation, but since it's built on top of ASN.1, I guess it's safe to assume the results would be similar.
"The above XML fragment is 116 bytes, while only 13 bytes of actual data is contained in there (the 'value' attribute and the 'text contents' text node)."

As far as I can count, the combined length of 'value' and 'text contents' is 18 characters.
For the rest, great article. Definitely worth reading.
Still, good call.
@EdwinG: Sorry, I tried to manually count bytes, guess something went wrong there. The example was a trivial one, though, I probably shouldn't have gone into so much detail there.
Thanks for the compliment.
But I don't think one excludes the other. If you have an application in which JSON is good enough, it should be easy enough to convert it to the more efficient BSON. If you've got a more 'official' application (note: 'official' definition omitted on purpose), I guess XML is more the way to go. But the choice between XML and JSON, Protocol Buffers, BSON, or a homemade binary format, should be done on a per-project basis based on that project's performance requirements.
@Skinkie: You're right, and this is also why I explicitly mentioned DOM and SAX, the former being an in-memory representation of the XML infoset, the latter showing that this method of processing XML isn't anything new. Why no successful binary format for XML data was invented before now eludes me. I guess, as I added in an edit about the ASN.1 protocol, that they tried but were unsuccessful in that attempt.
[Comment edited on Sunday 13 March 2011 20:43]
"If you're a webdesigner, you will know a subform of XML, XHTML or its close relative HTML."

Not quite true; they're all derivatives of SGML.
"There's other means of storing an XML document for more efficient reading and altering by computer programs, such as the Document Object Model, but the above is what most people will know best."

The DOM is "built up" from the information contained in the document; it is a representation of (parts of) the document and gives you handles to interact with elements in the document. The DOM is not a means of storing the elements. Or, as Wikipedia states it:

"The Document Object Model (DOM) is a cross-platform and language-independent convention for representing and interacting with objects in HTML, XHTML and XML documents."
"The above XML fragment is 116 bytes, while only 13 bytes of actual data is contained in there (the 'value' attribute and the 'text contents' text node)."

This is typically referred to as "payload". There's the metadata (describing the data) and the actual data itself.
"...the XML document still needs to be converted from a character stream into a Document Object Model, which in turn is turned into a visual representation of a website or whichever data you're looking at."

If you're 'talking' to, say, a browser, then yes - it's gonna need the DOM anyway. But a lot of applications do not have a need at all for a DOM; that's where SAX etc. comes into play (which you mention later on).
"Applying it to the web would most likely require some changes in webservers. It would be possible to install a webserver modification that converts the HTML code spewed out by the webserver into an EXI datastream, which would already provide a performance boost, but it'd have the same downside as GZip compression - the 'intermediate' format would first have to be generated and processed, i.e. plaintext HTML."

I don't see the point in this; for the web, the standards we have are fine. The only "web" place I can think of (consumer-wise) where EXI would have advantages would be mobile devices etc., because of their limited bandwidth, and devices like PLCs that are limited in either processing power or memory. EXI will definitely have its place, but (not having read the specs yet) I do feel it is intended for either "limited devices" or high-volume data exchange like, as said, stock exchanges etc.; not so much for mainstream websites. It would be possible, I guess, to "move" gently into an EXI era, but my guess is that it wouldn't be worth the effort (envisioning 'transition' decades etc... nah...)
"According to the evaluation summary, however, EXI is transport independant, meaning it can be used 'over TCP, UDP, HTTP and various wireless and satellite transports.'"

So is (X)HTML, OData, JSON and you name it.
"...finally making it a worthy adversary to binary formats which, despite their obvious advantage, still have the disadvantage of often being inflexible and difficult to process. As a 'public' format, binary formats never really got off the ground, as processing them isn't nearly as easy as parsing XML is (or can be)."

I think you should take a step back and take a good look at the formats you're currently using. All image formats (Jpeg, Gif, Png, Bmp, Ico, you name it), all video formats (Avi, Mpeg, H264, WebM, etc.), all executables and so on and so on are binary formats. It is also not true that they're inflexible (they can be as flexible as SGML, or even more, I guess), and they're not hard to read at all (if properly documented etc.). Binary actually takes the whole "parse" step out; most times you can read a crapload of bytes and just throw them in a struct or whatever - there's your parser: 0 overhead. Also, a lot of document formats have only 'lately' been phasing into the "readable" realms; these also used to be binary and have always worked fine. That they weren't very open due to MS (I'll touch on the subject because I know you people can't let it pass by without mentioning MS) has nothing to do with a "flaw" of being binary. It just wasn't very open (and even that has changed lately, as MS is opening up its documentation about these legacy formats), and it also wasn't "just MS" who was guilty of this.
I do applaud EXI and the W3C's efforts (and your blogpost), but I do not think it's "the best thing since sliced bread" (yet).
[Comment edited on Monday 14 March 2011 00:41]