

I've always complained about XML, especially as a network protocol. Now at work I actually have to use it, and it is every bit as bad as I feared. I think XML is a bad idea, and for network protocols it is even worse. Below I list the commonly cited benefits of XML and debunk them in relation to protocols:

  1. Human readable - So? The protocol is for two machines to communicate, not two humans. People often claim it is easier to debug if the format is human readable. In reality, when you are debugging a protocol you step through your parsing code in the debugger.

  2. You can use off-the-shelf parsers - Yes, I could, but have you seen the size of those things? I can write a binary parser in the time it takes me to understand an XML parser, and the parsers themselves are not standardized. Each one has its own quirks and behaviors, even ignoring SAX vs. DOM.

  3. It is a standard - Yes, XML is a standard. I agree. But what it standardizes is the format of the protocol, not the protocol itself. Your code still has to understand what the tags mean, and that means an avalanche of string processing is about to descend on you even after you parse the XML. And we all know how fast that is. Additionally, XML does not provide any mechanism for a message header if you are trying to use it as a protocol, so the receiver cannot tell where a message ends without parsing its way to the closing tag. This makes the socket layer code extremely complex. Almost all XML-based protocols use a homemade header of some sort. Not very standard now. Additionally, most XML I have seen is not truly valid XML, as it has no DTD and no XML header (not to be confused with a protocol header). It is mostly XML-like.

  4. XML is what "The Web" uses - No, "The Web" uses HTTP. Not my favorite protocol, since it uses fairly complex variable-length headers, but at least it is truly standard.

  5. Binary is too complicated - Not really. In fact, if you really understand what is going on in the computer, you will realize it is all binary, even XML. What people perceive as complex in binary is typically datatype size and endianness. XML doesn't avoid this; it defines it in the XML specification. Any worthwhile protocol will also define these, and any decent programmer can understand both concepts, if they don't already, in about 5 minutes.

  6. XML is flexible - So is binary. Hell, binary is so flexible you can write XML in it. If you cannot design your protocol to be extensible without XML, you need to think about the problem more.
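
To make points 3, 5 and 6 a bit more concrete, here is a rough sketch of what a small binary message header could look like. It is only an illustration, not any particular protocol: the field layout, the `MAGIC` value, and the function names `encode_message` and `decode_messages` are all made up for this example. The header pins down the datatype sizes and the byte order (big-endian), and the length field tells the socket layer exactly where each message ends.

```python
import struct

# Hypothetical fixed-size header, invented for this sketch:
#   magic (2 bytes), version (1 byte), message type (1 byte), payload length (4 bytes)
# "!" selects network byte order (big-endian) with no padding, so the
# datatype sizes and endianness are defined once, right here.
HEADER = struct.Struct("!HBBI")
MAGIC = 0xCAFE  # made-up marker value


def encode_message(msg_type, payload, version=1):
    """Prefix the payload with the fixed-size header."""
    return HEADER.pack(MAGIC, version, msg_type, len(payload)) + payload


def decode_messages(buffer):
    """Split a received byte buffer into complete messages.

    Returns (messages, leftover). Each message is a tuple of
    (version, msg_type, payload); leftover holds the bytes of a
    message that has not fully arrived yet.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        magic, version, msg_type, length = HEADER.unpack_from(buffer, offset)
        if magic != MAGIC:
            raise ValueError("bad magic value in header")
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload incomplete, wait for more bytes
        messages.append((version, msg_type, buffer[offset + HEADER.size:end]))
        offset = end
    return messages, buffer[offset:]
```

Because every message carries its own length, a receiver that does not recognize a message type can skip it and carry on, which is one simple way to leave room for extension.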


( 6 comments — Leave a comment )
Aug. 30th, 2006 08:53 pm (UTC)
I'll bite. On benefit #1.

Sometimes it's good for a format to be human readable. Suppose you create a format for files that can be used to configure the snazzy MP3-player program you just wrote. If a user looks at a few example configuration files, he can easily create a new configuration file that your program will be able to use. If the configuration file is in binary, he'd have a serious reverse-engineering job on his hands.

I think this is why HTML is so successful--you can easily create new pages by copying and tweaking existing pages.

This is an argument for using human-readable text (in some cases), not an argument for XML per se. XML is just one of many ways to represent name-value pairs and structured data. LISP would be another. Hey--why didn't LISP catch on as a network protocol? It can be both data and program.
Aug. 30th, 2006 09:05 pm (UTC)
Only five more reasons to go. :)
Seriously though, I was specifically speaking about network protocols, where XML really does not make a lot of sense. Even for configuration files I would not use XML. I would much rather use a name-value pair scheme, such as:

Still I could be talked into accepting XML as a configuration file, but I would never implement it myself.
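
As a rough illustration of the kind of name-value pair scheme meant here, the sketch below parses `name = value` lines. The key names and the `parse_config` helper are hypothetical, invented for this example, and the `#` comment syntax is an assumption as well.

```python
def parse_config(text):
    """Parse 'name = value' lines into a dict of strings.

    Blank lines and lines starting with '#' are ignored. Only the
    first '=' splits the line, so values may contain '=' themselves.
    """
    settings = {}
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"line {line_number}: expected name=value")
        settings[name.strip()] = value.strip()
    return settings


# Hypothetical settings for an MP3 player; the keys are made up.
EXAMPLE = """
# playback
volume = 80
shuffle = true
music_dir = /home/user/music
"""

config = parse_config(EXAMPLE)
```

A user can still copy and tweak a file like this by hand, which is the benefit being argued for, without pulling in an XML parser.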

I agree with your comment about HTML; being human readable did make it catch on. However, HTML is truly a markup language, and people don't try to make it be anything else (well, most people don't). XML is not a markup language despite the name, and it seems that everyone thinks of it as the magic elixir. The claim is always that if we use XML for our product it will be better for it, but no one can tell you why.
Sep. 1st, 2006 06:26 pm (UTC)
When I was a kid we had to write our protocols in binary. And we liked it! We didn't even have computers. We just wrote our messages on long paper tapes in 1s and 0s, and then the other guy would take the papers and interpret the 1s and 0s. And we liked it!
Sep. 1st, 2006 07:04 pm (UTC)
If I had a choice between 0s and 1s or XML, I would choose 0s and 1s. :)
Sep. 6th, 2006 05:13 pm (UTC)
One of those
XML is one of those strange technologies that is simultaneously a powerfully simple idea and ridiculously overhyped. (Of course, they try hard to do away with the "simple idea" with things like XML Schema. I'm with you -- if you're going to turn a simple data storage syntax into an overly complicated thing, with huge support libraries, why not just use binary?)
Apr. 6th, 2008 03:42 am (UTC)
What can I find on this forum? I'm new here.