XML is quite common nowadays, especially in the application integration business that I am involved in. However, I still see companies making big mistakes when they decide to start using XML (for example as the exchange format with their business partners). This series of posts is about mistakes (or at least clumsiness) in using XML that I noticed during several projects.
Only change the syntax of the CSV file (no normalization)
By this I mean that there are existing interfaces based on character-separated values, like:
car-height|car-width|driver-name|driver-gender
And when they start to use XML they change it to:
<row>
  <car-height>xxx</car-height>
  <car-width>xxx</car-width>
  <driver-name>xxx</driver-name>
  <driver-gender>xxx</driver-gender>
</row>
Something like this would be more readable:
<row>
  <car>
    <height>xxx</height>
    <width>xxx</width>
  </car>
  <driver>
    <name>xxx</name>
    <gender>xxx</gender>
  </driver>
</row>
The fields concerning one entity are now grouped together, as you would do when normalizing a relational database. This not only increases human readability but also increases the chance of arriving at a reusable schema. With this last example, you can imagine using the ‘car’ element in another schema as well.
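To make the reuse concrete, here is a minimal XSD sketch of the grouped structure. It is only an illustration under a few assumptions of mine: the type names CarType and DriverType, the data types (xs:decimal for the dimensions, xs:string for the rest), the absence of a target namespace, and the second element 'parking-request' are all hypothetical and not taken from a real project.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Reusable type: all fields that describe a car (data types are assumed) -->
  <xs:complexType name="CarType">
    <xs:sequence>
      <xs:element name="height" type="xs:decimal"/>
      <xs:element name="width" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Reusable type: all fields that describe a driver -->
  <xs:complexType name="DriverType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="gender" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- The row from the example, composed of the two types -->
  <xs:element name="row">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="car" type="CarType"/>
        <xs:element name="driver" type="DriverType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Hypothetical second message that reuses CarType unchanged -->
  <xs:element name="parking-request">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="car" type="CarType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

With the flat version there is nothing comparable to share: the car fields exist only as prefixed names inside 'row', so a second message would have to repeat them one by one.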
So the tip of the day is: perform normalization in your XSD where possible.