Configuration file options

For quackjson (a Python implementation of a BBS client) I needed to create a configuration file which allows the user to specify their login credentials and how they want the client to work. As far as I’m aware, there are three main markup languages which can be used (and often abused) for data marshalling and configuration files: XML, YAML and JSON.

XML

XML has the advantage of universal support, in that most modern languages have a parser available as part of the standard library. XML also has what I call the ‘Windows advantage’: almost everyone has heard of it and knows (or at least thinks they know) how to use it. This doesn’t mean that XML is technically superior or that it is the best solution to every problem, but it is well known.

As a simple example, if I were to say to an external developer in the financial services sector “I’ve built a REST API which uses JSON for data exchange”, they’d probably give me a quizzical look and think I don’t know what I’m doing. On the other hand, if I said “I’ve built an XML Web Service”, they’d immediately know roughly what I mean.

However, a major disadvantage of XML is that it doesn’t map nicely to basic data types, such as dictionaries in Python or arrays in PHP. Even once you’ve imported the XML from a file, you still have to treat it like XML, which feels somewhat dirty — especially when this data might be passed into other modules which have to be XML-aware. XML is also criticised for being bloated, although this isn’t a major problem with a small configuration file which lives on disk.
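To illustrate the mapping problem, here is a minimal sketch using Python’s standard xml.etree.ElementTree module. The element names (config, username, poll_interval) are made up for this example and are not quackjson’s real settings.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration; the element names are illustrative only.
XML_CONFIG = """
<config>
    <username>alice</username>
    <poll_interval>30</poll_interval>
</config>
"""


def xml_to_dict(xml_text):
    """Flatten a one-level XML document into a plain dict of strings."""
    root = ET.fromstring(xml_text.strip())
    # Each child element becomes a key; its text content becomes the value.
    return {child.tag: child.text for child in root}


settings = xml_to_dict(XML_CONFIG)

# Every value arrives as text, so types have to be converted by hand.
poll_interval = int(settings["poll_interval"])
```

Even in this tiny case there is a hand-written conversion step, and nested elements or attributes would need more of it.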

YAML

YAML feels in some ways like a much lighter alternative to XML. The tag syntax is replaced by indentation and a small amount of punctuation, making it much easier to both read and write. It’s perhaps not as widely used as JSON or XML, but I would argue that it occupies a middle ground between them in terms of power and complexity.

The big disadvantage of YAML compared with XML is that there is no simple way for a document to self-validate (with XML you can define a schema which documents must adhere to). This isn’t a major problem for simple configuration files though, especially if you use a library which can check whether given elements exist or not.
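A check for required elements does not need much code. The sketch below is one possible approach, not a feature of any particular YAML library: it works on the plain mapping a parser would hand back, and the key names are hypothetical.

```python
def missing_keys(config, required):
    """Return the required keys that are absent from a parsed config mapping."""
    return [key for key in required if key not in config]


# Stand-in for the mapping a YAML parser would return; the keys are hypothetical.
parsed = {"username": "alice", "server": "bbs.example.org"}
REQUIRED = ["username", "password", "server"]

problems = missing_keys(parsed, REQUIRED)
if problems:
    print("Missing configuration keys: " + ", ".join(problems))
```

This only confirms that the keys are present; it says nothing about whether the values have sensible types, which is where a real schema would earn its keep.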

JSON

JSON is what I initially used for configuration files, as support for parsing it is built into Python (import json and you’re done), as well as other languages such as PHP. Also, all the message data between quackjson and the server uses JSON, so it seemed sensible to stick to a single format for both configuration and data marshalling.
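As a rough illustration of how little work this takes, the sketch below parses a small configuration with the standard json module. The keys are invented for the example, and a real client would read the text from a file instead of a string.

```python
import json

# Hypothetical configuration text; the keys are illustrative only.
CONFIG_TEXT = '{"username": "alice", "poll_interval": 30, "colour": true}'

# The result is an ordinary dict with native Python types.
config = json.loads(CONFIG_TEXT)

username = config["username"]            # str
poll_interval = config["poll_interval"]  # int, no manual conversion needed
use_colour = config["colour"]            # bool
```

The parsed result can be passed to other modules as a plain dictionary, without any of them needing to know it started life as JSON.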

However, JSON has one critical disadvantage when it comes to configuration: comments, or rather the lack of them. That’s right, you can’t add comments in JSON. There are hacks you can employ, such as using a _comment key to hold a comment, but these are ugly and have no semantic meaning to the parser. Of course, you can argue this from either side. One view is that JSON is only intended for data marshalling, where comments are not required. The other is that people will always find different uses for a format, so leaving out comments demonstrates a monumental lack of foresight.
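For what it’s worth, the _comment hack looks something like the sketch below. The key names and the strip_comments helper are made up for illustration; the parser treats the comment as ordinary data, so the program has to discard it itself.

```python
import json

# Hypothetical configuration using the "_comment" workaround.
CONFIG_TEXT = """
{
    "_comment": "Seconds between checks for new messages",
    "poll_interval": 30,
    "username": "alice"
}
"""


def strip_comments(config):
    """Drop top-level keys that start with '_comment' from a parsed mapping."""
    return {
        key: value
        for key, value in config.items()
        if not key.startswith("_comment")
    }


config = strip_comments(json.loads(CONFIG_TEXT))
```

Matching on the prefix is deliberate: repeating the same key in one JSON object is not practical (Python’s parser keeps only the last value), so a second comment would need a distinct name such as _comment_username.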

In the end I went for JSON, as it was the easiest to implement. For more complex projects where the configuration file requires comments, I’d definitely go for YAML.
