THIS IS A DRAFT! It will appear on the main page once finished!

Python is a decent config file format

Draft stage: mostly ready in terms of what I want to say and structure, writing needs to be improved.

TODO don't really like the name, do smth different

I've decided to write more about some TODO I like Python, so here we go.

(Unpopular?) opinion: Python is a great, if not the best, language for writing config files.

Why most config file formats suck:

  • don't have comments (e.g. JSON) TODO apparently even deliberately???
  • can't be included without extra code that the programmer has to implement, and there is no standard for that
  • can't have any logic

    That would be considered a positive by many, but I would argue that when you can't define temporary variables or functions, substitute strings or concatenate lists, it's fucked up. This is a somewhat solved problem in computer science: you can totally have a reasonable subset of a language.

  • can't be validated by your text editor, except for syntax. Kind of a consequence of not having logic.
  • YAML TODO just add one of the links criticizing it, there are enough of them on the internet already; 1.1 vs 1.2 https://yaml.org

So what happens? Often configs end up generated by real programming languages anyway.

  • you use a real programming language to filter out your custom comment syntax
  • you use a real programming language to merge the configs
  • you use a real programming language to 'evaluate' the config (often essentially reimplementing a simple functional language)
  • you use a real programming language to validate (and end up with mediocre error messages as a result)

None of these things are pleasant and they all distract you from your objective. You kind of see where I'm going with this. I suggest that Python is extremely well suited for writing configs. TODO (even when you're writing a C++ program??)

How does using Python files solve these issues?

  • comments: duh
  • includes: use Python's own module imports
  • logic

    You have all of Python's syntax (and libraries) at your disposal. Of course, one could go crazy, but I'd rather accept the potential for abuse than be restricted.

  • validation

    You can easily validate at module load time. You also have all the existing Python tools at your disposal (pylint/mypy).
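
To make the four points above concrete, here is a minimal sketch of what such a config could look like. Everything in it is made up for illustration: the imaginary backup tool, the setting names and the paths. In a real setup BASE_EXCLUDES would probably live in another module and be imported.

```python
# config.py for an imaginary backup tool (all names here are hypothetical)

# includes: in a real config this could be `from base_config import BASE_EXCLUDES`
BASE_EXCLUDES = ['*.tmp', '.cache']

# logic: temporary variables, string substitution, list concatenation
home = '/home/user'
BACKUP_DIRS = [f'{home}/{name}' for name in ('documents', 'photos')]
EXCLUDES = BASE_EXCLUDES + ['node_modules']
RETENTION_DAYS = 30

# validation: runs as soon as the module is loaded,
# and the error message can be as specific as you like
if RETENTION_DAYS <= 0:
    raise ValueError(f'RETENTION_DAYS must be positive, got {RETENTION_DAYS}')
for path in BACKUP_DIRS:
    if not path.startswith('/'):
        raise ValueError(f'{path!r} must be an absolute path')
```

The program would then just import the module and read the attributes it needs; a broken value fails at import time rather than somewhere deep inside the backup run.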

TODO interoperability: one common argument is that if your program ever migrates away from Python (e.g. you've rewritten it in C++ to speed it up), you're doomed. Good point, but I don't buy it. Python is present in most modern OS distributions. I suggest that it's at least no worse than using something like JSON in the first place: it's trivial to emit a JSON file from your Python config for the C++ code to consume.
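
Emitting JSON could look roughly like this. It's only a sketch: the `config` dict stands in for whatever your real config module defines, and `dump_config` is a name I made up.

```python
import json

# Stand-in for settings that would normally be imported from the config module.
config = {
    'backup_dirs': ['/home/user/documents', '/home/user/photos'],
    'excludes': ['*.tmp', '.cache'],
    'retention_days': 30,
}


def dump_config(cfg: dict) -> str:
    """Serialize the already evaluated config so that non-Python code can read it."""
    return json.dumps(cfg, indent=2, sort_keys=True)


if __name__ == '__main__':
    # e.g. `python3 dump_config.py > config.json` as a build or deploy step
    print(dump_config(config))
```

The C++ side then only ever sees plain JSON, while comments, includes and logic stay on the Python side.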

TODO performance: parsing vs evaluation?

Extra benefits

  • using Python's dataclasses, functions and generators, you can construct a small DSL for your config's datatypes like datetimes, currencies, etc.
  • mypy type annotations: they serve as self-documentation and validation at the same time
  • TODO e.g. config with secrets – can keep login code close?
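
A rough sketch of the first two points, assuming a made-up config that tracks subscriptions. `Money`, `usd` and `Subscription` are hypothetical helpers, not something from an existing library.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import List


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str


def usd(amount: str) -> Money:
    """Tiny DSL helper so the config reads as usd('5.00')."""
    return Money(Decimal(amount), 'USD')


@dataclass(frozen=True)
class Subscription:
    name: str
    price: Money
    renews: date

    def __post_init__(self) -> None:
        # validation happens while the config is being constructed
        if self.price.amount < 0:
            raise ValueError(f'{self.name}: price must not be negative')


# The actual "config": the annotation documents it and lets mypy check it.
SUBSCRIPTIONS: List[Subscription] = [
    Subscription('hosting', usd('5.00'), date(2020, 1, 15)),
    Subscription('domain', usd('12.50'), date(2020, 6, 1)),
]
```

Passing a plain string where a `date` is expected would be flagged by mypy before the program even runs, and a negative price fails when the config is loaded.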

What's the worst that could happen?

Security? E.g. an attacker injects arbitrary code. TODO wow! https://www.arp242.net/yaml-config.html#insecure-by-default

YAML syntax: sucks https://www.arp242.net/yaml-config.html#surprising-behaviour

Or lower-privileged users change the config and get code executed with higher privileges (kind of a questionable setup anyway). Sandboxing? TODO actually research it, perhaps it's possible. Infinite loops etc.?

TODO some config files end up being Turing complete anyway (e.g. mail filters?)

If you're truly paranoid, it's possible to lint the config before loading it, or perhaps sandbox it. (Disclaimer: I never actually tried it, so perhaps it's harder than it seems. Obviously, the halting problem kicks in, etc.) TODO Starlark (formerly Skylark), the language Bazel uses, is a good example of that approach.
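
As a very rough sketch of the linting idea, one could walk the config's syntax tree with the standard `ast` module and reject constructs you don't want to see. The deny-list and the `lint_config` name are my own assumptions, and this is not a sandbox: a deny-list like this is easy to bypass, so treat it as a sanity check at best.

```python
import ast

# Hypothetical policy: syntax a config file is not supposed to contain.
FORBIDDEN_NODES = (ast.Import, ast.ImportFrom, ast.While, ast.Lambda)
FORBIDDEN_NAMES = {'eval', 'exec', 'open', '__import__'}


def lint_config(source: str) -> list:
    """Return human-readable problems found in the config source (empty if none)."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, FORBIDDEN_NODES):
            problems.append(f'line {node.lineno}: {type(node).__name__} is not allowed')
        elif isinstance(node, ast.Name) and node.id in FORBIDDEN_NAMES:
            problems.append(f'line {node.lineno}: use of {node.id!r} is not allowed')
    return problems
```

The program would call `lint_config` on the file contents and refuse to load the config if the returned list is not empty.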

The Python ecosystem is a good example of how fucked up config files are.

E.g. you have to write code in setup.py when you can't force setup.cfg to do what you want. Trivial example: duplicate dependencies. Although to be fair, at least they offer a hybrid approach, which means that you can use it the way you are comfortable with.
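
To illustrate the duplicate dependencies case: in setup.py a list can simply be reused. The fragment below is a sketch with made-up project and package names, and it only builds the arguments instead of calling setuptools, so that it stays self-contained.

```python
# Fragment of a hypothetical setup.py; names are for illustration only.
install_requires = ['requests', 'click']
test_requires = ['pytest', 'mypy']

setup_kwargs = dict(
    name='myproject',
    install_requires=install_requires,
    extras_require={
        'testing': test_requires,
        # 'dev' reuses the test dependencies instead of repeating each entry
        'dev': test_requires + ['ipython'],
    },
)

# In a real setup.py this would be followed by:
#   from setuptools import setup
#   setup(**setup_kwargs)
```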

TODO examples of systems that use python for configuration files?

1 TODO Integrate

TODO [2019-12-04 21:00] someone surely must have written about it? link perhaps?