Introduction

Large-scale modern computing services, such as cloud-based operations, depend on thousands of individual components operating as desired. This places a huge strain on the system administrators who are responsible for these systems. Even smaller production networks are becoming increasingly sophisticated and disparate, and that makes them increasingly difficult for us to manage.

Imagine for a moment that you find yourself with the task of deploying a bunch of new servers for varying purposes. Let’s say these all share a similar base config, but some will be provisioned as web servers, some as file servers and maybe some will deal with mail. Despite these differences, there would be a lot of overlap in the configuration which has to be carried out to get these systems up and running.

Now there are a few ways to approach this problem. Traditionally it might have been reasonable to do this config manually; maybe it would even fall into the laps of different teams. But remembering that the config is going to be similar and repetitive would probably make you think that automation of some sort is the way to go, and you’d be right. Let me add that manual config at this scale is error-prone: it’s very easy to mistype a parameter or forget a step, which could leave systems in an undesired state or even vulnerable to security exploitation.

Assuming you’ve now ruled out the idea of doing the config manually (and I hope you have), the next logical method is basic automation using some sort of scripting language (say Bash, PowerShell etc). This may work well: you’ll be able to write, in a nice order, the steps necessary to configure the systems as you desire. Many errors will be avoided and the whole process will be easier. The problem with scripting the deployment usually arises either when the scale of the operation becomes too large to manage or when the parameters become complex and interdependent. Problems also arise when package versions change: the scripts used for the initial deployment will not be suitable for maintaining that state if a system fails or is replaced (i.e. the parameters in the script will soon be outdated).

Configuration Languages

This is where configuration languages come in. Very basically, a configuration language is a specially designed language used to control how systems and system resources are configured. The configuration is written in the language, and a language “engine” interprets the code and carries out the relevant actions. The point is to automate much of the configuration and management of the systems and to do so, ideally, in a more dynamic way. These languages usually aim to be system independent and can apply the same configuration to a group of systems with different operating systems and hardware, which is a big advantage compared to the scripting method we discussed earlier. There are many configuration languages in existence today and they all work differently (some very differently). The kind which I am most interested in are called declarative languages, and a prominent example is Puppet.

A declarative language deals with things in a different way from a traditional scripting or programming language. Instead of describing how to do something, as with a procedural language like C, declarative languages describe only what a system should look like, not how to make it look like that (which will be different on different operating systems). This makes the syntax easier to understand and the language better able to manage a diverse range of systems and resources. The truth is that many of the existing configuration languages are not purely declarative or procedural but fall somewhere between the two, incorporating features of each.
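To make the "what, not how" idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration and not how Puppet is implemented: the DESIRED dictionary, the INSTALL_COMMANDS table and the plan function are all names I have made up. The description says only what should be true, and a tiny "engine" decides which command that means on each platform.

```python
# Hypothetical illustration only: these names are assumptions, not Puppet internals.

# The declarative part: what the system should look like.
DESIRED = {"package": "nginx", "state": "installed"}

# The "how" lives in the engine, keyed by operating system family.
INSTALL_COMMANDS = {
    "debian": ["apt-get", "install", "-y"],
    "redhat": ["yum", "install", "-y"],
}


def plan(desired, os_family):
    """Translate one declarative description into the command for one platform."""
    if desired["state"] != "installed":
        raise ValueError("this sketch only handles the 'installed' state")
    return INSTALL_COMMANDS[os_family] + [desired["package"]]


# The same description yields a different command on each platform.
print(plan(DESIRED, "debian"))
print(plan(DESIRED, "redhat"))
```

The description itself never changes between platforms; only the engine’s lookup does, which is the advantage over a script written for one particular system.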

Going back to our example of the servers, to manage this deployment with a language such as Puppet we would simply have to write a file describing what we would like the systems to look like. This would include information about which packages to install, which files should contain which parameters, and which services should be running and when they should be started. In Puppet this file is referred to as a manifest.
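The overlap between our web, file and mail servers can be sketched as data. Again this is Python rather than real manifest syntax, and the package, file and service names below are just assumed examples: a shared base description is combined with a small role-specific one.

```python
# Hypothetical sketch: plain data standing in for a manifest, not Puppet syntax.

# Everything every server needs, whatever its role.
BASE = {
    "packages": ["openssh-server"],
    "files": {"/etc/ssh/sshd_config": {"PermitRootLogin": "no"}},
    "services": {"sshd": "running"},
}

# Only the differences are written down per role.
ROLES = {
    "web": {"packages": ["nginx"], "services": {"nginx": "running"}},
    "file": {"packages": ["samba"], "services": {"smbd": "running"}},
    "mail": {"packages": ["postfix"], "services": {"postfix": "running"}},
}


def describe(role):
    """Combine the shared base description with one role's additions."""
    extra = ROLES[role]
    return {
        "packages": BASE["packages"] + extra.get("packages", []),
        "files": {**BASE["files"], **extra.get("files", {})},
        "services": {**BASE["services"], **extra.get("services", {})},
    }


print(describe("web"))
```

Adding a fourth kind of server would mean adding one more entry to ROLES, while the base config stays written down in exactly one place.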

There are a few different ways a system such as Puppet can work, which I won’t go into here. The key thing I want to emphasise is that once we have our configuration written, we can apply it to all of our systems and not only will the initial deployment be handled, but those systems will be dynamically managed by Puppet from then on. This means that the system periodically checks whether the current system state matches the desired state and if, for example, a service is described as running but has actually stopped, it will be started again. This makes it very easy to make changes to all systems and constantly ensures that those systems match the current set of desired configurations.
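That periodic check can be sketched as a small compare-and-correct step. This is my own simplified Python illustration with assumed names (reconcile, apply_actions), not Puppet’s actual mechanism: it compares the desired service states with the current ones and returns only the corrections that are needed.

```python
# Hypothetical sketch of a "check and correct" pass; not Puppet's real agent.

def reconcile(desired, current):
    """Return the actions needed to make the current state match the desired one."""
    actions = []
    for service, want in desired.items():
        have = current.get(service, "stopped")  # unknown services count as stopped
        if have != want:
            actions.append(("start" if want == "running" else "stop", service))
    return actions


def apply_actions(actions, current):
    """Pretend to carry out the actions by updating the recorded state."""
    for verb, service in actions:
        current[service] = "running" if verb == "start" else "stopped"


desired = {"nginx": "running", "sshd": "running"}
current = {"nginx": "stopped", "sshd": "running"}

first = reconcile(desired, current)   # nginx has drifted, so one correction
apply_actions(first, current)
second = reconcile(desired, current)  # nothing left to correct

print(first)
print(second)
```

Running the check a second time produces no actions, which is what makes it safe to repeat on a schedule: a system that already matches its description is simply left alone.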

I’ll be writing some more specific posts on the Puppet language and its specific features, but for now I hope that gives some insight into what a configuration language is and how it can be useful to administrators.