... or what I am jokingly referring to as a "poor-man's Ansible". :D
I have a little project for myself: review a set of servers and see what files I manage on them. I started to learn Ansible specifically to see if it could be useful here, but I decided that it is too big for something that can be done carefully in a few scripts.
If I am managing files on servers that do different things, then I should make some logic around it, like "a server of this type in this environment gets these files." Uploading a few files is not expensive, but we want to keep an ongoing log of those changes and only update when changes actually need to be applied.
It does sound like something that could be done with Ansible: I have a server inventory, it can be described programmatically in a playbook, and so on. However, what I am going to describe does not explain at all what Ansible does; the end goal is just probably the same: manage things on a server.
With some thought, shell scripts or Python are actually fine, particularly if you deconstruct the problem into steps. Here is what I did.
In my configuration directory, I would have a directory for each environment.
Under each one, I would have the types of servers; these would be the same in each environment.
These directories will contain the specific files. They could be empty (if they are, we can just add a .gitkeep file so that Git records the structure).
Under each server type would be directories that target specific directories on the target server.
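Putting that together, the hierarchy might look something like this (a sketch - prod/dev and web/db match the examples used later in this post, and the directories under each role are just illustrative):
prod/
|_ web (e.g. etc and cron directories inside)
|_ db (e.g. etc and cron directories inside)
dev/
|_ web
|_ db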
I would then have a common directory that would apply to all servers. Assume that all servers participate in shipping the same set of logs from /var/log. Then it would look like this, maybe:
common/
|_ etc (contains things like configs to deliver logs and `logrotate`)
|_ cron (common crontabs)
Everything is thus listed under its environment (like "production" or "staging") and its role (like "web server" or "database server").
If I have a script called ./server_deps <env>/<type>, it will return a list of files in the repository mapped to a list of paths on the target server. It would have to merge the common and server-specific files together.
As an example, let's say "prod/db" has db-specific cronjobs, a generic cronjob for log shipping and a database configuration. An output could be:
prod/db/cron /var/spool/cron
common/etc/ /etc
common/cron /var/spool/cron
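My real server_deps is just a small script over the directory layout described above, but a minimal sketch of the idea could look like this (the case statement mapping a directory name to its destination is an assumption for illustration; in practice that mapping can live wherever is convenient):
#!/bin/sh
# Sketch: print "<repo dir> <server path>" pairs for common plus the given <env>/<type>.
target="$1"                       # e.g. prod/db
for src in common/* "$target"/*; do
    [ -d "$src" ] || continue
    case "$(basename "$src")" in  # illustrative name -> destination mapping
        etc)  dest=/etc ;;
        cron) dest=/var/spool/cron ;;
        *)    continue ;;
    esac
    echo "$src $dest"
done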
With that, I can inspect the files that are going into the server before proceeding, and the output is easy to manipulate further. For example, it can be mangled to generate a build directory: I can split each line to produce a hydrated cp or rsync command:
function make_build_dir() {
    # $1 is the <env>/<type>, e.g. prod/db
    OLDIFS=$IFS
    IFS=$'\n'    # split server_deps output on newlines only
    mkdir -p "build/$1"
    for srcdest in $(./server_deps "$1"); do
        arr=($(echo "$srcdest" | tr " " "\n"))   # arr[0]=repo path, arr[1]=server path
        mkdir -p "build/$1/$(dirname "${arr[1]}")"
        echo "Copying ${arr[0]} to build/$1${arr[1]}" | ts   # ts (moreutils) prepends a timestamp
        cp -r "${arr[0]}" "build/$1${arr[1]}"    # -r: the sources here are directories
    done
    IFS=$OLDIFS
    return 0    # return (not exit), so the calling script keeps running
}
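For example, with the prod/db output shown above, calling make_build_dir prod/db should leave something like build/prod/db/etc and build/prod/db/var/spool/cron behind - the absolute paths of the target server mirrored under the build directory.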
This is then great for inspecting the final output. I can write a follow-up deployment script that does the job of moving the files to the correct servers. For this, I need a server inventory; I call it a "server map" here.
As an example, our server map can be:
prod:
web:
- web1.myhost
- web2.myhost
db:
- db1.myhost
- db2.myhost
...
We look up in the inventory what hosts are in the (env, type) tuple (like prod/web) and then deploy the files to those hosts verbatim.
(This basic server inventory lookup is like one line of Python code: import yaml; env=yaml.load(open('server_map.yml', 'r'), Loader=yaml.Loader)['$(dirname $1)']['$(basename $1)']; print(' '.join(env)))
(with some shell interpolation done first ... $1 is like prod/web.)
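The follow-up deployment script mentioned above can then stay very small. A minimal sketch, assuming the build directory mirrors absolute server paths and that pushing with rsync over SSH (as root here, purely for illustration) is acceptable - my real ./deploy takes flags like -p and -b, which I am not reproducing:
#!/bin/sh
# Hypothetical deploy sketch: $1 is <env>/<type>, e.g. prod/web
target="$1"
hosts=$(python3 -c "import yaml; env=yaml.load(open('server_map.yml', 'r'), Loader=yaml.Loader)['$(dirname $1)']['$(basename $1)']; print(' '.join(env))")
for host in $hosts; do
    echo "Deploying build/$target to $host" | ts   # ts (moreutils) timestamps the log line
    rsync -a "build/$target/" "root@$host:/"        # push the mirrored tree onto the server
done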
In this demonstration, I am not really covering things like merging files, templating files and such. They are just part of the "processing pipeline". But these are relevant to this type of system.
Let's say I make a change to one of the configurations. It should be feasible to only update the servers that actually need to be updated.
It seems to me the best way to use Ansible here would be to use its actions only. But this is where it gets hard if my logic is actually complicated. (It's hard ... for me, at the moment.) If I don't use Ansible's "idempotency" feature and just have it call out to a shell script ... well, I can just use a shell to call out to a shell script.
But the idea is a bit tempting, because idempotency is more than just not doing a thing when that thing is already done. I can build state around this by tying the state to file outputs.
The idea I had was to use make for this. make is used for building software (and anything like this can be a god-send when building large software projects!). But make is like an expert system - it answers in granular detail whether a thing is done. The primary question make asks is whether a file was created or modified: if the target exists and nothing it depends on has changed, do nothing; if a dependency was modified, do something. The simplest example could be like this (compile a hello.c file into an executable called hello):
hello: hello.c
	gcc -o $@ $<
It says to only compile hello.c if the file hello doesn't exist or is older than hello.c ($@ and $< are "automatic variables": the first is the target name, hello, and the second is the first prerequisite, hello.c).
make allows you to write a lot of rules at once, all doing the same thing. In my project, I want to create a rule for each server target (prod/web, prod/db, etc.), so I write some variables first:
DONE=build/done   # must match the build/done path baked into TARGETS_DONE below
TARGETS=$(shell find prod/* dev/* -maxdepth 0 -type d | tr '/' '-')
TARGETS_DONE=$(shell find prod/* dev/* -maxdepth 0 -type d | sed 's/\//-/g;s/\(.*\)/build\/done\/\1.done/g')
Hopefully, the Makefile can be ported to different projects eventually, but this is slightly hardcoded here :) . The find command gets a list of <env>/<type> strings from the file system hierarchy I described above and transforms them into rule names, like prod-web. I do this so make is not confused about what I am actually targeting. The second one just makes target names like build/done/prod-web.done, which also act as our log files when we do things.
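With the layout sketched at the top (a prod and a dev environment, each with web and db), those two find calls would expand to something like:
TARGETS      = prod-db prod-web dev-db dev-web
TARGETS_DONE = build/done/prod-db.done build/done/prod-web.done build/done/dev-db.done build/done/dev-web.done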
Our rules look like this:
.SECONDEXPANSION:
$(TARGETS): $(DONE)/$$@.done

$(TARGETS_DONE): | $(DONE)/   # order-only: the directory's timestamp should not trigger rebuilds
	./server_deps $(shell echo $(basename $(notdir $@)) | sed 's/\-/\//g')
	./deploy -p $(shell echo $(basename $(notdir $@)) | sed 's/\-/\//g') -b > $@
	touch $@

$(DONE)/:
	mkdir -p $@
(.SECONDEXPANSION enables a second expansion of prerequisite lists, so $$@ can be used there to refer to the target name, which is otherwise not possible.)
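One small aside: since targets like prod-web never exist as files, it would also be reasonable to declare them phony, so a stray file with that name can never confuse make:
.PHONY: $(TARGETS)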
Basically, if I call make prod-web, make matches it against $(TARGETS) (that just expands out all the targets we defined) and sees that it depends on build/done/prod-web.done. That file is built by the $(TARGETS_DONE) rule (first creating the directory build/done), which calls server_deps to show the files to be deployed and then runs deploy to do the deploy procedure, logging the output to the .done file (if you look above, whenever I echo something, I add a timestamp with ts from moreutils).
It will:
- use the .done files as "state", which can then be parsed,
- use the build directory as audit evidence of what was changed (the deploy scripts, for example, can proactively log who made changes),
- and, with git, also log the exact commits of when a change was deployed, for error analysis.
Just to let you know, I will be updating this post. :D
- I am still learning make itself, so I need to add more features, like one big one that's missing - how do I redeploy a server when a file is modified?
- I have not covered rsync specifically, which could be really important here.
However, it's nice to see that I get a lot of implicit functionality and features in a system that is now roughly 205 lines, with a liberal use of common UNIX tools (some bits have a lot of comments). I use Python for the project to make it easy to parse YAML; the use of YAML is not really necessary, but at least my own server targets have it as a default.