... or what I am jokingly referring to as a "poor-man's Ansible". :D
I have a little project for myself: review a set of servers and see what files I manage on them. I started to learn Ansible specifically to see if it could be useful here, but I decided that it is too big for something that can be done carefully in a few scripts.
If I am managing files on servers that do different things, then I should make some logic around it, like "a server of this type in this environment gets these files." Uploading a few files is not expensive, but we want to keep an ongoing log of those changes and only update when changes actually need to be applied.
It does sound like something that could be done with Ansible: I have a server inventory, it can be described programmatically in a playbook, and so on. However, what I am going to describe does not explain at all what Ansible does; the end goal is just probably the same: manage things on a server.
With some thought, shell scripts or Python are actually fine, particularly if you deconstruct the problem into steps. Here is what I did.
In my configuration directory, I would have a directory for each environment.
Under each one, I would have the types of servers; these would be the same in each environment.
These directories will contain the specific files. They could be empty (if they are, we can just add a .gitkeep file so that Git records the structure).
Under each server type would be directories that target specific directories on the target server.
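Putting that together, the hierarchy might look something like this (a sketch - prod/dev and web/db match the examples used later in this post, and the directories under each role are just illustrative):
prod/
|_ web (e.g. etc and cron directories inside)
|_ db (e.g. etc and cron directories inside)
dev/
|_ web
|_ db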
I would then have a common directory that would apply to all servers. Assume that all servers participate in shipping the same set of logs from /var/log. Then it would look like this, maybe:
common/
|_ etc (contains things like configs to deliver logs and `logrotate`)
|_ cron (common crontabs)
Everything is thus listed under its environment (like "production" or "staging") and its role (like "web server" or "database server").
If I have a script called ./server_deps <env>/<type>, it will return a list of files in the repository mapped to a list of paths on the target server. It would have to merge the common and server-specific files together.
As an example, let's say "prod/db" has db-specific cronjobs, a generic cronjob for log shipping and a database configuration. An output could be:
prod/db/cron /var/spool/cron
common/etc/ /etc
common/cron /var/spool/cron
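My real server_deps is just a small script over the directory layout described above, but a minimal sketch of the idea could look like this (the case statement mapping a directory name to its destination is an assumption for illustration; in practice that mapping can live wherever is convenient):
#!/bin/sh
# Sketch: print "<repo dir> <server path>" pairs for common plus the given <env>/<type>.
target="$1"                       # e.g. prod/db
for src in common/* "$target"/*; do
    [ -d "$src" ] || continue
    case "$(basename "$src")" in  # illustrative name -> destination mapping
        etc)  dest=/etc ;;
        cron) dest=/var/spool/cron ;;
        *)    continue ;;
    esac
    echo "$src $dest"
done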
With that, I can inspect the files that are going into the server before proceeding, and the output is easy to manipulate further. For example, it can be mangled to generate a build directory: I can split each line to produce a hydrated cp or rsync command:
function make_build_dir() {
    # $1 is the <env>/<type>, e.g. prod/db
    OLDIFS=$IFS
    IFS=$'\n'    # split server_deps output on newlines only
    mkdir -p "build/$1"
    for srcdest in $(./server_deps "$1"); do
        arr=($(echo "$srcdest" | tr " " "\n"))   # arr[0]=repo path, arr[1]=server path
        mkdir -p "build/$1/$(dirname "${arr[1]}")"
        echo "Copying ${arr[0]} to build/$1${arr[1]}" | ts   # ts (moreutils) prepends a timestamp
        cp -r "${arr[0]}" "build/$1${arr[1]}"    # -r: the sources here are directories
    done
    IFS=$OLDIFS
    return 0    # return (not exit), so the calling script keeps running
}
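For example, with the prod/db output shown above, calling make_build_dir prod/db should leave something like build/prod/db/etc and build/prod/db/var/spool/cron behind - the absolute paths of the target server mirrored under the build directory.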
This is then great for inspecting the final output. I can write a follow-up deployment script that does the job of moving the files to the correct servers. For this, I need a server inventory; I call it a "server map" here.
As an example, our server map can be:
prod:
web:
- web1.myhost
- web2.myhost
db:
- db1.myhost
- db2.myhost
...
We look up in the inventory what hosts are in the (env, type) tuple (like prod/web) and then deploy the files to those hosts verbatim.
(This basic server inventory lookup is like one line of Python code: import yaml; env=yaml.load(open('server_map.yml', 'r'), Loader=yaml.Loader)['$(dirname $1)']['$(basename $1)']; print(' '.join(env)))
(with some shell interpolation done first ... $1 is like prod/web.)
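The follow-up deployment script mentioned above can then stay very small. A minimal sketch, assuming the build directory mirrors absolute server paths and that pushing with rsync over SSH (as root here, purely for illustration) is acceptable - my real ./deploy takes flags like -p and -b, which I am not reproducing:
#!/bin/sh
# Hypothetical deploy sketch: $1 is <env>/<type>, e.g. prod/web
target="$1"
hosts=$(python3 -c "import yaml; env=yaml.load(open('server_map.yml', 'r'), Loader=yaml.Loader)['$(dirname $1)']['$(basename $1)']; print(' '.join(env))")
for host in $hosts; do
    echo "Deploying build/$target to $host" | ts   # ts (moreutils) timestamps the log line
    rsync -a "build/$target/" "root@$host:/"        # push the mirrored tree onto the server
done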
In this demonstration, I am not really covering things like merging files, templating files and such. They are just part of the "processing pipeline". But these are relevant to this type of system.
Let's say I make a change to one of the configurations. It should be feasible to only update the servers that actually need to be updated.
It seems to me the best way to use Ansible here would be to use its actions only. But this is where it gets hard if my logic is actually complicated. (It's hard ... for me, at the moment.) If I don't use Ansible's "idempotency" feature and just have it call out to a shell script ... well, I can just use a shell to call out to a shell script.
But the idea is a bit tempting, because idempotency is more than just not doing a thing when that thing is already done. I can build state around this by tying the state to file outputs.
The idea I had was to use make for this. make is used for building software (and anything like this can be a god-send when building large software projects!). But make is like an expert system - it answers in granular detail whether a thing is done. The primary question make asks is whether a file was created or modified: if the target exists and nothing it depends on has changed, do nothing; if a dependency was modified, do something. The simplest example could be like this (compile a hello.c file into an executable called hello):
hello: hello.c
	gcc -o $@ $<
It says to only compile hello.c if the file hello doesn't exist or is older than hello.c ($@ and $< are "automatic variables": the first is the target name, hello, and the second is the first prerequisite, hello.c).
make allows you to write a lot of rules at once, all doing the same thing. In my project, I want to create a rule for each server target (prod/web, prod/db, etc.), so I write some variables first:
DONE=build/done   # must match the build/done path baked into TARGETS_DONE below
TARGETS=$(shell find prod/* dev/* -maxdepth 0 -type d | tr '/' '-')
TARGETS_DONE=$(shell find prod/* dev/* -maxdepth 0 -type d | sed 's/\//-/g;s/\(.*\)/build\/done\/\1.done/g')
Hopefully, the Makefile can be ported to different projects eventually, but this is slightly hardcoded here :) . The find command gets a list of <env>/<type> strings from the file system hierarchy I described above and transforms them into rule names, like prod-web. I do this so make is not confused about what I am actually targeting. The second one just makes target names like build/done/prod-web.done, which also act as our log files when we do things.
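With the layout sketched at the top (a prod and a dev environment, each with web and db), those two find calls would expand to something like:
TARGETS      = prod-db prod-web dev-db dev-web
TARGETS_DONE = build/done/prod-db.done build/done/prod-web.done build/done/dev-db.done build/done/dev-web.done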
Our rules look like this:
.SECONDEXPANSION:
$(TARGETS): $(DONE)/$$@.done

$(TARGETS_DONE): | $(DONE)/   # order-only: the directory's timestamp should not trigger rebuilds
	./server_deps $(shell echo $(basename $(notdir $@)) | sed 's/\-/\//g')
	./deploy -p $(shell echo $(basename $(notdir $@)) | sed 's/\-/\//g') -b > $@
	touch $@

$(DONE)/:
	mkdir -p $@
(.SECONDEXPANSION enables a second expansion of prerequisite lists, so $$@ can be used there to refer to the target name, which is otherwise not possible.)
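One small aside: since targets like prod-web never exist as files, it would also be reasonable to declare them phony, so a stray file with that name can never confuse make:
.PHONY: $(TARGETS)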
Basically, if I call make prod-web, make matches it against $(TARGETS) (that just expands out all the targets we defined) and sees that it depends on build/done/prod-web.done. That file is built by the $(TARGETS_DONE) rule (first creating the directory build/done), which calls server_deps to show the files to be deployed and then runs deploy to do the deploy procedure, logging the output to the .done file (if you look above, whenever I echo something, I add a timestamp with ts from moreutils).
It will:
- use the .done files as "state", which can then be parsed,
- use the build directory as audit evidence of what was changed (the deploy scripts, for example, can proactively log who made changes),
- and, with git, also log the exact commits of when a change was deployed, for error analysis.
Just to let you know, I will be updating this post. :D
- I am still learning make itself, so I need to add more features, like one big one that's missing - how do I redeploy a server when a file is modified?
- I have not covered rsync specifically, which could be really important here.
However, it's nice to see that I get a lot of implicit functionality and features in a system that is now roughly 205 lines, with a liberal use of common UNIX tools (some bits have a lot of comments). I use Python for the project to make it easy to parse YAML; the use of YAML is not really necessary, but at least my own server targets have it as a default.