About 10 years ago I was introduced to the concept of “configuration as code”. In the years that followed I had some success with it, but recently it has been one failure after another.
The lesson I learned is that whatever your environments are, from dev and staging through test and production, you need to be committed to that implementation. As any of the systems change, so does the configuration, and it is very easy to introduce an impedance mismatch from one milestone to the next.
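One way to catch that kind of mismatch is to diff the generated configuration for each environment before promoting a milestone. A minimal sketch of that idea, in Python; the environment names and config keys here are hypothetical examples, not from any real deployment:

```python
# Sketch: detect "impedance mismatch" between two environments by
# comparing their configuration dictionaries. In practice these dicts
# would be loaded from whatever files your configuration-as-code
# tooling generates; the values below are made up for illustration.

def config_drift(base: dict, other: dict) -> dict:
    """Return the keys whose values differ, or exist in only one env."""
    sentinel = object()
    return {
        key: (base.get(key, "<missing>"), other.get(key, "<missing>"))
        for key in base.keys() | other.keys()
        if base.get(key, sentinel) != other.get(key, sentinel)
    }

staging = {"dhcp_range": "10.0.10.0/24", "vlan": 20, "db_host": "db-stage"}
production = {"dhcp_range": "10.0.20.0/24", "vlan": 20, "db_host": "db-prod"}

drift = config_drift(staging, production)
for key, (a, b) in sorted(drift.items()):
    print(f"{key}: staging={a} production={b}")
```

Anything that shows up in the drift report is a dependency you are implicitly committed to; some of it is expected (database hosts differ by design), but anything unexpected is exactly the mismatch that bites at the next upgrade.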
This also means there are dependencies between all the layers. That’s everything from version control systems, project naming conventions, network design, SDLAN, containers, databases, public IPs and domain names, VLANs, and so on.
One thing most admins forget is what happens when an upgrade is applied and the entire system fails, or when the various projects supported by the “domain” skew away from the base.
Recently a Docker swarm I had been nursing stopped working when I changed the underlying DHCP address range. Where the nodes had been limping along before, now they were dead. Redeploying the swarm meant redeploying the entire domain. There is something to be said for deploying the entire stack, since that is basically what configuration as code means; however, that invites other problems, like the never-ending domain/project builds.
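One way to take the DHCP range out of the swarm’s dependency chain is to give the manager a static address outside the pool and advertise that explicitly, instead of letting Docker infer it from a lease. A rough sketch; the interface name and addresses here are hypothetical, and in a real deployment the static address would come from your network config management (netplan, systemd-networkd, etc.) rather than a one-off `ip` command:

```shell
# Hypothetical example: pin the manager to 10.0.10.5 on eth0,
# an address reserved outside the DHCP pool.
ip addr add 10.0.10.5/24 dev eth0

# Initialize the swarm advertising the fixed address explicitly,
# so a later change to the DHCP range cannot move it out from
# under the cluster.
docker swarm init --advertise-addr 10.0.10.5 --listen-addr 10.0.10.5:2377
```

The point is not the specific commands; it is that the swarm’s dependency on the address plan becomes explicit and version-controlled, rather than an accident of whatever lease the host happened to hold when the cluster was created.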
One thing I’m taking away from this weekend’s networking issues is that dependencies suck. I deployed a new firewall and an LTE backup network, and segregated the networks with VLANs… but with this sort of deliberate systems design, you have to experience real failure to know whether you’ve got it right.