NixOS in production?
(sopuli.xyz)
I only use it for personal servers so calling it "production" is maybe debatable, but in my opinion NixOS is the end game of server distributions and indeed where it shines the most.
Having a single interface to manage the entire system, including installed programs and service configuration, makes it very easy to keep configuration in sync between different services or even machines. It's pretty much what Ansible wants to be, but it doesn't feel like it's held together by duct tape (note that this doesn't mean there's no duct tape in NixOS; some things it does, like fixed shared library paths, kinda need it, but it's a lot more contained). And since it's all configured with a Turing-complete programming language, you can abstract your configuration however you need.
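As a minimal sketch of the "keep services in sync" point: one value defined once and reused by two parts of the configuration. The domain and port below are hypothetical, and the backend service itself is left out; the option names are standard NixOS options.

```nix
{ ... }:
let
  # Hypothetical values, picked only for illustration.
  domain = "cloud.example.org";
  backendPort = 8080;
in
{
  services.nginx = {
    enable = true;
    # The same `domain` and `backendPort` could be reused by the backend's own
    # configuration, so the proxy and the service cannot drift apart.
    virtualHosts.${domain}.locations."/".proxyPass =
      "http://127.0.0.1:${toString backendPort}";
  };

  # Only the reverse proxy is exposed; the backend port stays local.
  networking.firewall.allowedTCPPorts = [ 80 ];
}
```

Because the whole thing is an ordinary Nix expression, the `let` block could just as well live in a shared file imported by several machines.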
Checked configuration along with easy rollback also means that system upgrades are not a thing to worry about. If your configuration becomes invalid after a NixOS update (though I think there are no incompatible updates inside the same release branch unless you're running unstable), your system will fail to build, and if it instead fails at runtime, you can just boot your old configuration.
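To illustrate what a build-time check can look like, here is a small sketch using the NixOS `assertions` option. The specific rule (nginx requires the firewall to be enabled) is an invented example, not something NixOS enforces by itself.

```nix
{ config, ... }:
{
  assertions = [
    {
      # `->` is Nix's logical implication: if nginx is enabled,
      # the firewall must be enabled too.
      assertion = config.services.nginx.enable -> config.networking.firewall.enable;
      message = "nginx is enabled but the firewall is not (example rule).";
    }
  ];
}
```

If the assertion is false, evaluation stops with the message and no new system generation is produced. For failures that only show up at runtime, the previous generation can be picked from the boot menu, or restored with `nixos-rebuild switch --rollback`.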
Another highlight is the built-in VM build target, which builds your configuration as a runnable VM for testing. I've used it a couple of times to test new configuration, such as when I was setting up Nextcloud, before pushing it to the actual server. (I'm just missing a good way to spin up multiple VMs in a virtual network for testing distributed services. I'm sure people have already done something like that outside of mainline NixOS, though.)
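For the multi-VM case, the NixOS integration test framework in nixpkgs may cover at least part of it: it boots several VMs on a shared virtual network and runs a script against them. The following is only a sketch, assuming a nixpkgs recent enough to provide `pkgs.testers.runNixOSTest`; the node names, test name, and the nginx response are made up for the example.

```nix
# test.nix (hypothetical file name); build with something like: nix-build test.nix
let
  pkgs = import <nixpkgs> { };
in
pkgs.testers.runNixOSTest {
  name = "two-node-sketch";

  nodes = {
    # Each attribute becomes one VM; nodes can reach each other by name.
    server = { ... }: {
      services.nginx.enable = true;
      services.nginx.virtualHosts.server.locations."/".return = "200 'ok'";
      networking.firewall.allowedTCPPorts = [ 80 ];
    };
    client = { pkgs, ... }: {
      environment.systemPackages = [ pkgs.curl ];
    };
  };

  # The test script is Python and drives the VMs.
  testScript = ''
    start_all()
    server.wait_for_unit("nginx.service")
    server.wait_for_open_port(80)
    client.succeed("curl --fail http://server/")
  '';
}
```

This is aimed at automated checks rather than interactive poking around, so it may or may not fit the workflow described above.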
The only big issue I think it has is the learning curve. The Nix language takes more getting used to than other distros' tooling if you're not familiar with functional programming, and NixOS and nixpkgs introduce their own concepts on top of that. Centralized documentation can also be lacking (though there are efforts going towards fixing that), so a lot of the time you have to dig around for blog posts or read the source code. But overall I think getting over the initial hurdle until you're comfortable is so worth it.
I'm probably missing a lot of other stuff it has that I'm taking for granted at this point, but I think these are the major points.
Thank you for your extremely thoughtful response. One thing I need to research for myself is how user data, configuration, etc. are handled with a rollback. In MicroOS, the system itself is snapshotted by default (/etc for example) but user data (/home) is not. I would assume that a rollback would not touch the user data, so an upgrade could still break it (e.g. a database migration performed by a package update).
If you're always able to roll back, is taking the time to build and run a VM really worth it except for very critical maintenance windows? Does the VM just copy the nix config or the data too? Today, I use Vagrant for testing my Salt states. It does what I need.
I don't come from functional programming, but I'm sure I'll figure it out. Would have been nice for it not to have its own language, but I suppose that's most efficient instead of repurposing / abusing another language in a weird way.
NixOS doesn't handle stateful application data at all (in particular, it doesn't touch home directories other than creating them, though there's home-manager if you want to manage those). But I think big incompatible updates are held back for major NixOS versions, at least for stuff like databases, and I've never run into this being an issue. The one exception was when I wrote some configuration to move some files to new locations and then wanted to roll back, but that's on me (and it was easy to cherry-pick the parts that used the new paths). Having snapshots for /var and so on might be a good idea, though personally I just keep daily backups of those directories.
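A daily backup of state directories can itself be declared in the configuration. The sketch below uses the BorgBackup module purely as an assumption (the comment above doesn't say which tool is used); the job name and repository path are hypothetical.

```nix
{ ... }:
{
  services.borgbackup.jobs.state = {
    # Directories holding application state that a rollback won't restore.
    paths = [ "/var/lib" ];
    # Hypothetical target; a remote repository would work as well.
    repo = "/mnt/backup/server";
    # Unencrypted only to keep the sketch short; not a recommendation.
    encryption.mode = "none";
    # systemd calendar expression: run once per day.
    startAt = "daily";
  };
}
```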
The VM is less useful for testing package upgrades and other small changes and more for configuration you don't want to run on the actual server yet (such as when you're still writing it and just want to test how much of it works), like testing software locally before pushing it to production. The VM build takes about as much time as building the normal configuration (and very little if that is already built). It produces a script which starts a QEMU VM using a blank disk file, which it reuses on later runs, so no data is copied from the real system (of course, some services may fail due to missing keys). But I don't think that's a problem, and it also kinda serves as another check that your system won't horribly fail if you ever want to reinstall it completely before restoring a backup.
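Settings that should apply only to that test VM can be kept inside the normal configuration. A small sketch, assuming a NixOS release that has the `virtualisation.vmVariant` option; the memory size, core count, user name, and password are arbitrary example values.

```nix
{ ... }:
{
  # Applied only when building the VM (e.g. with `nixos-rebuild build-vm`),
  # never to the real machine.
  virtualisation.vmVariant = {
    virtualisation.memorySize = 2048; # MiB
    virtualisation.cores = 2;

    # Throwaway login for the VM; hypothetical user, do not reuse elsewhere.
    users.users.tester = {
      isNormalUser = true;
      extraGroups = [ "wheel" ];
      initialPassword = "test";
    };
  };
}
```

After the build, the generated script sits under `./result/bin/` (named `run-<hostname>-vm`), and the disk image it creates in the current directory can be deleted to start again from a blank state.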
Cool! Thank you. I appreciate the detail with which you write. I'm going to give this a shot, I think!