This week in Liminix: a change in the plan. Much to think about, nothing to show

The original plan

In this phase we revisit that decision and figure out how to gain the advantages of the NixOS options/type system while still allowing multiple instances of the same module.

I spent a day or so thinking about this. The two fundamental tools for abstraction in Liminix are the module and the service.

  • Liminix modules use the NixOS module abstraction (though not the actual NixOS modules themselves). A module can change global config: for example, adding users, groups, etc., or adding kernel symbols. Modules get their configuration by looking at some subtree of the big configuration attrset, and can define what “well-formed” looks like for the attribute values in that subtree.

  • Services, on the other hand, are derivations that create s6 service directories. There is not a lot of type checking in service parameters, but what they do allow is having multiple instances of the same thing. A prime example of when you’d want this is on a device like the GL-AR750, which has two wireless radios (for 2.4GHz and 5GHz) that must each have their own hostap service. The config for each service is different, so it would be quite unwieldy to represent this using a single configuration object.

    (Yes, we could do = {...} and = {...} - for this application we can anticipate needing more than one daemon. But do we do that for every module that might ever need multiple services, just in case, and how do we guess what they’re going to be? Someone somewhere is going to want to run two http daemons, or four different ssh daemons on different ports, or something even weirder. DHCP clients on two hardware network interfaces and a VPN device that only exists while some other service is running and a named pipe connected to a Python script using scapy. Who knows? We may not even know what the config keys are for some of these potential use cases.)
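To make the module/service distinction concrete, here is a hypothetical sketch. The names here (svc.hostapd.build, the option paths, the channel numbers) are invented for illustration and are not real Liminix code:

```nix
{
  # A module declares one typed subtree of the global configuration,
  # so there is exactly one "slot" for its settings:
  module = { lib, ... }: {
    options.services.sshd.port = lib.mkOption {
      type = lib.types.port;
      default = 22;
    };
  };

  # A service, by contrast, is a function that returns a derivation
  # (an s6 service directory), so the same service can be instantiated
  # once per radio with different parameters:
  services = { svc }: {
    wlan_2g = svc.hostapd.build { interface = "wlan0"; channel = 1; };
    wlan_5g = svc.hostapd.build { interface = "wlan1"; channel = 36; };
  };
}
```

The module gives you validation but a single instance; the service gives you as many instances as you like but little validation. Reconciling the two is exactly the open design question.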

With that in mind, I’ve decided to defer the redesign of modules until I also need to do some serious thinking about services. There’s no point considering one without the other.

Writable filesystem

So onto the next part of the plan. As of March 2023 the Liminix filesystem is a “squashfs” image, meaning that the filesystem is generated at build time and read-only at runtime. Any updates or reconfiguration must be made by producing and flashing an entire new image, which is fine if you’re making a big change anyway but not very convenient if you just wanted to install netcat or change the wifi channel.
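For concreteness, the full-image cycle looks roughly like this. The nix-build attribute path, the image filename, and the mtd partition number are all made up for illustration; flashcp is the mtd-utils tool for writing an image to a raw flash partition on the device:

```shell
# On the build machine: produce a fresh squashfs image.
# The attribute path here is illustrative, not Liminix's actual one.
nix-build liminix.nix -A outputs.squashfs -o result

# Copy the image to the device, then (on the device) overwrite the
# firmware partition. The mtd device number varies per target.
flashcp -v image.bin /dev/mtd5
```

Doing all of that just to change the wifi channel is the inconvenience this phase is trying to remove.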

I don’t have anything here in committable form yet, but I note the following as bullet points:

  • our devices use “raw flash”, so we should be using filesystems designed for raw flash (like jffs2 or ubifs) and not traditional block-device filesystems like ext[234]. This is to make sure that writes to the filesystem are wear-levelled (distributed across the device and not all done at the same spot).

  • there are two kinds of raw flash: “nor flash” devices are usually smaller and are guaranteed (sic) to be free of bad blocks, whereas “nand flash” devices are bigger but may develop bad blocks over time (just like hard disks did in days of yore), which the driver/filesystem is required to deal with. Therefore we’ll be using ubifs for nand flash, because it’s based on UBI, which tracks erase counts, and jffs2 for nor flash, because we don’t need that behaviour there and because it turns out that ubifs has a minimum device/volume size which our nor flash devices are too small to meet.

  • as followers of the gospel of Graham C we’d like not to make the entire storage device writable and end up with random state accruing on it. We expect that updates will be performed by building new packages on a separate build system and using nix-copy-closure to deploy, and that we should be able to generate everything in /etc, /var, etc. at boot time based on what’s in the store and what’s available from the (as yet unimplemented) “secrets” provider.

  • OpenWrt have a feature where they combine squashfs and jffs2 using a filesystem overlay - the idea being that because squashfs compresses better, they can ship the initial image as squashfs and then any post-install changes are made to the jffs2 overlay.

    This is neat, but I am not presently planning to do the same in Liminix - or at least not until I have numbers to demonstrate it’s useful. My thinking is that whereas an OpenWrt update to libc might just overwrite some files in /lib, so that all the existing binaries in the squashfs now find the new libc in jffs2, in a Nix system we don’t get that. Upgrading libc means new builds of all the packages that use it, because they previously depended on /nix/store/01234567-libc and now they depend on /nix/store/89abcdef-libc. If the old binaries were in jffs2 they could be overwritten and the space reused, but if they were in the immutable squashfs then that’s now so much wasted space. </handwaving>

    Anyway, we’ll see. It’s less complicated to not do it, so that will be my default position.
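For reference, an OpenWrt-style combination of a read-only base image and a writable layer is assembled at boot roughly like this. This is a sketch of the generic overlayfs technique, not OpenWrt’s actual boot scripts, and the partition numbers and mount points are hypothetical:

```shell
# Read-only compressed base image, plus a writable jffs2 upper layer.
mount -t squashfs /dev/mtdblock2 /rom
mount -t jffs2 /dev/mtdblock3 /overlay

# overlayfs merges the two: reads fall through to the squashfs,
# writes land in the jffs2 upper directory.
mkdir -p /overlay/upper /overlay/work
mount -t overlay overlay \
    -o lowerdir=/rom,upperdir=/overlay/upper,workdir=/overlay/work \
    /mnt/newroot
```

The attraction is that the initial image gets squashfs compression while post-install changes remain possible; the cost, as described above, is dead weight in the squashfs once store paths are superseded.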

Where I am right now, it looks like this means we’ll be using an initramfs to boot the system, and then have an early userland that mounts the store from jffs2/ubifs and runs a script to recreate /etc and friends based on what’s in it.
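A minimal sketch of what that early userland might do, assuming jffs2 on a nor device; the partition number, mount point, and activation script name are all assumptions, not the real implementation:

```shell
#!/bin/sh
# Hypothetical initramfs /init for the scheme described above.
mount -t proc none /proc
mount -t sysfs none /sys

# Mount the writable filesystem holding /nix/store
# (jffs2 here; ubifs on nand devices).
mount -t jffs2 /dev/mtdblock4 /target

# Recreate /etc, /var and friends from what's in the store.
# "activate" is an illustrative name for that script.
/target/bin/activate

# Hand over to the real init (busybox switch_root).
exec switch_root /target /sbin/init
```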

It ought to work, anyway. Next week maybe I can talk about it in the past tense and not the future conditional.