This week I have been dabbling on multiple fronts.

Running tests in CI

Building kernels and squashfs images in Hydra is a Good Start, but it doesn’t give much confidence that the image is going to do anything when run. To test this, we can build an image for the “malta” MIPS variant and then run it in QEMU. To test that it does anything useful, considering that the device is supposed to be a router, we need other network devices for it to connect to. And as we’re running it all in Hydra which uses a private network namespace with no interfaces except loopback, those other network devices need to be provided as part of the test setup.

As I mentioned last week, we currently use RouterOS running in QEMU to provide a simulated ISP. The QEMU VMs for RouterOS and our Liminix device are configured with emulated network devices using QEMU socket networking, and because of the private network namespace we configure it with localaddr=127.0.0.1. I’m not sure that that option is documented anywhere other than the QEMU source.

qemu-system-mips \
    [... a raft of other options....] \
    -netdev socket,id=access,mcast=230.0.0.1:1234,localaddr=127.0.0.1 \
    -device virtio-net-pci,disable-legacy=on,disable-modern=off,netdev=access,mac=ba:ad:1d:ea:21:02

There are two nice things about using multicast for this: (1) with point-to-point listen/connect sockets, one VM has to listen and the other has to connect, so we have to start them in a particular order, whereas with multicast every VM uses the same options and the start order doesn’t matter; (2) we can have more than two VMs attached to the same network.
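To make the symmetry concrete, here is a minimal sketch of a helper that generates the QEMU arguments for each VM on a shared multicast segment. The function name, the VM names, and every MAC address other than the one shown above are hypothetical; the option strings themselves are copied from the command line above.

```python
def socket_netdev_args(net_id, mac, group="230.0.0.1", port=1234,
                       localaddr="127.0.0.1"):
    """Return QEMU arguments attaching one NIC to a multicast segment.

    Every VM on the segment gets the same -netdev option, so there is no
    listener/connector role and no required start order.
    """
    netdev = f"socket,id={net_id},mcast={group}:{port},localaddr={localaddr}"
    device = ("virtio-net-pci,disable-legacy=on,disable-modern=off,"
              f"netdev={net_id},mac={mac}")
    return ["-netdev", netdev, "-device", device]


# Hypothetical VMs sharing the "access" network; only the MAC must differ.
vms = {
    "liminix": "ba:ad:1d:ea:21:02",  # MAC from the command line above
    "isp": "ba:ad:1d:ea:21:01",      # assumed MAC for the RouterOS VM
    "client": "ba:ad:1d:ea:21:03",   # assumed third VM on the same segment
}

for name, mac in vms.items():
    print(name, " ".join(socket_netdev_args("access", mac)))
```

Adding a third VM is just another entry in the dictionary, which is point (2) in practice.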

The other interesting part of this work was scripting RouterOS so that its VM would boot with a useful configuration instead of with a blank fresh install. (Admire the irony: in order to build a router operating system which can be easily managed using version-controlled text files, I need to force another router operating system which was not designed with that goal to be managed using a version-controlled text file.) RouterOS on QEMU can be provisioned using the QEMU Guest Agent. While I couldn’t find a general-purpose host-side utility that communicates with the guest agent, there is a page on the Mikrotik Wiki with an example that I was able to adapt.
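For a flavour of what talking to the guest agent looks like from the host, here is a rough sketch (not the Mikrotik example, and not the script Liminix uses). It assumes the agent is exposed to the host as a Unix socket and that the guest accepts the standard guest-file-open, guest-file-write and guest-file-close commands; the function names, socket path and file name are all hypothetical.

```python
import base64
import json
import socket


def qga_sender(socket_path):
    """Return a send(cmd) function speaking newline-delimited JSON to the
    guest agent socket at socket_path (hypothetical path, set by the caller)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(socket_path)
    stream = sock.makefile("rwb")

    def send(cmd):
        stream.write(json.dumps(cmd).encode("utf-8") + b"\n")
        stream.flush()
        return json.loads(stream.readline())

    return send


def qga_upload(send, path, content):
    """Write the bytes in content to path inside the guest."""
    reply = send({"execute": "guest-file-open",
                  "arguments": {"path": path, "mode": "w"}})
    handle = reply["return"]  # the agent returns an integer file handle
    send({"execute": "guest-file-write",
          "arguments": {"handle": handle,
                        "buf-b64": base64.b64encode(content).decode("ascii")}})
    send({"execute": "guest-file-close", "arguments": {"handle": handle}})
```

Usage would be something like `qga_upload(qga_sender("/tmp/qga.sock"), "setup.rsc", script_bytes)`, where both the socket path and the file name are made up for illustration. Passing `send` in as a parameter keeps the upload logic testable without a running VM.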

Running builds in CI

I copied and adjusted the configuration for GL.Inet Mango and Azure from NixWRT. I haven’t yet tested booting either of them as I don’t have any suitable hardware devices that aren’t in use somewhere. This adds to the existing GL-AR750 support to give us three hardware targets plus QEMU.

Packaging cleanups

  • we were amassing a small collection of shell scripts and utilities for the build machine. For better DX I turned these into derivations available under liminix.buildEnv

  • found and fixed packaging bugs in the Liminix development TFTP server caused by changes to fennel in Nixpkgs. You may reasonably ask why Liminix even contains a TFTP server - I may have got carried away, but the USP is that it has an allow-list for client connections and it follows symlinks. This is desirable for Liminix testing because you probably don’t want to allow access from anything other than the device you’re testing on, and you want to point it at ./result without messing about copying files into /var/tftp and what-have-you. Anyway, I have now added that as a CI target as well so we’ll know if it breaks again.

  • turned the per-device configuration from an overlay-plus-a-kconfig-attrset into a module. Given that modules can set kernel config options (e.g. you can say kernel.config.SERIAL_OF_PLATFORM = "y") there seemed no point in having two mechanisms. This means that per-device code can now change any other part of the config, not just the kernel, so there’s also no extra logic needed for setting the boot commandline or default output format or anything like that. I love removing special cases; it’s my favourite part of programming.

  • rearranged the PHRAM/TFTP boot support to not be obviously broken, and also to calculate image sizes and offsets instead of leaving that to the user. No more weird bugs where the kernel stomps on the start of the root image. Whether it is non-obviously broken I will find out next week.
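As an illustration of the kind of arithmetic involved in the last item (this is not the Liminix code, and the load address, image sizes and alignment are made-up values), placing the root image at the first aligned address past the end of the kernel is enough to guarantee the two never overlap:

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def layout(base, kernel_size, rootfs_size, alignment=0x10000):
    """Compute non-overlapping RAM regions for a kernel and a root image.

    All parameters are hypothetical; a real build would take the sizes
    from the files it has just produced.
    """
    kernel_start = base
    kernel_end = kernel_start + kernel_size
    rootfs_start = align_up(kernel_end, alignment)
    rootfs_end = rootfs_start + rootfs_size
    return {
        "kernel": (kernel_start, kernel_end),
        "rootfs": (rootfs_start, rootfs_end),
    }


regions = layout(base=0x1000000, kernel_size=0x2A1234, rootfs_size=0x400000)
for name, (start, end) in regions.items():
    print(f"{name}: {start:#x}-{end:#x}")
```

Because the root image start is derived from the kernel size rather than typed in by hand, a kernel that grows simply pushes the root image further along.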

Yes, next week I’ll be looking at hardware. I have hooked up the GL-AR750 to a serial console, a remote-controlled power switch, and a spare ethernet card in my desktop. Next I need to

  • try booting the image that Hydra built
  • figure out wlan support, which I remember as being gnarly, but hopefully my extensive notes from last time will help
  • connect it to something upstream. I’d like it to run PPPoE (because that’s a typical use case for a home wifi router) and I’d like to hook it up to the AA.net L2TP service (because that means I don’t need to break broadband for the rest of the household while I’m hacking on it). I am hoping that go-l2tp can bridge the gap

Once I have something I can dogfood then I hope things will settle down and I can focus on one thing at a time.