There was great news for OpenMandriva a couple of days ago: we got a nice large box in the mail.
Even though the box was impressively heavy when I carried it to the second floor, its contents turned out to be far more impressive: a Mt. Collins server with Ampere Altra CPUs - 160 CPU cores and 128 GB RAM - given to us by Ampere Computing to support our aarch64 development efforts.
Quite a step forward from the eMAG we’ve been using as our main aarch64 builder so far, and not even comparable to the various Rock Pi 4 boards that serve as our backup builders.
The front side of the server has 24 2.5" NVMe U.2 SSD slots that can be hotplugged, as well as 2 USB ports and a VGA port.
If you are coming from the desktop or embedded world, don’t laugh — VGA ports are still very common (and useful) in the data center - many KVMs or video capture interfaces have not moved on to DisplayPort or HDMI.
On the back side, there are numerous network interfaces (for both the main machine and the BMC (Baseboard Management Controller - a networked interface that can monitor and control the main computer)), 2 more USB ports, a VGA port, an RJ45 plug for the BMC's serial port, 2 PSUs (only one is needed, the other is a backup), and various slots for PCIe cards - so if we want to have fun and abuse the machine for something it was definitely not meant to do, we can plug in a PCIe graphics card and a PCIe sound card, and have way more cores in our gaming machine than all those Windows losers. ;)
Beyond the usual power plugs, the box also includes C14-to-C13 power cords - very useful for connecting it to a power distribution unit so it can be power-cycled remotely if something goes wrong, or to a UPS that uses this type of plug.
Time to connect the box into our network and see what it can do!
It’s not like we can afford a professional 19" rack, but the box fits nicely into our infrastructure - below the Altra is the PDU and a network switch, and below that is C64 (no, not what you are thinking. Obviously that stands for "a Computer with 64 GB RAM"...), one of our main x86 builders.
The box boots up without problems - showing a very PC-BIOS-like firmware setup tool and a pretty standard BMC. It is preinstalled with CentOS Stream 8.
The PDU is not an accurate measuring tool for power consumption, but it gives a good indication: Power consumption went from 0.8A with C64, a RISC-V builder and a network switch connected to 1.5A - so the power consumption of this 160-core box is comparable to that of a 16-core Threadripper.
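To put those PDU readings in perspective, here is a rough back-of-the-envelope estimate of the Altra box's draw. The ~230 V mains voltage is an assumption on my part (not stated above), so treat the wattage as an order-of-magnitude figure only:

```python
# Rough estimate of the Altra box's power draw from the PDU current
# readings quoted above.
# Assumption (not in the original text): ~230 V mains voltage.

MAINS_VOLTAGE = 230   # volts (assumed European mains)
CURRENT_BEFORE = 0.8  # amps: C64 + RISC-V builder + network switch
CURRENT_AFTER = 1.5   # amps: the same setup, plus the Altra box

delta_watts = (CURRENT_AFTER - CURRENT_BEFORE) * MAINS_VOLTAGE
print(f"Altra draw: ~{delta_watts:.0f} W")        # roughly 160 W
print(f"Per core:   ~{delta_watts / 160:.1f} W")  # about 1 W per core
```

About one watt per core is remarkable for a machine in this class, and explains why it keeps up with a 16-core Threadripper on the power meter.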
It takes 7 minutes and 4 seconds to boot to a point where the network is available though - and when restarting, a further 6 minutes and 55 seconds to get back to the start of the boot process (making a full reboot take just below 14 minutes). While not a big problem (this is not the type of computer you reboot all the time), that is something we should be able to improve...
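Sanity-checking the reboot figure - the restart-to-firmware time plus the cold boot time does indeed land just short of 14 minutes:

```python
# Verifying the reboot arithmetic from the timings quoted above.

def seconds(m, s):
    """Convert minutes + seconds to total seconds."""
    return m * 60 + s

cold_boot = seconds(7, 4)   # power-on until the network is available
restart   = seconds(6, 55)  # reboot command until boot starts again

total = cold_boot + restart
print(f"full reboot: {total // 60}:{total % 60:02d}")  # 13:59 - just below 14 minutes
```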
So let’s see what happens if we just plug in an OpenMandriva USB stick and reboot...
Two pleasant surprises first: we didn't have to make any modifications to boot on this machine - it can use the same aarch64 image as a much lower-powered Synquacer or eMAG. And even when booting from the USB stick rather than the built-in high-speed NVMe, the boot process was noticeably faster - down to 4 minutes and 52 seconds.
A newer kernel and toolchain can work miracles - and this is one of the reasons why we insist on highlighting that "stable" is spelled with a "b". A server operating system has to be stable, not necessarily stale.
While staying on an LTS kernel branch forever does have the advantage of the least number of possible surprises, it also means missing out on a lot of improvements - and causes all the more trouble when finally updating to something newer because the old code is abandoned upstream, or can’t support a new hardware component or networking protocol that has become necessary.
It is probably overkill to run ROME (our rolling release version) or even Cooker (our development branch) on a server and keep track of changes on a daily basis - but that's what the stable "Rock" (point) releases are for.
With the storage devices immediately recognized, installing was as painless as running the install script.
The subsequent reboot took 17 seconds rather than almost 7 minutes - and it takes 4 minutes and 24 seconds to get from powering the machine on to a usable OpenMandriva prompt when running on the NVMe.
Now, with the OS we want installed, it's time to put the box to work doing what we got it for: compiling.
Installing our build system and connecting it to ABF, our Automated Build Farm, is a matter of running a simple Docker container. Unsurprisingly, there were no problems doing that.
Build times are really good: Building our kernel package (which builds the kernel 4 times — once with clang in a desktop configuration, once with clang in a server configuration, once with gcc in a desktop configuration, once with gcc in a server configuration — all 4 kernel variants have almost all modules enabled, given a distribution kernel can never know what it will be run on) took 2 hours, 14 minutes compared to 3 hours, 37 minutes on the Threadripper.
Building libreoffice took 2 hours, 8 minutes compared to 4 hours, 4 minutes on the Threadripper.
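For the curious, the build times quoted above translate into these speedup factors over the Threadripper (a quick sanity check, nothing more):

```python
# Relative speedups computed from the build times quoted above.

def minutes(h, m):
    """Convert hours + minutes to total minutes."""
    return h * 60 + m

builds = {
    "kernel":      (minutes(2, 14), minutes(3, 37)),  # (Altra, Threadripper)
    "libreoffice": (minutes(2, 8),  minutes(4, 4)),
}

for name, (altra, threadripper) in builds.items():
    print(f"{name}: {threadripper / altra:.2f}x faster on the Altra")
# kernel:      1.62x faster on the Altra
# libreoffice: 1.91x faster on the Altra
```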
And it could get even faster - the bottleneck turns out to be memory. While 128 GB is a lot, it's only 0.8 GB per core on a 160-core machine. If we make full use of the CPU with processes that are also memory-intensive (keep in mind OpenMandriva uses LTO (link-time optimization) in just about everything - which means the entire code base has to be visible to the linker at the same time), memory gets tight.
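A rough capacity estimate shows why this bites: if each LTO link job peaks at a few gigabytes, only a fraction of the 160 cores can be linking at the same time. The 4 GB-per-link figure below is an illustrative assumption, not a measured value:

```python
# Why 128 GB gets tight on 160 cores: a rough capacity estimate.
# GB_PER_LTO_LINK is an assumed peak memory use per LTO link job,
# for illustration only - real numbers vary wildly per package.

TOTAL_RAM_GB = 128
CORES = 160
GB_PER_LTO_LINK = 4  # assumption, not a measurement

print(f"RAM per core: {TOTAL_RAM_GB / CORES:.1f} GB")  # 0.8 GB
max_parallel_links = TOTAL_RAM_GB // GB_PER_LTO_LINK
print(f"LTO link jobs that fit in RAM at once: {max_parallel_links} of {CORES} cores")
```

Under that assumption, only about 32 of the 160 cores could run LTO links concurrently before the machine starts swapping - so more RAM would directly buy more parallelism.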
Now, for the first time, we have aarch64 packages churned out at a much faster rate than their x86_64 and znver1 counterparts - so much so that I got curious about running an x86_64 environment in qemu on the Altra and setting up an x86_64 builder inside it.
It works - but qemu is not (yet) fast enough to beat the Threadripper in this setup. Making packages more cross-compile-friendly (something we are already working on - stay tuned) should help a lot.
Does our new favorite aarch64 machine mean we’ll move our focus on the aarch64 side to servers only? Not at all. While we do have some new aarch64 server projects like getting OpenMandriva into Oracle Cloud (running on the same Altra processors there - both in bare metal and virtual machines - is already working), we will not go the way of some other distributions and require UEFI and ACPI for our aarch64 port.
Devices like the PinePhone or the PineBook Pro, or the Raspberry Pi, or an upcoming project we aren’t talking about yet, are equally important, and a fast aarch64 builder is helping them a lot. Even though some of the target devices need a custom kernel, or a custom bootloader, or special firmware, or all of that, we support all devices from a common tree using a new tool, os-image-builder, that we will introduce in another article soon.