BellSoft recently added support for AArch64 to Alpaquita Linux. Alpaquita Linux is a minimalistic Linux distribution for cloud deployments. It is 100% compatible with Alpine Linux, but comes with several enhancements, including two libc implementations: optimized musl (musl-perf) and glibc.
In a previous article, we summarized the results of Linux performance studies on x86. This article provides a summary of performance studies on ARM, where Alpaquita Linux was compared to other popular Linux distributions for cloud using industry-standard benchmarks.
Methodology
We ran the tests in a virtual machine on an Ampere Altra (ARMv8, Neoverse-N1) CPU, a server-class processor designed to handle cloud-native workloads efficiently. Setup:
- 4 cores
- Full virtualization
- Type 1 hypervisor: KVM
- Type 2 hypervisor: QEMU
The VM was the only workload running on the machine, so the host was dedicated to performance measurement. We started QEMU with the following command:
qemu-system-aarch64 -cpu host -enable-kvm -hda alpaquita-stream-musl.qcow2 -smp 4 -m 8192 -device virtio-net-pci,netdev=net0 -netdev user,id=net0 -display none -daemonize -machine virt -drive if=pflash,unit=0,readonly=on,file=/usr/share/AAVMF/AAVMF_CODE.ms.fd -drive if=pflash,unit=1,format=raw,file=AAVMF_VARS.ms.fd
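When the same VM has to be started for many disk images, it can help to build the command programmatically. The sketch below reproduces the flags from the command above: `-cpu host -enable-kvm` passes the host CPU through KVM, `-smp` and `-m` set the core count and memory in MB, and the two `pflash` drives supply the UEFI firmware. The function name `build_qemu_args` and its parameters are hypothetical, and the flag order differs from the original command.

```python
import shlex

def build_qemu_args(image, cores=4, memory_mb=8192,
                    fw_code="/usr/share/AAVMF/AAVMF_CODE.ms.fd",
                    fw_vars="AAVMF_VARS.ms.fd"):
    """Return the QEMU argument list for one benchmark VM (hypothetical helper)."""
    return [
        "qemu-system-aarch64",
        "-cpu", "host", "-enable-kvm",      # host CPU model, accelerated by KVM
        "-machine", "virt",
        "-smp", str(cores),                 # number of virtual cores
        "-m", str(memory_mb),               # guest memory in MB
        "-hda", image,                      # disk image of the tested distribution
        "-device", "virtio-net-pci,netdev=net0",
        "-netdev", "user,id=net0",
        "-display", "none", "-daemonize",
        "-drive", f"if=pflash,unit=0,readonly=on,file={fw_code}",
        "-drive", f"if=pflash,unit=1,format=raw,file={fw_vars}",
    ]

args = build_qemu_args("alpaquita-stream-musl.qcow2")
print(shlex.join(args))
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a host where QEMU and the firmware files are installed.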
As Alpaquita Linux comes with two libc implementations, glibc and musl-perf, we tested musl- and glibc-based distributions to compare the performance of two libc implementations. Another musl-based distribution in the tests was Alpine Linux, which comes with stock musl. We also used several popular glibc-based Linux distributions for the cloud: Debian, RHEL, and Oracle Linux.
A full list of tested Linux distributions:
- Alpaquita Linux 23 musl (LTS)
- Alpaquita Linux 23 glibc (LTS)
- Alpaquita Linux Stream 23 glibc
- Alpaquita Linux Stream Slim 23 glibc
- Alpaquita Linux Stream 23 musl-perf
- Alpaquita Linux Stream Slim 23 musl-perf
- Alpaquita Linux with Liberica Native Image Kit, a GraalVM CE-based native-image compiler
- Alpine Linux 3.20
- Debian 11 Slim
- Debian 12
- Debian 12 Slim
- RHEL 9 UBI
- Oracle Enterprise Linux 9
Results
Docker image size
We measured two metrics: base Linux Docker image size and Docker image size with JDK 11.
The base Docker image size of musl-based Alpaquita Linux is 3.38 MB, which is almost 9 times smaller than that of Debian.
The JDK Docker image based on Alpaquita Linux musl is 77 MB, which is 2.7 times smaller than with Debian.
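To put the ratios in absolute terms, the arithmetic can be spelled out as below. The Debian sizes here are not measured values from the study: they are only what the stated ratios imply, and since the base image ratio is given as "almost 9", the first figure is an upper estimate.

```python
# Values stated in the text.
alpaquita_base_mb = 3.38
base_ratio = 9          # "almost 9 times", so treat the result as an upper estimate
alpaquita_jdk_mb = 77
jdk_ratio = 2.7

# Implied (not measured) Debian image sizes.
implied_debian_base_mb = alpaquita_base_mb * base_ratio
implied_debian_jdk_mb = alpaquita_jdk_mb * jdk_ratio

print(f"Implied Debian base image: up to about {implied_debian_base_mb:.0f} MB")
print(f"Implied Debian JDK image: about {implied_debian_jdk_mb:.0f} MB")
```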
Startup
The system startup time was measured at different stages:
- initramfs init: initialization of a root filesystem providing early userspace;
- mounted root: the root filesystem is mounted;
- login: the system allows logging in through the console;
- iface is up: the interface is up and running, which allows working with the console;
- network: network services are connected.
We also measured application startup using Petclinic, a reference Spring Boot application.
Startup results
The results of the Linux startup studies show that Alpaquita Linux reaches the network stage 2.5 times faster than Debian.
As for the application startup, Alpine Linux with JDK demonstrated the worst results. Application startup with Alpaquita Linux with JDK was similar to that with Debian. Petclinic startup with Liberica Native Image Kit took only 0.5 seconds, which is 8 times faster than with JDK.
Performance of glibc vs musl vs musl-perf
In some situations, the performance of musl libc can be inferior to that of glibc. To solve such issues, we developed optimized musl (musl-perf), which is 100% compatible with stock musl, but has improved performance. The tests in this section were aimed at evaluating the performance of three libc variants: stock musl, musl-perf, and glibc.
String operations
We ran basic String tests with 1 million iterations and String lengths of 36 bytes, 123 bytes, and 4,100 bytes, because the String size can affect performance significantly.
Results of String operations, 36 bytes
Results of String operations, 132 bytes
Results of String operations, 4,100 bytes
We can see that stock musl demonstrated the worst results in all tests, while musl-perf performed similarly to glibc.
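A test of the kind described in this section can be sketched as a timing loop over a few basic operations on strings of a fixed length. The sketch below is not the harness used in the study: the choice of operations (copy, compare, search) and the name `time_string_ops` are assumptions, and because it runs in Python it measures interpreter overhead along with the underlying library routines. It only illustrates the shape of the measurement.

```python
import time

SIZES = (36, 123, 4100)   # string lengths in bytes, as listed in the text

def time_string_ops(length, iterations=1_000_000):
    """Time copy, compare, and search on byte strings of the given length."""
    src = b"a" * (length - 1) + b"b"
    same = bytes(bytearray(src))          # equal content, separate object
    equal, pos = False, -1
    start = time.perf_counter()
    for _ in range(iterations):
        copy = bytes(bytearray(src))      # copy the string
        equal = copy == same              # compare two equal strings
        pos = src.find(b"b")              # search for the last byte
    elapsed = time.perf_counter() - start
    return elapsed, equal, pos

for size in SIZES:
    # The study used 1 million iterations; fewer are used here to keep the demo quick.
    elapsed, _, _ = time_string_ops(size, iterations=10_000)
    print(f"{size} bytes: {elapsed:.4f} s")
```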
SPEC CPU 2017
The SPEC CPU® 2017 benchmark package includes tests for measuring compute-intensive performance. We used two suites that measure run time: SPECspeed®2017 Integer and SPECspeed®2017 Floating Point. The results of both benchmarks are reported as a ratio: the run time on the reference platform divided by the run time on the system under test. When comparing systems, the system with the higher ratio does more computing per unit of time.
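The ratio defined above is simple to compute. In the sketch below, the timings are placeholder values, not results from the study, and the helper name `spec_ratio` is hypothetical. The overall suite score in SPEC CPU 2017 is the geometric mean of the per-benchmark ratios, which is shown in the last lines.

```python
import statistics

def spec_ratio(reference_seconds, measured_seconds):
    """Run time on the reference platform divided by run time on the tested system."""
    if measured_seconds <= 0:
        raise ValueError("measured run time must be positive")
    return reference_seconds / measured_seconds

# Placeholder timings for illustration only.
ratio = spec_ratio(reference_seconds=1000.0, measured_seconds=250.0)
print(ratio)   # a higher ratio means less time than the reference machine

# Suite score: geometric mean of per-benchmark ratios (placeholder ratios).
overall = statistics.geometric_mean([2.0, 8.0])
print(round(overall, 3))
```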
Results of SPECspeed®2017 Integer
Results of SPECspeed®2017 Floating Point
The results of SPEC CPU 2017 tests show that stock musl had inferior performance in most cases compared to glibc and musl-perf. The musl-perf libc demonstrated similar or superior performance to that of glibc in most cases.
To sum up, the benchmarking results indicate that enterprises running their workloads on a glibc-based Linux distribution can migrate to Alpaquita Linux with musl-perf without sacrificing performance.
malloc performance
Linux memory allocators (mallocs) can influence the performance of applications. Alpaquita Linux for AArch64 comes with two additional malloc implementations:
- mimalloc is a small allocator used in large scalable services with low latency
- jemalloc enables the developers to solve fragmentation issues and supports scalable concurrency
We tested the performance of Alpaquita Linux musl with mimalloc as compared to the default malloc in other distributions.
To run the comparison, we used mimalloc-bench, a benchmark suite for memory allocators. The tests measure the number of operations performed per second.
The malloc benchmarks we used for the study are:
- espresso: a programmable logic array analyzer, used in the context of cache-aware memory allocation
- alloc-test: simulates intensive allocation workloads with a Pareto size distribution
- cache-scratch: introduced with the Hoard allocator to test for passive-false sharing of cache lines
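To give an idea of what an allocation workload with a Pareto size distribution looks like, the sketch below draws request sizes from such a distribution and churns through them, then reports operations per second. It is not the alloc-test benchmark itself: the shape parameter, the size limits, and the function names are assumptions, and Python's own object allocator sits between this code and the system malloc, so the numbers say nothing about mimalloc or any libc.

```python
import random
import time

def pareto_sizes(count, alpha=1.5, min_size=16, max_size=1 << 20, seed=0):
    """Draw allocation sizes (bytes) from a Pareto distribution, capped at max_size."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(count):
        size = int(min_size * rng.paretovariate(alpha))   # paretovariate() >= 1
        sizes.append(min(size, max_size))
    return sizes

def churn(sizes, live_slots=256):
    """Allocate buffers, overwriting old slots so earlier buffers are released."""
    live = [None] * live_slots
    for i, size in enumerate(sizes):
        live[i % live_slots] = bytearray(size)
    return sum(len(buf) for buf in live if buf is not None)

sizes = pareto_sizes(10_000)
start = time.perf_counter()
churn(sizes)
elapsed = time.perf_counter() - start
print(f"{len(sizes) / elapsed:.0f} allocations per second")
```

Most requests are small and a few are very large, which is the property that makes this distribution a demanding test for an allocator.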
malloc test results
The results of the studies show that Alpaquita Linux musl with mimalloc outperforms other distributions with default mallocs in all tests except for one, cache-scratch-1, where all distributions showed similar results.
Java benchmarking with DaCapo
DaCapo is a set of real-world Java applications used for measuring system and CPU performance. The results are measured in milliseconds required to complete a workload.
DaCapo results
In all tests, both musl- and glibc-based Alpaquita Linux distributions demonstrated equal or superior performance to that of other distributions. In several tests, including PMD Source Code Analyzer and Avrora AVR Simulation Framework, musl-based Alpaquita demonstrated the best results, which indicates that musl libc may be more beneficial than glibc for some Java workloads.
Memory bandwidth
We used the Stream benchmark provided by Phoronix via the Phoronix Test Suite to measure memory bandwidth, i.e., the rate at which data can be read from or written to memory. Stream is a popular RAM benchmark measuring the sustainable main memory bandwidth in MB/s and the computation rate for simple vector kernels. It uses four kernels for different memory operations:
- Copy: transfer rate measurement without arithmetic operations
- Scale: adding a simple arithmetic operation
- Triad: chained/overlapped/fused multiply/add operations
- Add or Sum: adding a third operand
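The four kernels are short enough to write out. The sketch below runs them in the order used by the reference STREAM code (Copy, Scale, Add, Triad) and converts each timing to MB/s. It is an illustration only: the array length and scalar are arbitrary, the 8 bytes per element follow the STREAM convention for double-precision values rather than Python's actual memory layout, and a Python loop measures the interpreter far more than the memory subsystem.

```python
import time

def run_stream(n=100_000, scalar=3.0):
    """Run the four STREAM kernels once; return the arrays and per-kernel timings."""
    a, b, c = [1.0] * n, [2.0] * n, [0.0] * n
    timings = {}

    start = time.perf_counter()
    for i in range(n):
        c[i] = a[i]                    # Copy: transfer without arithmetic
    timings["Copy"] = time.perf_counter() - start

    start = time.perf_counter()
    for i in range(n):
        b[i] = scalar * c[i]           # Scale: one multiplication
    timings["Scale"] = time.perf_counter() - start

    start = time.perf_counter()
    for i in range(n):
        c[i] = a[i] + b[i]             # Add: a third operand
    timings["Add"] = time.perf_counter() - start

    start = time.perf_counter()
    for i in range(n):
        a[i] = b[i] + scalar * c[i]    # Triad: multiply and add combined
    timings["Triad"] = time.perf_counter() - start
    return a, b, c, timings

# Arrays read or written per element by each kernel.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

def bandwidth_mb_s(kernel, n, seconds, bytes_per_element=8):
    """Bytes moved by the kernel divided by elapsed time, in MB/s."""
    return ARRAYS_TOUCHED[kernel] * bytes_per_element * n / seconds / 1e6

a, b, c, timings = run_stream()
for kernel, seconds in timings.items():
    print(kernel, round(bandwidth_mb_s(kernel, 100_000, seconds), 1), "MB/s")
```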
Stream results
All distributions demonstrated similar results.
Throughput
To test the throughput of Linux distributions, we used the Nginx benchmark provided by Phoronix. This benchmark runs on a single host and measures the number of HTTP requests handled per second with a configurable number of concurrent clients.
Nginx results
Both glibc- and musl-based Alpaquita configurations demonstrated the best results across all tests. Remarkably, glibc-based Alpaquita outperformed other glibc-based distributions, which means that companies that do not want to migrate from glibc to musl can benefit from Alpaquita Linux with glibc.
AsmFish/TSCP
AsmFish and TSCP provided by Phoronix are chess benchmarks for analyzing CPU performance. TSCP (Tom Kerrigan’s Simple Chess Program) is a small chess engine that calculates how many nodes per second are searched. AsmFish is an advanced chess engine benchmark written in Assembly. The results of both benchmarks are calculated as nodes per second.
AsmFish/TSCP results
The results of both tests show similar performance across all distributions, with Alpine Linux lagging slightly behind in the AsmFish test, which suggests that optimized musl outperforms stock musl for these workloads.
Petclinic RAM footprint & latency
We measured the RAM footprint and latency of a Java service running on various Linux distributions under the following conditions:
- Application: Spring Petclinic
- Low-end hardware
- Low load of 133 users and 54 requests per second (RPS)
In addition to the JDK and Linux combinations, this series of tests also included Alpaquita Linux with Liberica Native Image Kit.
RAM consumption results
The results of the footprint studies show that Oracle Linux 9 demonstrated the highest RAM consumption. Alpaquita Linux with Liberica Native Image Kit demonstrated the lowest RAM consumption, which is 65% lower than with Oracle Linux. The second and third best results belong to musl- and glibc-based Alpaquita Linux with JDK, respectively.
Latency results
As for the latency studies, all Linux distributions combined with JDK showed similar results. Liberica Native Image Kit demonstrated 4.7 times lower latency at the 99th percentile compared to JDK.
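A 99th-percentile latency is the value that 99% of requests stay at or below. The sketch below computes it with the nearest-rank method, which is one common definition; load-testing tools may interpolate instead, and the study does not say which method was used. The sample latencies are placeholder values, not measurements.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Placeholder latencies in milliseconds: 1, 2, ..., 100.
latencies_ms = list(range(1, 101))
p99 = percentile(latencies_ms, 99)
p50 = percentile(latencies_ms, 50)
print(p50, p99)
```

Tail percentiles such as this one are more sensitive to occasional slow requests than an average, which is why they are commonly used for service latency comparisons.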
Conclusion
The benchmarking results show that:
- Alpaquita Linux musl demonstrates the best results in terms of Linux startup;
- In cases where glibc demonstrated superior results to stock musl, Alpaquita Linux with optimized musl has similar or superior results compared to glibc-based distros;
- Both musl-based and glibc-based Alpaquita distros show better results than other distributions for Java workloads in terms of RAM consumption;
- Liberica Native Image Kit with Alpaquita shows the best results in terms of Java application startup time, RAM footprint, and latency.
To sum up, Alpaquita Linux provides the best options for enterprises running their Java workloads in the cloud:
- Optimized musl with superior performance as compared to stock musl in Alpine;
- A glibc-based version smaller than other popular glibc-based distributions for enterprises not willing to migrate to another libc;
- Several additional malloc implementations for various workloads;
- Commercial support with LTS releases;
- Tools for Java development.
Try Alpaquita Linux out with your application and see the difference!