RED Brick Image 1.11: Big performance boost

The RED Brick image version 1.11 is now available:

In the RED Brick image release 1.10 we made many significant changes compared to 1.9. Among other things, we updated the Linux kernel from 3.4 to 4.13. This was a lot of work for us: we had to port drivers, adapt our SPI communication code, and so on. In return we gained some necessary security updates (for example the KRACK WPA fix) and a lot of driver support for modern hardware (for example Bluetooth 5.0). Because of the security issues with the older Linux kernel in particular, we saw no alternative to the update.

Unfortunately we had to accept a regression in performance. This regression came mainly from changes in the Linux kernel that make multi-core processors more efficient (for example busy waiting for IO in interrupts). While this increases performance in multi-core systems, we saw a decrease in performance on the RED Brick (a single-core system), especially in IO bound loads.

Since we are still selling lots of RED Bricks and the feedback is always very positive, we decided to go down the rabbit hole of increasing the performance again for the next image version.

Among other things we

To make these changes and test them properly we had to invest about two months of work. It took us a lot of trial and error to find the actual performance bottlenecks.

We tested with three different benchmarks:

Benchmark 1: CPU Bound

For the CPU bound performance tests we used sysbench with the parameters "--test=cpu --cpu-max-prime=4096 run".

The governor (if applicable) was set to performance and the connection to the RED Brick was done with SSH through the Ethernet Extension.

The decrease in average execution time from version 1.10 to 1.11 is about 5.4 ms, which translates to a performance increase of 25%. The performance is now also slightly better than in the 1.9 image with the 3.4 kernel.
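To make the relation between an execution-time decrease and a percentage performance increase concrete, here is a minimal sketch. It assumes that "performance" means the inverse of the execution time, which is our reading and not something stated for the benchmark. The two timings are illustrative values chosen to be consistent with the figures quoted above; they are not measurements, and `speedup_percent` is a hypothetical helper name.

```python
def speedup_percent(t_old_ms, t_new_ms):
    """Relative performance gain in percent, taking performance as 1 / time."""
    if t_old_ms <= 0 or t_new_ms <= 0:
        raise ValueError("execution times must be positive")
    return (t_old_ms / t_new_ms - 1.0) * 100.0


# Illustrative timings only (not measured), chosen so they differ by 5.4 ms.
t_old_ms = 27.0
t_new_ms = 21.6

print("decrease: {:.1f} ms".format(t_old_ms - t_new_ms))
print("performance increase: {:.0f}%".format(speedup_percent(t_old_ms, t_new_ms)))
```

Under this definition the same absolute decrease gives a different percentage depending on the baseline, so the two numbers only make sense when read together.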

Benchmark 2: IO Bound

For the IO bound performance tests we used iperf3 with the parameters "-c ishraq-tinkerforge -N -t 120".

The governor (if applicable) was set to performance and the connection to the RED Brick was done with SSH through the Ethernet Extension.

We tested with and without a stack. In the tests with a stack we used a Master Brick with a connected Thermal Imaging Bricklet. We used the Thermal Imaging Bricklet since it can easily generate enough data to saturate the stack communication.

The graph speaks for itself here: we managed to achieve a very significant performance increase of 220% compared to 1.10 and 23% compared to 1.9!

Benchmark 3: Stack Communication

For the pure stack communication test we used a Python script on the RED Brick that gathers thermal data via getter/callback, does no computation with it (throws the data away) and calculates the frames per second.
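A script of this kind could look roughly like the following sketch. It is not the script we used for the measurements. It only shows the callback variant, assumes the temperature image callback of the Thermal Imaging Bricklet Python bindings, and uses a placeholder UID (`XYZ`). The names `FrameCounter` and `measure` are made up for this example.

```python
import time


class FrameCounter:
    """Counts frames delivered to a callback and discards their payload."""

    def __init__(self):
        self.frames = 0

    def on_frame(self, image):
        # Throw the image data away; only the arrival of a frame is counted.
        self.frames += 1

    def fps(self, elapsed_seconds):
        if elapsed_seconds <= 0:
            raise ValueError("elapsed_seconds must be positive")
        return self.frames / elapsed_seconds


def measure(duration=10.0, host="localhost", port=4223, uid="XYZ"):
    """Count temperature image callbacks for `duration` seconds, return FPS."""
    # Imported here so FrameCounter can be used without the bindings installed.
    from tinkerforge.ip_connection import IPConnection
    from tinkerforge.bricklet_thermal_imaging import BrickletThermalImaging

    ipcon = IPConnection()
    ti = BrickletThermalImaging(uid, ipcon)  # uid is a placeholder
    ipcon.connect(host, port)

    counter = FrameCounter()
    ti.register_callback(ti.CALLBACK_TEMPERATURE_IMAGE, counter.on_frame)
    ti.set_image_transfer_config(ti.IMAGE_TRANSFER_CALLBACK_TEMPERATURE_IMAGE)

    start = time.monotonic()
    time.sleep(duration)
    elapsed = time.monotonic() - start

    ipcon.disconnect()
    return counter.fps(elapsed)
```

On a RED Brick with the Bricklet attached, calling `measure()` would return the frames per second seen over ten seconds. The callback deliberately does nothing but count, so the result reflects communication throughput rather than processing cost.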

As you can see, we were able to increase the performance compared to 1.10 in this test, but still have a small regression compared to 1.9. However, during this test the CPU on 1.9 runs at its limit, while in 1.11 there is still CPU time left for computations. You can see this effect in the IO-bound test, which adds a bit more CPU load, since it has to transfer the data to the PC.

So overall, after a lot of trial and error, we decided that this trade-off (slightly lower stack communication throughput but significantly more CPU time available for computation) is the way to go. We are convinced that in real-world applications the performance of 1.11 is significantly better than that of 1.10 and still moderately better than that of 1.9.