Oh well, unfortunately Mark has moved on to new adventures, so I’ll try to sum up what he has been up to as part of the balenaOS team.
Over the last few months Mark has been working on improving OS time synchronization, fixing issues that we identified through our support loop. This included implementing a fake hardware clock so that time is maintained across reboots on devices without a real RTC, implementing an early HTTPS time sync service so that system services can start with a good enough time even when NTP sources have not yet synced, ordering time-dependent services to start after the time has been synced, and fixing several smaller issues with our use of Chrony, the NTP client used in balenaOS.
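To make the fake hardware clock and early HTTPS time sync ideas a bit more concrete, here is a minimal Python sketch. It is not the balenaOS implementation: the function names and the file-based storage are my own assumptions for illustration, and the HTTPS part assumes the time is taken from a standard HTTP `Date` response header.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def save_time(path: str, now: datetime) -> None:
    """Persist the current time as a Unix timestamp so it survives a reboot."""
    with open(path, "w") as f:
        f.write(str(int(now.timestamp())))


def load_saved_time(path: str) -> Optional[datetime]:
    """Return the last persisted time, or None if nothing usable was saved."""
    try:
        with open(path) as f:
            return datetime.fromtimestamp(int(f.read().strip()), tz=timezone.utc)
    except (OSError, ValueError):
        return None


def choose_boot_time(system_now: datetime, saved: Optional[datetime]) -> datetime:
    """Fake hardware clock rule: never let the clock start earlier than the saved time."""
    if saved is not None and saved > system_now:
        return saved
    return system_now


def time_from_http_date(header_value: str) -> datetime:
    """Parse an HTTP Date header value into a timezone-aware UTC datetime."""
    return parsedate_to_datetime(header_value).astimezone(timezone.utc)
```

On a device without an RTC, the system clock typically starts near the epoch, so `choose_boot_time` would pick the saved value. A coarse time parsed from an HTTPS response can then move the clock close enough for services to start while NTP is still syncing.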
Another important project that Mark has been maintaining is our brownfield migrator tool, which allows devices in the field running a variety of operating systems to be migrated remotely to balenaOS.
And finally, Mark has been adding support for the Jetson TX2 device type to our automated test framework. OS testing is an important focus of the OS team right now, as there is a lot of overhead in the manual testing and release process that we use at the moment. The near-term roadmap includes moving to continuous automated releases of OS updates for all device types and completely automated testing.
Thank you Mark for your contributions to making balenaOS a better operating system.
And hijacking the thread, I might as well fill you in on the stuff I am working on.
A big part of my time has gone into a review of the OS architecture with the aim of making it as simple as possible to add new device types. The new balenaOS architecture is multi-container based, and defines OS blocks that can be developed and maintained independently. This new design will allow us to support new device types by just providing the BSP-specific container block while reusing the rest of the OS.
Moving along that roadmap, I have been working with @pipex from the supervisor team to support overlay container extensions, the first type of OS block to be managed by the supervisor. For this, a new v3 target state is being introduced that moves the product closer to the multi-app utopia, where the system runs several apps (OS, supervisor and user application) and manages them all in the same way, not only in the supervisor but also in the cloud API.
As part of the work above, I have merged the development and production OS variants into a single image that can now be configured at runtime into development or production mode, and I am currently working on revamping the OS release versioning so it uses the same mechanism as the supervisor and user apps. Release versioning requires the introduction of an OS contract that will also allow OS blocks to be identified as compatible.
Proper OS release versioning is also required in order to deploy draft OS releases to the cloud, which will allow us to move to a continuous release process in which every PR is deployed as a draft release to the cloud, and automatically marked as final by the automated test framework once validation passes. This will avoid the current delay between a feature being available in meta-balena and being released for a specific device type.
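The draft-to-final flow described above can be pictured with a small Python sketch. This is only an illustration of the idea: the `Release` type, the function names and the version string are hypothetical and do not correspond to the actual cloud API or test framework.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Release:
    version: str
    is_final: bool = False  # every release starts life as a draft


def deploy_draft(version: str) -> Release:
    """Hypothetical step run for every PR: publish a draft release."""
    return Release(version=version)


def finalize_if_valid(release: Release, run_tests: Callable[[Release], bool]) -> Release:
    """Mark the draft as final only when the automated validation passes."""
    if run_tests(release):
        release.is_final = True
    return release


# Usage: a draft stays a draft until validation succeeds.
draft = deploy_draft("0.0.0-example")  # hypothetical version string
finalize_if_valid(draft, lambda r: True)
```

The point of the design is that promotion to final is a side effect of the test run rather than a manual step, which is what removes the delay between merging and releasing.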
For the last few weeks I have also been working alongside @mtoman to finalize the secure boot and full disk encryption work for x86_64 devices that Michal has been working on for the last few months. This is in its final stretch and we hope to have something released very soon.
There are always other minor tasks that need attention, but the above sums up the core of my current work and focus.
And next, I’d like to call out to @lmbarros to explain some of the great work he has been doing on improving the reliability of engine pulls and the healthcheck mechanisms!