Boot partition already mounted RW

Hello,

When setting a static IP address on some of our devices, we noticed that the boot partition was already mounted RW in the Host OS. Previously this was not the case (as reflected in our internal documentation).

We are wondering why this has changed?
We were happy with the boot partition being RO, as this is something you see commonly on embedded devices.
We are also wondering if we should worry about the possibility of the file system getting corrupted, for example because of power failures, as it’s a vfat file system.

We are using OS version: balenaOS 2.73.1+rev2 on Beaglebone Black hardware.

mount output:

# mount | grep boot
/dev/mmcblk1p1 on /mnt/boot type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro)
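
For anyone who wants to check this from a script rather than by eye, here is a minimal sketch that parses mount-table text in the `/proc/mounts` format. The function names are made up for illustration, and `/mnt/boot` is simply the mount point from the output above.

```python
def mount_options(mount_point, mounts_text):
    """Return the option list for mount_point, or None if it is not listed."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            found = fields[3].split(",")  # a later entry overrides an earlier one
    return found


def is_writable(mount_point, mounts_text):
    """True if mount_point is listed and its options include 'rw'."""
    options = mount_options(mount_point, mounts_text)
    return options is not None and "rw" in options
```

On a device this could be called as `is_writable("/mnt/boot", open("/proc/mounts").read())`.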

Hey @mlout, I don’t have an answer for you just yet, but I can confirm that this is the case not only on BBB hardware but on Raspberry Pis too. I’ve tested a variety of devices running balenaOS 2.46 through the latest release, and all of them have the boot partition mounted rw. I’m certain that, as you say, this was not the case before, and our docs even support this: What is balenaOS? - Balena Documentation.

I’m going to ask the OS team to clarify this for us and get back to you.

Okay I’ve cleared up the confusion with the OS guys.

The boot partition is not meant to be mounted read-only. It has to be rw because that is where changes to the device configuration are made: for example, updating the “API poll interval” from the dashboard updates the /mnt/boot/config.json file. What is read-only at runtime is the root filesystem itself (/), except for the overlays described in the documentation I linked before. This does mean that the diagram shown there is incorrect. It states that the resin-boot and resin-rootA/B partitions are mounted as read-only, while in reality they are rw. I’ll see that we update that to avoid further confusion.

About the possibility of the fs getting corrupted I’ll quote my colleague:

About the corruption on power loss, we always sync the filesystem after a write to minimise the chances of corruption, but we have seen cases where we missed a sync, such as “config.json can be affected by power cuts” (Issue #1983, balena-os/meta-balena on GitHub). For some time we explored using an IO library that would make atomic changes to the FAT partition, but that work has been put on hold. We are not seeing this as a problem in our fleets and the current solution seems to work fine, so I am reluctant to add more complexity. But we would add atomic writes if there was a need.
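
To make the "sync after a write" idea concrete, here is a minimal sketch of a write-then-sync helper. It is not the balenaOS or supervisor implementation. The name `write_and_sync` is hypothetical, and the rename step is only known to be atomic on POSIX filesystems such as ext4. On a vfat partition it should be treated as best effort.

```python
import os
import tempfile


def write_and_sync(path, data):
    """Write data to a temp file, flush it to disk, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the file contents to the storage device
        os.replace(tmp_path, path)  # swap the new file into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory so the rename itself reaches the disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The point of the temp file is that a power cut during the write leaves the old file intact. Whether the rename is equally safe depends on the filesystem, which is why atomic writes to FAT would need the kind of dedicated library mentioned above.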

Other than that, since user applications run as containers and bind mounts are not allowed, there is little risk of an application accessing the boot or sysroot partitions. If you see another scenario where this might be a concern, please share it so we can take it into consideration.


Thanks for the explanation. However, on the topic of preventing possible corruption: isn’t it more logical to mount the root partition RW in the supervisor only while you are changing it, and unmount it again when you are done? That is at least how our configuration scripts have worked until now.
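
As a rough illustration of that pattern, here is a sketch of a context manager that remounts a partition read-write only for the duration of a change. It assumes a remount (`mount -o remount,rw` / `remount,ro`) rather than a full unmount, and that it runs as root on the host. The names `writable` and `_run` are hypothetical, and this is not our actual configuration script.

```python
import contextlib
import subprocess


def _run(cmd):
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


@contextlib.contextmanager
def writable(mount_point, run=_run):
    """Remount mount_point read-write for the duration of the with-block."""
    run(["mount", "-o", "remount,rw", mount_point])
    try:
        yield
    finally:
        run(["sync"])  # flush pending writes before going read-only again
        run(["mount", "-o", "remount,ro", mount_point])
```

Usage would look like `with writable("/mnt/boot"): ...`, with the file change made inside the block. The `run` parameter exists only so the command sequence can be exercised without touching real mounts.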

Regarding corruption, mounting/unmounting each time you need to make a change would still leave the system vulnerable to corruption during the window in which it’s writable, so it shouldn’t be an improvement over mounting read/write from the get-go. The extra overhead and complexity don’t provide much benefit in our experience. What benefit do you see with this approach?

As you say, it is about the timeframe in which it is writable. Shortening that timeframe reduces the chance of corruption to only the moments when you are writing to the disk, which in practice should be close to never.

If you keep the filesystem rw, a rogue process might write something random to the boot partition at a critical moment. But the boot partition is essential for a device under stress: after a hard reset it must be healthy and readable by U-Boot, otherwise the device might not come back at all.

By mounting the boot partition only on demand, you decide on the timeframe in which it is writable. You can make a well-considered decision about when to do this. For me, it would certainly not be at a moment when the device is behaving strangely and you are close to unplugging it.

Hi Fokko, thanks for your feedback. I agree that reducing the time it’s writable would reduce the access time window, but it would also hide potential problems and make them more random and difficult to reproduce.
The boot partition is not accessible from application containers, only from the host OS, and if the host OS is making rogue accesses to the boot partition that is something we want to be aware of and fix.
We sometimes refer to this as fleet intelligence: balenaOS is used unmodified by hundreds of thousands of devices across all our fleets, which is a lot of field testing. We would rather react quickly to such a problem and fix it for everyone than be unaware of it.
