Boot once but not twice (but sometimes)

I'm working on running Balena on an STM32MP. I got it working from an SD card and almost working from eMMC: it boots once but not twice. After a few watchdog-triggered reboots it works. I guess the watchdog is set to 10 sec. What is taking so long after the first boot?

[ 2.196116] can: broadcast manager protocol (rev 20170425 t)
[ 2.200881] mmcblk1rpmb: mmc1:0001 004GA0 partition 3 512 KiB, chardev (244:0)
[ 2.206552] Key type dns_resolver registered
[ 2.216492] ThumbEE CPU extension supported.
[ 2.227816] Registering SWP/SWPB emulation handler
[ 2.238225] registered taskstats version 1
[ 2.240973] Loading compiled-in X.509 certificates
[ 2.243187] mmcblk1: p1 p2 p3 p4 p5 p6
[ 2.265734] stm32_rtc 5c004000.rtc: setting system clock to 2000-01-01 02:02:37 UTC (946692157)
[ 2.273879] usb33: disabling
[ 2.275844] ALSA device list:
[ 2.278787] No soundcards found.
[ 2.291609] usb 1-1: New USB device found, idVendor=0bda, idProduct=b82c, bcdDevice= 2.10
[ 2.297927] Freeing unused kernel memory: 9216K
[ 2.298349] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 2.310337] usb 1-1: Product: 802.11ac NIC
[ 2.314002] Checked W+X mappings: passed, no W+X pages found
[ 2.314065] usb 1-1: Manufacturer: Realtek
[ 2.319743] Run /init as init process
[ 2.323789] usb 1-1: SerialNumber: 123456
starting version 239
[ 3.270240] random: fast init done
[ 4.194556] zram0: detected capacity change from 0 to 113094656 <-- It most often reboots after this message.
[ 13.436701] random: crng init done <-- But this time it worked, as the next step came in under 10 sec
[ 14.162740] EXT4-fs (mmcblk1p3): mounted filesystem with ordered data mode. Opts: (null)
[ 14.225703] EXT4-fs (mmcblk1p5): mounted filesystem with ordered data mode. Opts: (null)
[ 14.899866] EXT4-fs (mmcblk1p3): re-mounted. Opts: (null)
[ 15.312358] systemd[1]: System time before build time, advancing clock.

Welcome to balenaOS 2.58.0!

[ OK ] Created slice system-getty.slice.
[ OK ] Listening on udev Control Socket.
[ OK ] Created slice system-resin\x2dinfo.slice.
[ OK ] Reached target Remote File Systems.
[ OK ] Listening on udev Kernel Socket.

How do I troubleshoot this?

I would suggest increasing the kernel log level. That may reveal the root of the problem.

The console log level is already 7.

cat /proc/sys/kernel/printk
7 4 1 7
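
For reference, the four numbers are the current console log level, the default level for messages without an explicit level, the minimum console log level, and the boot-time default console log level. The console only prints messages whose level is numerically lower than the first value, so with 7 the KERN_DEBUG messages (level 7) are still hidden; they would need a value of 8, for example via `loglevel=8` or `ignore_loglevel` on the kernel command line. A minimal sketch that labels the fields (the function name `describe_printk` is made up for illustration):

```shell
#!/bin/sh
# Label the four fields of /proc/sys/kernel/printk.
# Takes the values as one string so it can be tried without a target board.
describe_printk() {
    # Intentionally unquoted: split the string into four positional fields.
    set -- $1
    echo "console_loglevel=$1"
    echo "default_message_loglevel=$2"
    echo "minimum_console_loglevel=$3"
    echo "default_console_loglevel=$4"
}

describe_printk "7 4 1 7"
```

On the device itself the same function could be fed with `describe_printk "$(cat /proc/sys/kernel/printk)"`.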

Hi, let me set some common ground, as this is a platform we don't support. From your message above, I understand the bootloader is initializing a hardware watchdog with a 10-second timeout, right? What is then supposed to keep the watchdog alive, systemd?

During the boot process there is an initramfs that performs several tasks, such as checking the disks and mounting and pivot-rooting into the final rootfs, before systemd is launched. Ten seconds is not going to be enough to cover, for example, a long filesystem check. If enabling the watchdog in the bootloader is a requirement, I would recommend extending the timeout to cover the worst case, such as having to automatically repair the data partition.
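
To see how much of the watchdog budget the early boot actually consumes, one option is to capture the serial console output and look for large jumps between kernel timestamps. The following is only a sketch: the function name `find_gaps` and the 5-second threshold are arbitrary choices for illustration, and it assumes the log lines carry the usual `[ seconds.microseconds]` prefix.

```shell
#!/bin/sh
# Print every gap between consecutive kernel timestamps that exceeds
# a threshold in seconds. Lines without a "[ secs.usecs]" prefix are skipped.
find_gaps() {
    threshold=$1
    awk -v limit="$threshold" '
        match($0, /^\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the brackets and padding to get the numeric timestamp.
            ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (seen && ts - prev > limit)
                printf "%.3f s gap before: %s\n", ts - prev, $0
            prev = ts
            seen = 1
        }
    '
}

# Demo with four lines taken from the log posted above.
find_gaps 5 <<'EOF'
[ 3.270240] random: fast init done
[ 4.194556] zram0: detected capacity change from 0 to 113094656
[ 13.436701] random: crng init done
[ 14.162740] EXT4-fs (mmcblk1p3): mounted filesystem with ordered data mode. Opts: (null)
EOF
# prints: 9.242 s gap before: [ 13.436701] random: crng init done
```

In the posted log this points at a silent stretch of roughly nine seconds between the zram message and the first EXT4 mount, which is the window in which the initramfs is running.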

Anyway, if you want to debug the boot process, I would suggest adding a shell-debug entry to the kernel command line. More details about the initramfs debug options can be found in the debug script under meta/recipes-core/initrdscripts/initramfs-framework in the Poky repository.

I checked my watchdog by running:

fdt addr ${fdtcontroladdr}
fdt list /soc/watchdog

… it was set to 320 sec but it restarts after 32 sec. So where do I find that setting?
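
The device tree only shows what was requested. Once Linux is up, the timeout the kernel believes is in effect can usually be read from sysfs, provided the kernel was built with `CONFIG_WATCHDOG_SYSFS` (whether that is the case for this image is an assumption). A minimal sketch, with a made-up function name `show_watchdog_timeouts`:

```shell
#!/bin/sh
# Print the timeout each registered watchdog reports to Linux.
# The base directory is a parameter only so the function can be tried
# against a scratch directory; on a target it defaults to sysfs.
show_watchdog_timeouts() {
    base=${1:-/sys/class/watchdog}
    for dev in "$base"/watchdog*; do
        # Skip when the glob matched nothing or the attribute is absent.
        [ -r "$dev/timeout" ] || continue
        echo "$(basename "$dev"): timeout=$(cat "$dev/timeout")s"
    done
}

show_watchdog_timeouts
```

If this prints nothing, the sysfs attributes are not available and the value has to be taken from the driver's boot messages or the device tree instead.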

It seems there is a bug in the watchdog driver that restarts the device.

A workaround can be found here: https://community.st.com/s/question/0D50X0000BL8wcRSQR/linux-watchdog-driver-is-broken