Unit failures with: balena-hostname-conf.service, balena-hostname.service, balena-engine.socket

Hi,

I’m wondering if anyone might be able to help me debug a rare failure I’ve got with Balena OS and our ARM board (TM3). It happened approximately once in 4048 reboots.

Very occasionally we noticed the hostname wasn’t being set correctly by the Balena services; it was left as the kernel’s compiled-in hostname, H6.

After looking into this a bit more, I noticed some of the Balena services had failed:

root@H6:~# systemctl list-units --failed
  UNIT                         LOAD   ACTIVE SUB    DESCRIPTION                         
● balena-hostname-conf.service loaded failed failed balena-hostname.json watcher service
● balena-hostname.service      loaded failed failed Balena Hostname Configuration       
● balena-engine.socket         loaded failed failed Docker Socket for the API

Running journalctl -u balena-hostname.service, I found:

Oct 30 08:18:56 H6 balena-hostname[1387369]: [balena-hostname][INFO] Setting hostname.
Oct 30 08:18:56 H6 balena-hostname[1387369]: [balena-hostname][INFO] Generating default based on short UUID.
Oct 30 08:18:56 H6 balena-hostname[1387369]: [balena-hostname][ERROR] UUID missing from config.json.
Oct 30 08:18:56 H6 systemd[1]: balena-hostname.service: Main process exited, code=exited, status=1/FAILURE

However, ‘/mnt/boot/config.json’ does contain a UUID.

The particular image being used was built with ./balena-yocto-scripts/build/barys -m tm3 -d

Any pointers would be much appreciated!

I’ve spent a while looking more into this:

I think the issue is related to /usr/bin/balena-hostname.

if [ -z "$CONFIG_HOSTNAME" ]; then
    # take just the first 7 characters
    info "Generating default based on short UUID."
    CONFIG_HOSTNAME=$(echo "$UUID" | sed -e 's/\(.......\).*/\1/')
fi

if [ -z "$CONFIG_HOSTNAME" ]; then
    fail "UUID missing from config.json."
fi

The ‘fail “UUID missing from config.json.”’ branch is triggered because UUID isn’t being set, so CONFIG_HOSTNAME is still empty after the first block.
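
To illustrate why an unset UUID ends in that error, here is a minimal sketch that reuses the sed expression from the snippet above. The short_uuid helper and the sample UUID value are hypothetical and only there for demonstration; they are not part of balena-hostname.

```shell
#!/bin/sh
# Hypothetical helper wrapping the sed expression from balena-hostname:
# keeps the first 7 characters of its argument.
short_uuid() {
    echo "$1" | sed -e 's/\(.......\).*/\1/'
}

# A populated UUID (made-up value) is truncated to 7 characters.
short_uuid "0123456789abcdef"

# An empty UUID produces an empty string, so the second -z check fires.
UUID=""
CONFIG_HOSTNAME=$(short_uuid "$UUID")
if [ -z "$CONFIG_HOSTNAME" ]; then
    echo "UUID missing from config.json."
fi
```

So the error message says nothing about config.json itself; it only means UUID was empty in the script's environment when the hostname was generated.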

In balena-config-vars, I looked at what is being executed:

if [ "${USE_CACHE}" -eq "1" ] && [ -n "${BALENA_CONFIG_VARS_CACHE}" ] && [ -f "${BALENA_CONFIG_VARS_CACHE}" ]; then
    . "${BALENA_CONFIG_VARS_CACHE}"  # <--- ends up here
else
    [ -n "${BALENA_CONFIG_VARS_CACHE}" ] && [ -f "${BALENA_CONFIG_VARS_CACHE}" ] && rm -f "${BALENA_CONFIG_VARS_CACHE}"

So it tries to source /var/cache/balena-config-vars.

However, that file appears to be empty, which I think may be the root of these issues.

But /mnt/boot/config.json does contain a UUID.
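
One detail that may matter here: the -f test in the cache check is true for any regular file, including a zero-length one, so an empty cache would still be sourced and leave UUID unset. The sketch below shows the difference between -f and -s using a temporary file as a stand-in for /var/cache/balena-config-vars. The cache_state function is a hypothetical helper, not something balena-config-vars provides, and the UUID value is made up.

```shell
#!/bin/sh
# Hypothetical helper: classify a cache file before sourcing it.
# Prints "usable" for a non-empty regular file, "empty" for a
# zero-length regular file, and "missing" otherwise.
cache_state() {
    if [ -f "$1" ] && [ -s "$1" ]; then
        echo "usable"
    elif [ -f "$1" ]; then
        echo "empty"
    else
        echo "missing"
    fi
}

# Temporary file standing in for the real cache path.
tmp=$(mktemp)
cache_state "$tmp"                       # zero-length: passes -f, fails -s
echo 'UUID=0123456789abcdef' > "$tmp"    # made-up contents
cache_state "$tmp"
rm -f "$tmp"
cache_state "$tmp"
```

If the real cache file reports as empty on an affected device, that would be consistent with the script taking the cached branch and never reading config.json.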