Balena Preload on GitHub Actions

I am trying to build a CI/CD workflow on GitHub Actions to preload a base image.

Most balena CLI commands run flawlessly, but I am struggling with the balena preload command.
It builds and then spins up a Docker container to run the preloading. For that to work, the .img file we want to modify must be at a path accessible both to the job step running the balena preload command and to the Docker instance that does the building.

In order to manage that I needed to add a symlink inside the balena preload step so that I can use the Github runner base path to refer to the image.
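A minimal sketch of that symlink workaround, assuming the Docker-based action sees the workspace at /github/workspace while the runner sees it at the value of github.workspace (for example a directory under /home/runner/work). The helper name mirror_path is hypothetical and not part of the balena CLI:

```shell
# mirror_path (hypothetical helper): make a file that exists at its
# container-side path also resolvable under the host-side directory name.
#   $1  file as seen inside the container, e.g. /github/workspace/image.img
#   $2  workspace directory as the runner host names it
mirror_path() {
  mp_file="$1"
  mp_host_dir="$2"
  mp_link="$mp_host_dir/$(basename "$mp_file")"
  # Recreate the host-side directory inside the container.
  mkdir -p "$mp_host_dir"
  # -s: symbolic link, -f: replace an existing link, -n: do not follow it.
  ln -sfn "$mp_file" "$mp_link"
  # Print the path to hand to `balena preload`.
  printf '%s\n' "$mp_link"
}
```

The printed path is the one that would be passed to balena preload, so that the same string is valid in the step and, through the host's Docker daemon, in the preloader container.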

The issue I am running into is that the balena preload command spins up the container and then fails to connect to a port. The builder itself is running with network=host, but I don’t quite know how the preload container is being spun up. @acostach Could you shed some light?

The GitHub Actions workflow looks like this:

name: Configure and preload a balena image
on:
  workflow_dispatch:
    inputs:
      application:
        description: "balena application to create for"
        required: true
        default: "staging-armv8-l4t-t186"
      device_type:
        description: "balena device_type to create for"
        required: true
        default: "jetson-xavier-nx-devkit"
      os_version:
        description: "balena os version to create for"
        required: true
        default: "latest"
jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1
        with:
          install: true
          driver-opts: network=host
      - name: Login to DockerHub
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Balena OS download
        uses: aivero/balena-cli-action@master
        if: success()
        timeout-minutes: 5
        with:
          balena_api_token: ${{ secrets.BALENA_TOKEN }}
          balena_command: balena os download ${{ github.event.inputs.device_type }} -o ${{ github.event.inputs.device_type }}_${{ github.event.inputs.os_version }}.img
      - name: Balena OS configure
        uses: aivero/balena-cli-action@master
        if: success()
        timeout-minutes: 10
        with:
          balena_api_token: ${{ secrets.BALENA_TOKEN }}
          balena_command: balena os configure ${{ github.event.inputs.device_type }}_${{ github.event.inputs.os_version }}.img -a ${{ github.event.inputs.application }} --device-type ${{ github.event.inputs.device_type }} --config-network ethernet
      - name: Balena preload from application
        uses: aivero/balena-cli-action@master
        if: success()
        timeout-minutes: 25
        with:
          balena_api_token: ${{ secrets.BALENA_TOKEN }}
          balena_command: |
            ls ${{ github.workspace }}
            mkdir -p ${{ github.workspace }}
            ln -s /github/workspace/${{ github.event.inputs.device_type }}_${{ github.event.inputs.os_version }}.img -T ${{ github.workspace }}/${{ github.event.inputs.device_type }}_${{ github.event.inputs.os_version }}.img
            balena preload ${{ github.workspace }}/${{ github.event.inputs.device_type }}_${{ github.event.inputs.os_version }}.img -a ${{ github.event.inputs.application }} -c current --debug
      - name: Tar image
        run: |
          tar cf - ${{ github.event.inputs.device_type }}_${{ github.event.inputs.os_version }}.img -P | pv -s $(du -sb ${{ github.event.inputs.device_type }}_${{ github.event.inputs.os_version }}.img | awk '{print $1}') | bzip2 > ${{ github.event.inputs.device_type }}_${{ github.event.inputs.os_version }}.img.tar.bz2
      - 
        name: Store conan deployed 
        uses: actions/upload-artifact@v2
        if: ${{ success() }}
        with:
          name: balena_image_configured_${{ github.event.inputs.device_type }}_${{ github.event.inputs.os_version }}
          path: |
            *.img.tar.bz2

The error I am getting is:

...
Building Docker preloader image. [========================] 100%
Waiting for Docker to start...
Docker started
Leaving splash image alone
Expanding partition n°13 of /img/balena.img
Resizing ext4 filesystem of partition n°13 of /img/balena.img using /dev/loop5
File system OK
Waiting for Docker to start...
Docker started

error Error: connect ECONNREFUSED 0.0.0.0:45933
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
From previous event:
    at DockerToolbelt.version (/snapshot/versioned-source/node_modules/dockerode/lib/docker.js:1243:12)
    at DockerProgress.getProgressReporter (/snapshot/versioned-source/node_modules/balena-preload/node_modules/docker-progress/index.js:362:37)
    at DockerProgress.pullProgress (/snapshot/versioned-source/node_modules/balena-preload/node_modules/docker-progress/index.js:421:19)
    at DockerProgress.pull (/snapshot/versioned-source/node_modules/balena-preload/node_modules/docker-progress/index.js:403:32)
    at /snapshot/versioned-source/node_modules/balena-preload/build/preload.js:777:44
From previous event:
    at /snapshot/versioned-source/node_modules/balena-preload/build/preload.js:776:29
    at processImmediate (internal/timers.js:456:21)
    at process.topLevelDomainCallback (domain.js:137:15)
From previous event:
    at Preloader.preload (/snapshot/versioned-source/node_modules/balena-preload/build/preload.js:765:14)
    at PreloadCmd.prepareAndPreload (/snapshot/versioned-source/build/commands/preload.js:305:25)
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
    at async PreloadCmd.run (/snapshot/versioned-source/build/commands/preload.js:118:13)
    at async PreloadCmd._run (/snapshot/versioned-source/node_modules/@oclif/command/lib/command.js:43:20)
    at async Config.runCommand (/snapshot/versioned-source/node_modules/@oclif/config/lib/config.js:175:24)
    at async CustomMain.run (/snapshot/versioned-source/node_modules/@oclif/command/lib/main.js:27:9)
    at async CustomMain._run (/snapshot/versioned-source/node_modules/@oclif/command/lib/command.js:43:20)
    at async Promise.all (index 1)
    at async oclifRun (/snapshot/versioned-source/build/app.js:75:5)
    at async Object.run (/snapshot/versioned-source/build/app.js:88:9) {
  errno: 'ECONNREFUSED',
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '0.0.0.0',
  port: 45933
}
error Error: connect ECONNREFUSED 0.0.0.0:45933
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
From previous event:
    at DockerToolbelt.version (/snapshot/versioned-source/node_modules/dockerode/lib/docker.js:1243:12)
    at DockerProgress.getProgressReporter (/snapshot/versioned-source/node_modules/balena-preload/node_modules/docker-progress/index.js:362:37)
    at DockerProgress.pullProgress (/snapshot/versioned-source/node_modules/balena-preload/node_modules/docker-progress/index.js:421:19)
    at DockerProgress.pull (/snapshot/versioned-source/node_modules/balena-preload/node_modules/docker-progress/index.js:403:32)
    at /snapshot/versioned-source/node_modules/balena-preload/build/preload.js:777:44
From previous event:
    at /snapshot/versioned-source/node_modules/balena-preload/build/preload.js:776:29
    at processImmediate (internal/timers.js:456:21)
    at process.topLevelDomainCallback (domain.js:137:15)
From previous event:
    at Preloader.preload (/snapshot/versioned-source/node_modules/balena-preload/build/preload.js:765:14)
    at PreloadCmd.prepareAndPreload (/snapshot/versioned-source/build/commands/preload.js:305:25)
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
    at async PreloadCmd.run (/snapshot/versioned-source/build/commands/preload.js:118:13)
    at async PreloadCmd._run (/snapshot/versioned-source/node_modules/@oclif/command/lib/command.js:43:20)
    at async Config.runCommand (/snapshot/versioned-source/node_modules/@oclif/config/lib/config.js:175:24)
    at async CustomMain.run (/snapshot/versioned-source/node_modules/@oclif/command/lib/main.js:27:9)
    at async CustomMain._run (/snapshot/versioned-source/node_modules/@oclif/command/lib/command.js:43:20)
    at async Promise.all (index 1)
    at async oclifRun (/snapshot/versioned-source/build/app.js:75:5)
    at async Object.run (/snapshot/versioned-source/build/app.js:88:9) {
  errno: 'ECONNREFUSED',
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '0.0.0.0',
  port: 45933
}

ECONNREFUSED: connect ECONNREFUSED 0.0.0.0:45933

Error: connect ECONNREFUSED 0.0.0.0:45933
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16)
From previous event:
    at DockerToolbelt.createImage (/snapshot/versioned-source/node_modules/dockerode/lib/docker.js:109:12)
    at DockerToolbelt.pull (/snapshot/versioned-source/node_modules/dockerode/lib/docker.js:1376:27)
    at DockerProgress.pull (/snapshot/versioned-source/node_modules/balena-preload/node_modules/docker-progress/index.js:405:26)
    at /snapshot/versioned-source/node_modules/balena-preload/build/preload.js:777:44
From previous event:
    at /snapshot/versioned-source/node_modules/balena-preload/build/preload.js:776:29
    at processImmediate (internal/timers.js:456:21)
    at process.topLevelDomainCallback (domain.js:137:15)
From previous event:
    at Preloader.preload (/snapshot/versioned-source/node_modules/balena-preload/build/preload.js:765:14)
    at PreloadCmd.prepareAndPreload (/snapshot/versioned-source/build/commands/preload.js:305:25)
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
    at async PreloadCmd.run (/snapshot/versioned-source/build/commands/preload.js:118:13)
    at async PreloadCmd._run (/snapshot/versioned-source/node_modules/@oclif/command/lib/command.js:43:20)
    at async Config.runCommand (/snapshot/versioned-source/node_modules/@oclif/config/lib/config.js:175:24)
    at async CustomMain.run (/snapshot/versioned-source/node_modules/@oclif/command/lib/main.js:27:9)
    at async CustomMain._run (/snapshot/versioned-source/node_modules/@oclif/command/lib/command.js:43:20)
    at async Promise.all (index 1)
    at async oclifRun (/snapshot/versioned-source/build/app.js:75:5)
    at async Object.run (/snapshot/versioned-source/build/app.js:88:9)

Hello, are you able to get the output of /proc/filesystems from the builder(s) you are running this on?

The docker daemon is failing to start/communicate correctly in the following function:

Can you make sure the value of DOCKER_HOST is set to the dockerd socket of the runner, and make sure it’s available to the container?

  • If using TCP and host networking you might need something like tcp://0.0.0.0:2375 assuming the runner docker socket is actually listening on TCP which is not the default.
  • Otherwise you can try unix:///var/run/docker.sock and expose /var/run/docker.sock:/var/run/docker.sock as a volume to the container.
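As a rough way to verify the second option before running the preload, the step could check that the socket is actually visible from inside the container. This is only a sketch: check_docker_host is a hypothetical helper, and it does not test whether a TCP endpoint is reachable.

```shell
# check_docker_host (hypothetical helper): report whether the endpoint named
# by a DOCKER_HOST-style value looks usable from the current process.
# Returns 0 if plausible, 1 if a unix socket is missing, 2 if unrecognised.
check_docker_host() {
  case "$1" in
    unix://*)
      cdh_sock="${1#unix://}"
      # -S is true only for an existing socket file.
      if [ -S "$cdh_sock" ]; then
        echo "unix socket present: $cdh_sock"
      else
        echo "unix socket missing: $cdh_sock"
        return 1
      fi
      ;;
    tcp://*)
      echo "tcp endpoint: ${1#tcp://} (reachability not checked here)"
      ;;
    *)
      echo "unrecognised DOCKER_HOST value: $1"
      return 2
      ;;
  esac
}
```

Calling check_docker_host "$DOCKER_HOST" at the top of the step would show whether the volume mount really exposed the socket.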

Also could you provide the CLI version used in the aivero/balena-cli-action helper?

Thanks for the quick answer @ab77 and @klutchell

This was run on v12.40.0.

I will try again with the current 12.44 and will try to specify the DOCKER_HOST - @klutchell This is an environment variable, right?

@ab77 Will check for /proc/filesystem

Just reran with 12.44.23, setting DOCKER_HOST to unix:///var/run/docker.sock, which is also bind-mounted into the container:

...
      - name: Balena preload from application
        uses: aivero/balena-cli-action@master
        if: success()
        timeout-minutes: 25
        env:
          DOCKER_HOST: unix:///var/run/docker.sock
        with:
          balena_api_token: ${{ secrets.BALENA_TOKEN }}
          balena_command: |
            mkdir -p ${{ github.workspace }}
            ln -s /github/workspace/${{ github.event.inputs.device_type }}_${{ github.event.inputs.os_version }}.img -T ${{ github.workspace }}/${{ github.event.inputs.device_type }}_${{ github.event.inputs.os_version }}.img
            ls -alh /proc/filesystem
            balena preload ${{ github.workspace }}/${{ github.event.inputs.device_type }}_${{ github.event.inputs.os_version }}.img -a ${{ github.event.inputs.application }} -c current --debug
...

@ab77 I cannot access /proc/filesystem inside the preload job:

ls: cannot access '/proc/filesystem': No such file or directory
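For reference, the file requested earlier in the thread is /proc/filesystems (with a trailing "s"), which is the path the Linux kernel uses for its list of supported filesystems. A small sketch for querying it, where fs_supported is a hypothetical helper and the filesystem names in the usage note are only examples:

```shell
# fs_supported (hypothetical helper): succeed if a filesystem name is listed
# in a /proc/filesystems-style table. Each line is either "nodev<TAB>name"
# or "<TAB>name", so the name is always the last field.
#   $1  filesystem name to look for
#   $2  table to read (defaults to /proc/filesystems)
fs_supported() {
  fs_name="$1"
  fs_table="${2:-/proc/filesystems}"
  awk -v want="$fs_name" '$NF == want { found = 1 } END { exit !found }' "$fs_table"
}
```

In the workflow step, cat /proc/filesystems would print the whole list, and something like fs_supported overlay && echo "overlay available" would test for a single entry.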

@klutchell
No change here:

Leaving splash image alone
Expanding partition n°13 of /img/balena.img
Resizing ext4 filesystem of partition n°13 of /img/balena.img using /dev/loop5
File system OK
Waiting for Docker to start...
Docker started


ECONNREFUSED: connect ECONNREFUSED 0.0.0.0:44797

Error: connect ECONNREFUSED 0.0.0.0:44797
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1107:14)
From previous event:
    at preload._getState.then.then.then.then (/snapshot/versioned-source/node_modules/balena-preload/build/preload.js:816:29)
    at runCallback (timers.js:705:18)
    at tryOnImmediate (timers.js:676:5)
    at processImmediate (timers.js:658:5)
    at process.topLevelDomainCallback (domain.js:126:23)
From previous event:
    at Preloader.preload (/snapshot/versioned-source/node_modules/balena-preload/build/preload.js:803:14)
    at PreloadCmd.prepareAndPreload (/snapshot/versioned-source/build/commands/preload.js:313:25)
    at process._tickCallback (internal/process/next_tick.js:68:7)

So it appears as if the building of the preload container works fine, however, spinning it up fails.

Try adding -P and/or --dockerHost|--dockerPort to your preload command. Perhaps see if you can just run docker ps in your pipeline to get a connection to the API, and if that works, see if you can get docker info out of it to see which storage drivers are supported:

$ docker info | grep Storage
 Storage Driver: overlay2
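The same check can be scripted so the pipeline log shows either the storage driver or a clear message that the API is unreachable. A sketch, assuming a docker client is available in the step; storage_driver is a hypothetical helper that just parses the text output of docker info:

```shell
# storage_driver (hypothetical helper): read `docker info` output on stdin
# and print the value of its "Storage Driver:" line.
storage_driver() {
  sed -n 's/^[[:space:]]*Storage Driver:[[:space:]]*//p'
}

# Query the daemon only if a docker client exists and the API answers.
if command -v docker >/dev/null 2>&1 && docker ps >/dev/null 2>&1; then
  docker info 2>/dev/null | storage_driver
else
  echo "docker API not reachable from this step" >&2
fi
```

If the else branch fires inside the Docker-based action but not in a plain run: step, that points at the container's access to the daemon rather than at the preload itself.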

@ab77 @klutchell
I tried -P and --dockerHost, but it appears that the balena CLI handles them incorrectly.
Note that I set:

  env:
    DOCKER_HOST: /var/run/docker.sock

Instead of overriding the docker host as I would have expected, it seems to get appended:

balena preload /home/runner/work/PATH_TO_SOME_image.img -a MY_APPLICATION -c current --debug -dockerHost $DOCKER_HOST 
docker: Cannot connect to the Docker daemon at tcp://localhost:2375/var/run/docker.sock. Is the docker daemon running?.

I just checked the docs and it looks like for a volume mounted socket like that you can just use the --docker flag.

balena preload /home/runner/work/PATH_TO_SOME_image.img -a MY_APPLICATION -c current --debug --docker /var/run/docker.sock

If you want to use the DOCKER_HOST env var, it needs to be set as a socket URL with a scheme, such as DOCKER_HOST=unix:///var/run/docker.sock, but I wouldn’t pass this value directly to the balena CLI.
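To illustrate the format difference: a bare path such as /var/run/docker.sock has no scheme, which is consistent with the tcp://localhost:2375/var/run/docker.sock address in the error above. A sketch of normalising the value before exporting it, where normalize_docker_host is a hypothetical helper:

```shell
# normalize_docker_host (hypothetical helper): turn a bare absolute socket
# path into a unix:// URL and pass through values that already carry a
# unix:// or tcp:// scheme. Anything else is rejected with status 1.
normalize_docker_host() {
  case "$1" in
    unix://*|tcp://*) printf '%s\n' "$1" ;;
    /*)               printf 'unix://%s\n' "$1" ;;
    *)                echo "unsupported DOCKER_HOST value: $1" >&2; return 1 ;;
  esac
}
```

For example, DOCKER_HOST=$(normalize_docker_host /var/run/docker.sock) yields unix:///var/run/docker.sock, which is the form the docker client expects in that variable.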

I had a look at the action you’re using, and I think docker-in-docker might not be a supported scenario.
Your action is a Docker container action, declared with runs: using: docker in balena-cli-action/action.yml (master branch of aivero/balena-cli-action on GitHub).
By contrast, the official Docker build step runs natively on the runner: see build-push-action/action.yml (master branch of docker/build-push-action on GitHub).

I’m not sure if there is a way to make the first one work, but I’d try the latter approach.