Network creation loop (custom balenaOS)

Hi,

I built a custom balenaOS image because I needed macvlan support.
The image seems to work fine: it connects to the dashboard, and I’m able to create a macvlan network from the command line and attach a container to it. :ok_hand:
balena network create -d macvlan --subnet=172.16.86.0/24 --gateway=172.16.86.1 -o parent=eth0 pub_net

On the other hand, when I push a multi-container app to the device (local push mode), the macvlan network gets created (visible with balena network list) but is removed almost instantly and then recreated, in a loop.

balena CLI logs:

[Debug]   Sending request to http://192.168.0.103:48484/v2/local/target-state
[Info]    Streaming device logs...
[Logs]    [5/1/2021, 12:22:38 PM] Removing network 'mymacvlan'
[Logs]    [5/1/2021, 12:22:39 PM] Creating network 'mymacvlan'
[Logs]    [5/1/2021, 12:22:39 PM] Removing network 'mymacvlan'
[Logs]    [5/1/2021, 12:22:40 PM] Creating network 'mymacvlan'
[Logs]    [5/1/2021, 12:22:41 PM] Removing network 'mymacvlan'
[Logs]    [5/1/2021, 12:22:41 PM] Creating network 'mymacvlan'
[Logs]    [5/1/2021, 12:22:42 PM] Removing network 'mymacvlan'
[Logs]    [5/1/2021, 12:22:43 PM] Creating network 'mymacvlan'
[Logs]    [5/1/2021, 12:22:43 PM] Removing network 'mymacvlan'
[Logs]    [5/1/2021, 12:22:44 PM] Creating network 'mymacvlan'
[Logs]    [5/1/2021, 12:22:45 PM] Removing network 'mymacvlan'
[Logs]    [5/1/2021, 12:22:45 PM] Creating network 'mymacvlan'

balena supervisor logs:

May 01 10:22:38 0cfe6de resin-supervisor[5393]: [api]     POST /v2/local/target-state 200 - 222.332 ms
May 01 10:22:38 0cfe6de resin-supervisor[5393]: [info]    Applying target state
May 01 10:22:38 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:39 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:39 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:40 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:41 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:41 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:42 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:43 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:43 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:44 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:45 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:45 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:46 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:47 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:47 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:48 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:49 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:49 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:50 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:50 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:51 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:52 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:52 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:53 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:54 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:54 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}
May 01 10:22:55 0cfe6de resin-supervisor[5393]: [event]   Event: Network removal {}
May 01 10:22:55 0cfe6de resin-supervisor[5393]: [event]   Event: Network creation {}

(Tested on v12.5.1 and v12.5.10.)

Is there a way to get more debug output from the supervisor? What could explain this loop?

I know I’m kind of borderline here because of my custom image, but any help would be very much appreciated.

Thanks

Hi

Can you share your device diagnostics logs? Perhaps there’s something in there that will help us.
If not, I’ll ping our supervisor devs. There are some network-related checks that the supervisor runs, and I am wondering whether they are getting in the way here.

I’m not sure which logs you want, so here are the logs from the dashboard diagnostics.

During my testing, the loop does not occur when I don’t specify any driver_opts for the macvlan network, but it does occur when driver_opts are specified. I have opened a PR (Pull Request #1691 in balena-os/balena-supervisor, "Fix passing driver_opts from compose to docker network creation", by quentingllmt), but I’m not sure whether it addresses the actual cause of my problem.
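
To make the suspicion concrete, here is a minimal Python sketch of how such a loop could arise. It assumes the supervisor compares the target network config with what the engine reports and recreates the network on any mismatch. This is not the supervisor's actual code: the function names, the dict layout and the "dropped driver_opts" behaviour are all hypothetical, and the values simply mirror the CLI command at the top of this thread.

```python
# Hypothetical sketch: NOT the balena supervisor's real code.
# Target network, mirroring the CLI command earlier in this thread.
TARGET = {
    "driver": "macvlan",
    "driver_opts": {"parent": "eth0"},
    "subnet": "172.16.86.0/24",
    "gateway": "172.16.86.1",
}


def create_network(target, pass_driver_opts):
    """Return the network as the engine would report it after creation."""
    created = dict(target)
    if not pass_driver_opts:
        # Assumed bug: driver_opts are dropped on the way to the engine.
        created["driver_opts"] = {}
    return created


def reconcile(target, pass_driver_opts, max_steps=6):
    """Recreate the network until it matches the target; return the events."""
    events = []
    current = None
    for _ in range(max_steps):
        if current == target:
            break  # converged: nothing left to do
        if current is not None:
            events.append("Removing network")
        current = create_network(target, pass_driver_opts)
        events.append("Creating network")
    return events


if __name__ == "__main__":
    # Never converges: alternates Creating/Removing until max_steps runs out.
    print(reconcile(TARGET, pass_driver_opts=False))
    # Converges after a single creation.
    print(reconcile(TARGET, pass_driver_opts=True))
```

In this sketch, a target without any driver_opts converges even when the options are dropped, which would be consistent with what I observed, but it remains a guess about the real cause.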

Device health checks

{
   "diagnose_version":"4.20.24",
   "checks":[
      {
         "name":"check_balenaOS",
         "success":false,
         "status":"balenaOS 2.x detected, but this version is not currently available in "
      },
      {
         "name":"check_container_engine",
         "success":true,
         "status":"No container_engine issues detected"
      },
      {
         "name":"check_localdisk",
         "success":true,
         "status":"No localdisk issues detected"
      },
      {
         "name":"check_memory",
         "success":true,
         "status":"83% memory available"
      },
      {
         "name":"check_networking",
         "success":false,
         "status":"Some networking issues detected: \ntest_upstream_dns: DNS lookup failed for  via upstream: 192.168.0.1\ntest_upstream_dns: DNS lookup failed for  via upstream: 8.8.8.8\ntest_balena_api: Could not contact \ntest_balena_registry: Could not communicate with registry2.balena-cloud.com for authentication"
      },
      {
         "name":"check_os_rollback",
         "success":true,
         "status":"No OS rollbacks detected"
      },
      {
         "name":"check_service_restarts",
         "success":true,
         "status":"No services are restarting unexpectedly"
      },
      {
         "name":"check_supervisor",
         "success":false,
         "status":"Supervisor is running, but may be unhealthy"
      },
      {
         "name":"check_temperature",
         "success":false,
         "status":"Some temperature issues detected: \ntest_current_temperature Temperature above 80C detected (/sys/class/thermal/thermal_zone4)"
      },
      {
         "name":"check_timesync",
         "success":true,
         "status":"Time is synchronized"
      }
   ]
}

I have sent you the device diagnostics file by PM.

Thanks for getting back to us, and we really appreciate you taking the time to create a PR with the fix.

The team has approved the changes. Can you take a look at the last comments on the PR (Pull Request #1691 in balena-os/balena-supervisor, "Fix passing driver_opts from compose to docker network creation") and then update the commit message and rebase? We should be able to get it released after that.

Do reach out if you face any other problems after this patch.

The problem has been fixed by the PR! (balena supervisor v12.6.8)

Thanks!