What labels are needed for Balena Wifi-Connect?

WiFi Connect works for me as a single container, but not when it is one service in a multi-container project started from a docker-compose.yml on balenaOS.

What labels are needed to make it behave the same way it does as a standalone container?

  ap:
#    restart: always
    privileged: true
    build: ./ap
    labels:
      io.balena.features.dbus: '1'

The result is that it reports it cannot bind to port 80 because the address is not available.

Hi @matthewcroughan, you can check out balenaDash or balena-fin-examples for examples of using WiFi Connect in a multicontainer application. Both set the io.balena.features.firmware: '1' label as well as network_mode: host. I hope this helps!

This still doesn’t seem to work. I’ve added network_mode: host as well as the firmware label, and the result is the same:

[Logs]    [12/11/2019, 23:49:37] [ap] Access point 'MING' created
[Logs]    [12/11/2019, 23:49:37] [ap] Starting HTTP server on 192.168.42.1:80
[Logs]    [12/11/2019, 23:49:37] [ap] Error: Cannot start HTTP server on '192.168.42.1:80': address not available
[Logs]    [12/11/2019, 23:49:40] Service exited 'ap sha256:50b26c8054530ea9cac67a292b84ca7c02db35037fbeda7655b7752d27f3cbd1'
[Logs]    [12/11/2019, 23:49:40] Killing service 'ap sha256:50b26c8054530ea9cac67a292b84ca7c02db35037fbeda7655b7752d27f3cbd1'
[Logs]    [12/11/2019, 23:49:40] Killed service 'ap sha256:50b26c8054530ea9cac67a292b84ca7c02db35037fbeda7655b7752d27f3cbd1'
[Logs]    [12/11/2019, 23:49:42] Installing service 'ap sha256:50b26c8054530ea9cac67a292b84ca7c02db35037fbeda7655b7752d27f3cbd1'
[Logs]    [12/11/2019, 23:49:43] Installed service 'ap sha256:50b26c8054530ea9cac67a292b84ca7c02db35037fbeda7655b7752d27f3cbd1'
[Logs]    [12/11/2019, 23:49:43] Starting service 'ap sha256:50b26c8054530ea9cac67a292b84ca7c02db35037fbeda7655b7752d27f3cbd1'
[Logs]    [12/11/2019, 23:49:46] Started service 'ap sha256:50b26c8054530ea9cac67a292b84ca7c02db35037fbeda7655b7752d27f3cbd1'
[Logs]    [12/11/2019, 23:49:46] [ap] Deleting already created by WiFi Connect access point connection profile: "MING"
[Logs]    [12/11/2019, 23:49:46] [ap] WiFi device: wlan0
[Logs]    [12/11/2019, 23:49:46] [ap] Access points: ["VM9606636", "VM3682359", "Virgin Media", "TALKTALK71D31C", "VM9778212", "VM037200-2G", "BTWifi-with-FON", "TALKTALK280A6D", "", "BTWifi-X", "SKY6462E", "SKY8BB7B"]
[Logs]    [12/11/2019, 23:49:46] [ap] Starting access point...
[Live]    Device state settled
[Logs]    [12/11/2019, 23:49:49] [ap] Access point 'MING' created
[Logs]    [12/11/2019, 23:49:49] [ap] Starting HTTP server on 192.168.42.1:80
[Logs]    [12/11/2019, 23:49:49] [ap] Error: Cannot start HTTP server on '192.168.42.1:80': address in use
[Logs]    [12/11/2019, 23:49:51] Service exited 'ap sha256:50b26c8054530ea9cac67a292b84ca7c02db35037fbeda7655b7752d27f3cbd1'

This also kills the device and means I have to reflash balenaOS, as the supervisor is no longer responsive. balena ssh still works, but I cannot kill the AP container.

Once logged in to the device, the supervisor appears to be active, but I can’t manage containers in any way.

root@6ba4f52:~# balena ps
 CONTAINER ID        IMAGE                               COMMAND                  CREATED             STATUS                            PORTS                    NAMES
e8f77059944c        50b26c805453                        "/usr/bin/entry.sh b…"   8 minutes ago       Up 2 minutes                                               ap_6_1
3481d2ae7668        5e8bacd05a92                        "/run.sh"                11 minutes ago      Up 4 minutes                      0.0.0.0:80->3000/tcp     grafana_3_1
a8073c9a3ec1        6a37c6f5687f                        "/entrypoint.sh infl…"   11 minutes ago      Up 4 minutes                      0.0.0.0:8086->8086/tcp   influxdb_2_1
b512f9c778ca        0cd85513d75a                        "/usr/bin/tini -- /b…"   11 minutes ago      Up 4 minutes                      0.0.0.0:8888->8888/tcp   jupyterlab_5_1
4e0058e3a9b5        179c5547d110                        "/docker-entrypoint.…"   11 minutes ago      Up 4 minutes                      0.0.0.0:1883->1883/tcp   mosquitto_4_1
e5bb1c7277e2        9425211fe9fd                        "npm start -- --user…"   11 minutes ago      Up 4 minutes                      0.0.0.0:1880->1880/tcp   nodered_1_1
ce1402752398        balena/armv7hf-supervisor:v9.11.3   "./entry.sh"             49 years ago        Up 4 minutes (health: starting)                            resin_supervisor
root@6ba4f52:~# time balena kill ap_6_1
Error response from daemon: Cannot kill container: ap_6_1: Cannot kill container e8f77059944c0dc3825cb8e009ff67a4092b2850130901ce6f227350808432a5: connection error: desc = "transport: dial unix /var/run/balena-engine/containerd/balena-engine-containerd.sock: connect: connection refused": unknown

@matthewcroughan it’s worth pursuing the error message that port 80 is already in use. I can see from the list of containers you’ve posted that you have a grafana container bound to port 80, which is preventing WiFi Connect from binding to that port.

You seem to have a lot of containers running on your device. If all you’re trying to use is WiFi Connect, you should only have the supervisor and one other container. If instead the docker-compose.yml you shared was not the full file and these other services are meant to be there, then the port 80 conflict is real and you’ll need to reconfigure Grafana to use a different port.
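
A quick way to confirm which container is holding a host port is to filter the engine's container list. This is only a sketch: it assumes you run it from the host OS shell, and the helper name containers_on_port is made up for illustration. Services using network_mode: host publish no port mappings, so they will not appear in this list.

```shell
#!/bin/sh
# Print the names of containers whose published ports include host port $1.
# Expects "name<TAB>ports" lines on stdin, one container per line.
containers_on_port() {
  awk -F'\t' -v p=":$1->" 'index($2, p) { print $1 }'
}

# On the device (assumed usage):
#   balena ps --format '{{.Names}}\t{{.Ports}}' | containers_on_port 80
```

Matching on ":80->" rather than just "80" avoids false hits on ports such as 1880 or 8086.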

Of course, I just realised this 10 minutes ago, I’m a complete idiot!

However, the supervisor really doesn’t cope with this: it leaves the device in a state where I can’t use it again until I reflash the card. That is still, in some way, a failing of the balena engine; it shouldn’t be able to end up in this state.

After resolving this and reflashing the disk, it’s all fine.

Here’s the full compose:

version: '2'
volumes:
    influxdb-data:
    nodered-data:
    grafana-data:
    jupyterlab-data:
services:
  nodered:
    restart: always
    build: ./nodered
    volumes:
      - 'nodered-data:/data'
    ports:
      - "1880:1880"
  influxdb:
    restart: always
    build: ./influxdb
    environment:
      - INFLUXDB_DB=ming_default
    volumes:
      - 'influxdb-data:/data'
    ports:
      - "8086:8086"
  grafana:
    restart: always
    build: ./grafana
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Viewer
    ports:
      - "3000:3000"
    volumes:
      - 'grafana-data:/data'
    depends_on:
      - influxdb
  mosquitto:
    restart: always
    build: ./mosquitto
    ports:
      - "1883:1883"
  jupyterlab:
    restart: always
    build: ./jupyterlab
    ports:
      - "8888:8888"
    volumes:
      - 'jupyterlab-data:/data'
  ap:
    build: ./ap
    restart: always
    network_mode: host
    privileged: true
    labels:
      io.balena.features.dbus: '1'
      io.balena.features.firmware: '1'

@chrisys Now that it’s working, I wanted to have this script determine whether or not the AP should be created.

#!/bin/bash

# Use the MING_AP service variable if it is set; default to 0 (disabled).
MING_AP="${MING_AP:-0}"

# If MING_AP is equal to 1 then create an access point.
# Otherwise echo the fact that it has not been enabled to the console.

# We should endeavour to make this do-able from the captive portal

if [[ "$MING_AP" == 1 ]]; then
  ./wifi-connect -s MING -u /usr/src/app/ui
else
  echo "Wifi AP has not been enabled in service variables"
fi

However, I think there’s a bug in the Balena Engine, because it cannot kill or restart this container.

However, if I enter the container and issue kill -9 to the PID of wifi-connect it will succeed.

No other method by the supervisor can kill the AP service, though I can enter it and kill it manually. Any advice on how to solve this now without waiting for a fix in the balena engine?
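
One thing that may be worth trying, offered as a guess rather than a confirmed cause: when a start script launches wifi-connect as a child process, the shell remains the container's main process, and a shell does not pass SIGTERM on to its children. Starting wifi-connect with exec replaces the shell, so a stop signal goes straight to wifi-connect. The function names below are hypothetical, and the wifi-connect arguments are copied from the script above.

```shell
#!/usr/bin/env bash
# Hypothetical start script for the ap service; function names are assumptions.

# Succeeds only when the given value is exactly "1".
ap_enabled() {
  [[ "${1:-0}" == "1" ]]
}

# Replace the shell with wifi-connect so signals reach it directly;
# otherwise just report that the AP is disabled.
start_ap() {
  if ap_enabled "${MING_AP:-0}"; then
    exec ./wifi-connect -s MING -u /usr/src/app/ui
  else
    echo "Wifi AP has not been enabled in service variables"
  fi
}
```

Calling start_ap as the last line of the script would put it to use. This does not explain the containerd connection error shown earlier, so treat it as one thing to rule out rather than a fix.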

Actually, no matter what I do here it seems that the AP just destroys the state of the device, no longer able to push things to it as it just stalls and pretends nothing is happening.

Hi @matthewcroughan,

Actually, no matter what I do here it seems that the AP just destroys the state of the device, no longer able to push things to it as it just stalls and pretends nothing is happening.

Nothing in the app should ever cause the supervisor to fail/stop responding/require a reflash of the image.

One way to check whether the supervisor is running is to look at its logs from the host OS:

journalctl -u resin-supervisor

I’d expect it to show some error messages.

I note the supervisor version is v9.11.3, so I’d guess the OS is somewhere around v2.31.5. For new work, I’d recommend updating to a newer OS release.

Now that it’s working, I wanted to have this script determine whether or not the AP should be created.

I’m a bit unclear about which container is running the script, and where this environment variable is being set or unset.

Regards
ZubairLK