Discover avahi / zeroconf services inside container

Hello!
I have a Raspberry Pi 4 running balenaOS (local mode) that provides a Python server, which I would like to announce on the network.

As I’m using Python, I have set up a zeroconf service using zeroconf.py. I advertise the service on the network, and I’m able to discover it from any PC using the same zeroconf.py library and a small script.

Now I would like to set up a second Raspberry Pi (in this case a Zero) that will also run a balenaOS-based deployment, with a Python script that tries to discover the zeroconf service above. I have used the same test code that I use elsewhere on the network to discover the service, and it doesn’t resolve:

    from zeroconf import ServiceBrowser, Zeroconf
    import ipaddress

    class MyListener:
        def remove_service(self, zeroconf, type, name):
            print("Service %s removed" % (name,))

        def add_service(self, zeroconf, type, name):
            info = zeroconf.get_service_info(type, name)
            print(str(ipaddress.IPv4Address(info.addresses[0])))
            print("Service %s added, service info: %s" % (name, info))


    if __name__ == "__main__":
        zeroconf = Zeroconf()
        listener = MyListener()
        browser = ServiceBrowser(zeroconf, "_archerytimer._tcp.local.", listener)
        try:
            input("Press enter to exit...\n\n")
        finally:
            zeroconf.close()

I have added network_mode = host on both raspis (I already use it on the raspi4), without luck.

Do I need to add anything else to the raspi Zero so it can discover the zeroconf service?

After searching on balena forums I have found a reference to this project:

As the zeroconf.py library doesn’t use D-Bus, I’m thinking of switching from it to a D-Bus implementation, but the project above shows how to expose a service, not how to discover one.
(However, it is strange that the raspi 4 is able to successfully expose the service on the network.)

Is there any example of how to use D-Bus to discover services?
Thank you in advance.
Gabriel

Hello!

So, the first thing to note is that for both publishing services and discovering them, your services do need to use host networking as you’ve described, as multicast traffic won’t be forwarded over the balenaEngine network bridge.

It sounds like you have service publishing working correctly. As another example, although the service itself is not Python, here’s an example of how you could use Avahi from within a service container to publish services. The key points for using Zeroconf (mDNS/DNS-SD) in a service container are to include the Avahi daemon (with libnss_mdns and variants) and to use host networking.

To actually discover services, it looks like the python-zeroconf library (is this the one you’re currently using?) will allow you to do this with the add_service_listener, check_service and get_service_info API calls. Again, you’ll need to install the Avahi daemon in the service container and use host networking to carry out service discovery.
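
As a rough illustration of the calls mentioned above, here is a minimal sketch of a listener that tolerates unresolved services. The class name `TimerListener` and the helper `browse` are made up for this example, and it assumes the python-zeroconf package is installed in the container. `get_service_info` can return `None` when the service does not answer within the timeout, so the sketch checks for that before reading the address.

```python
import socket


class TimerListener:
    """Listener for python-zeroconf's ServiceBrowser (hypothetical name)."""

    def __init__(self):
        self.services = {}  # service name -> (ip, port)

    def add_service(self, zc, type_, name):
        # get_service_info() returns None if nothing answered in time.
        info = zc.get_service_info(type_, name, timeout=3000)  # milliseconds
        ipv4 = [a for a in info.addresses if len(a) == 4] if info else []
        if not ipv4:
            print("Service %s seen but could not be resolved" % name)
            return
        ip = socket.inet_ntoa(ipv4[0])  # packed IPv4 bytes -> dotted string
        self.services[name] = (ip, info.port)
        print("Service %s added at %s:%d" % (name, ip, info.port))

    def update_service(self, zc, type_, name):
        # Newer python-zeroconf releases expect listeners to define this.
        self.add_service(zc, type_, name)

    def remove_service(self, zc, type_, name):
        self.services.pop(name, None)
        print("Service %s removed" % name)


def browse(service_type="_archerytimer._tcp.local."):
    # Imported lazily so the listener can be exercised without the library.
    from zeroconf import ServiceBrowser, Zeroconf

    zc = Zeroconf()
    listener = TimerListener()
    browser = ServiceBrowser(zc, service_type, listener)
    return zc, browser, listener
```

Calling `zc, browser, listener = browse()` starts browsing in a background thread; `listener.services` fills in as services are resolved, and `zc.close()` stops everything.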

You can use DBus to use the host instance of Avahi as well if you wish, but you’ll need an appropriate DBus python library for this (unfortunately I’m not familiar with a relevant library).
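
For what it’s worth, one commonly used option is the dbus-python package together with PyGObject for the main loop. The following is only an untested sketch under those assumptions: both packages are installed, the host’s Avahi daemon is running, and DBUS_SYSTEM_BUS_ADDRESS points at the host’s system bus socket. The function names `browse` and `txt_to_dict` are made up for the example; the Avahi method and signal names (`ServiceBrowserNew`, `ItemNew`, `ResolveService`) come from Avahi’s D-Bus interface.

```python
AVAHI_BUS = "org.freedesktop.Avahi"
IF_UNSPEC = -1     # any network interface
PROTO_UNSPEC = -1  # IPv4 or IPv6


def txt_to_dict(txt):
    """Turn Avahi's TXT record (a list of byte arrays) into a dict of strings."""
    result = {}
    for entry in txt:
        key, _, value = bytes(entry).partition(b"=")
        result[key.decode("utf-8")] = value.decode("utf-8")
    return result


def browse(service_type="_archerytimer._tcp", domain="local"):
    # Third-party imports kept local: dbus-python and PyGObject (assumed installed).
    import dbus
    from dbus.mainloop.glib import DBusGMainLoop
    from gi.repository import GLib

    DBusGMainLoop(set_as_default=True)
    bus = dbus.SystemBus()
    server = dbus.Interface(bus.get_object(AVAHI_BUS, "/"), AVAHI_BUS + ".Server")

    def on_resolved(interface, protocol, name, stype, domain, host,
                    aprotocol, address, port, txt, flags):
        print("Resolved %s at %s:%d %s" % (name, address, port, txt_to_dict(txt)))

    def on_error(error):
        print("Resolve failed: %s" % error)

    def on_item_new(interface, protocol, name, stype, domain, flags):
        # Ask Avahi to resolve each newly seen service asynchronously.
        server.ResolveService(interface, protocol, name, stype, domain,
                              PROTO_UNSPEC, dbus.UInt32(0),
                              reply_handler=on_resolved, error_handler=on_error)

    path = server.ServiceBrowserNew(IF_UNSPEC, PROTO_UNSPEC, service_type,
                                    domain, dbus.UInt32(0))
    browser = dbus.Interface(bus.get_object(AVAHI_BUS, path),
                             AVAHI_BUS + ".ServiceBrowser")
    browser.connect_to_signal("ItemNew", on_item_new)
    GLib.MainLoop().run()  # blocks; signals are delivered from this loop
```

Note that the service type is given here without the trailing `.local.`, since the domain is passed as a separate argument.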

Hopefully this helps!

Best regards,

Heds

Hello,
Thanks for the explanation.
The library I’m using is the zeroconf one:


This library is being used on the raspi 4 (with network_mode = host) to expose the service using this code:

#!/usr/bin/env python3

""" Example of announcing a service (in this case, a fake HTTP server) """

import argparse

import socket
from settings import settings

from zeroconf import ServiceInfo, Zeroconf

from settings.settings import logger

desc = {'version':"0.70.0",'name':'ArcheryTimer'}
info = ServiceInfo(
        "_archerytimer._tcp.local.",
        "Server._archerytimer._tcp.local.",
        addresses=[socket.inet_aton(settings.IP)],
        port=8090,
        properties=desc,
        server="ArcheryTimer.local.",
    )

zeroconf = Zeroconf()
def stop_zeroconf():
      """Stop Zeroconf."""
      logger.info("Stopping Zeroconf...")
      zeroconf.unregister_service(info)
      zeroconf.close()

def register_zeroconf():
    logger.info("   Registering service...")
    zeroconf.register_service(info)
    logger.info("   Registration done.")

And it is working properly, as I can discover the service on the network without problems using any client.
I also tested the discovery code outside a balena container:

from zeroconf import ServiceBrowser, Zeroconf
import ipaddress

class MyListener:
    def remove_service(self, zeroconf, type, name):
        print("Service %s removed" % (name,))

    def add_service(self, zeroconf, type, name):
        info = zeroconf.get_service_info(type, name)
        print(str(ipaddress.IPv4Address(info.addresses[0])))
        print("Service %s added, service info: %s" % (name, info))


if __name__ == "__main__":
    zeroconf = Zeroconf()
    listener = MyListener()
    browser = ServiceBrowser(zeroconf, "_archerytimer._tcp.local.", listener)
    try:
        input("Press enter to exit...\n\n")
    finally:
        zeroconf.close()

And it works properly everywhere except in the balena container.
So the problem is not publishing the service, but discovering it from a Python program running in a balena container.

Hi,

Thanks for the response. I’m going to test this here by having one device use the example project I referenced and another including the Avahi toolset to see if it will discover them. Could you let me know which version of balenaOS you’re using for the RPi4 and RPi3s?

Many thanks,

Heds

Thanks for testing it!!

I’m using balenaOS 2.41.0+rev4 on the raspi 4 and balenaOS 2.32.0+rev1 on the raspi Zero W.
On the raspi Zero I’m also using WiFi Connect.
Let me know if I can provide or test anything else.

Thanks very much for the detailed OS info. I’ll get back to you as soon as I have some more information to share!

Best regards,

Heds

Hello again!

Unfortunately, I don’t have an RPi Zero W to test on, but I’ve just carried this out using the example I gave to publish the service (on a Pi4), and used a Pi3 with a very simple application to discover the service. I used a modified version of the Dockerfile from our publishing example for discovery, including the Avahi daemon and utilities for the RPi3. This was carried out as a single service, so not using a docker-compose.yml manifest but just the Dockerfile/entry script:

FROM balenalib/%%BALENA_MACHINE_NAME%%-debian-node:10-buster

ENV container docker

RUN apt-get update && apt-get install -y --no-install-recommends \
        dbus \
        avahi-daemon \
        avahi-discover \
        avahi-utils \
        libnss-mdns \
        mdns-scan \
        systemd \
    && rm -rf /var/lib/apt/lists/*

RUN systemctl mask \
    dev-hugepages.mount \
    sys-fs-fuse-connections.mount \
    sys-kernel-config.mount \
    display-manager.service \
    getty@.service \
    systemd-logind.service \
    systemd-remount-fs.service \
    getty.target \
    graphical.target

RUN echo again

COPY entry.sh /usr/bin/entry.sh

STOPSIGNAL 37
ENTRYPOINT ["/usr/bin/entry.sh"]

CMD ["echo", "started"]

and where entry.sh is:

#!/bin/bash
set -m

GREEN='\033[0;32m'
echo -e "${GREEN}Systemd init system enabled."

# systemd causes a POLLHUP for console FD to occur
# on startup once all other process have stopped.
# We need this sleep to ensure this doesn't occur, else
# logging to the console will not work.
sleep infinity &
for var in $(compgen -e); do
        printf '%q=%q\n' "$var" "${!var}"
done > /etc/docker.env
exec /lib/systemd/systemd

I ensured I could see the RPi4’s published service from my local development machines (both macOS and Linux), then SSHed into the main service on the RPi3 from the balenaCloud dashboard and ran avahi-browse in that service container:

root@8ed6630:/# avahi-browse -r _zoo._tcp
+   eth0 IPv6 Zoo Animal Spotter                            _zoo._tcp            local
+   eth0 IPv4 Zoo Animal Spotter                            _zoo._tcp            local
=   eth0 IPv6 Zoo Animal Spotter                            _zoo._tcp            local
   hostname = [2f02fca.local]
   address = [fe80::dfbf:c11b:8e5f:64fe]
   port = [4567]
   txt = []
=   eth0 IPv4 Zoo Animal Spotter                            _zoo._tcp            local
   hostname = [2f02fca.local]
   address = [192.168.1.175]
   port = [4567]
   txt = []
Got SIGINT, quitting.
root@8ed6630:/# curl -XGET 2f02fca.local:4567
A Zookeeper has been spotted!

So, this is what I would expect, and it is showing that the service container with host networking is correctly able to find the published service. Unfortunately, at this point, it’s starting to sound like an issue with the Python zeroconf library. It would be really handy if you could run this test on your Pi Zero W, as that would confirm this.
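
If avahi-browse works in the container but the Python library does not, one possible stopgap is to call avahi-browse from Python and parse its machine-readable output. This is only a sketch: the function names `parse_avahi_browse` and `discover` are made up, and it assumes avahi-browse (from avahi-utils) and a running Avahi daemon are available in the container, as in the Dockerfile above. With `--parsable`, each record is one line of `;`-separated fields, and resolved records start with `=`.

```python
import subprocess


def parse_avahi_browse(output):
    """Parse `avahi-browse --parsable --resolve` output into a list of dicts."""
    services = []
    for line in output.splitlines():
        fields = line.split(";")
        # Resolved records look like:
        # =;interface;protocol;name;type;domain;hostname;address;port;txt
        if len(fields) < 9 or fields[0] != "=":
            continue
        services.append({
            "interface": fields[1],
            "protocol": fields[2],
            "name": fields[3],  # special characters are escaped, e.g. \032 for a space
            "hostname": fields[6],
            "address": fields[7],
            "port": int(fields[8]),
        })
    return services


def discover(service_type="_archerytimer._tcp", timeout=10):
    # --terminate makes avahi-browse exit after dumping what it currently knows.
    result = subprocess.run(
        ["avahi-browse", "--resolve", "--parsable", "--terminate", service_type],
        capture_output=True, text=True, timeout=timeout, check=False,
    )
    return parse_avahi_browse(result.stdout)
```

`discover()` returns one entry per interface and protocol on which the service was resolved, so the same service can appear twice (IPv4 and IPv6), as in the output above.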

Best regards,

Heds

Thank you again for looking at this.
I will test the examples you provided.
In the meantime I have been reviewing the zeroconf library that I’m using, and it uses multicast UDP to advertise and discover services. So maybe it is just a problem of being able to listen to multicast UDP in the container. I wonder if it could be just a port mapping issue?
Regards

There should be no problem using Multicast UDP inside a container. Can you please provide your docker-compose file so we can take a look for any issues there?
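
One quick way to check multicast reachability from inside the container, without any third-party library, is to send a single mDNS query by hand and see whether anything answers. This is only a diagnostic sketch using the standard library; the function names are made up. It relies on the mDNS rule (RFC 6762) that a query sent from a port other than 5353 is answered directly by unicast, so the socket does not need to join the multicast group. The optional `interface_ip` selects the outgoing interface on a device with more than one.

```python
import socket
import struct

MDNS_GROUP = "224.0.0.251"
MDNS_PORT = 5353


def encode_name(name):
    """Encode a DNS name as length-prefixed labels ending with a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("utf-8")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_ptr_query(service_type):
    # Header: id=0, flags=0, one question, no answer/authority/additional records.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    # Question: the name, then type PTR (12) and class IN (1).
    return header + encode_name(service_type) + struct.pack("!2H", 12, 1)


def probe(service_type="_archerytimer._tcp.local.", interface_ip=None, timeout=3.0):
    """Return the IP of the first responder, or None if nothing answered."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        if interface_ip is not None:
            # Send the multicast query out of this specific local interface.
            sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                            socket.inet_aton(interface_ip))
        sock.sendto(build_ptr_query(service_type), (MDNS_GROUP, MDNS_PORT))
        _data, addr = sock.recvfrom(9000)
        return addr[0]
    except OSError:  # covers timeouts and "network unreachable"
        return None
    finally:
        sock.close()
```

If `probe()` returns `None` on the device while the same call returns an address on a PC in the same network, the query or its reply is probably being dropped somewhere between the device and the publisher rather than inside the Python library.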

Hello,
The project is, for the moment, heavily based on the WiFi Connect example, and I’m starting to adapt it to my needs. This is the Dockerfile template:

FROM balenalib/%%BALENA_MACHINE_NAME%%-debian-python:3.7.4

# Set the maintainer
LABEL maintainer="Joe Roberts <joe@resin.io>, Zahari Petkov <zahari@resin.io>"

# Enable systemd init system
ENV INITSYSTEM on

# Set the working directory
WORKDIR /usr/src/app

# We have split up the resin-wifi-connect and Display-O-Tron HAT configuration to make clear
# the different parts needed. In your dockerfile you should combine these steps to reduce
# the number of layers.

# -- Start of resin-wifi-connect section -- #

# Set the device type environment variable using Dockerfile templates
ENV DEVICE_TYPE=%%BALENA_MACHINE_NAME%%

# Use apt-get to install dependencies
RUN apt-get update && apt-get install -yq --no-install-recommends \
    dnsmasq && \
    apt-get clean && rm -rf /var/lib/apt/lists/*

RUN apt-get update && apt-get -qq -y install curl

# Install resin-wifi-connect
RUN curl https://api.github.com/repos/balena-io/wifi-connect/releases/latest -s \
    | grep -hoP 'browser_download_url": "\K.*%%RESIN_ARCH%%\.tar\.gz' \
    | xargs -n1 curl -Ls \
    | tar -xvz -C /usr/src/app/

# -- End of resin-wifi-connect section -- #

# # -- Start of Display-O-Tron HAT section -- #

# # Use apt-get to install dependencies
RUN apt-get update && apt-get install -yq --no-install-recommends \
#    python-dev \
#    python-smbus \
#    python-psutil \
    wireless-tools && \
    apt-get clean && rm -rf /var/lib/apt/lists/*

# Upgrade pip
RUN pip install --upgrade pip
COPY requirements.txt .
RUN pip install --user -r requirements.txt --no-cache-dir --disable-pip-version-check \
                --index-url https://www.piwheels.org/simple

# # -- End of Display-O-Tron HAT section -- #

# Copy everything into the container
COPY . ./
#Make sure scripts in .local are usable:
ENV PATH=/root/.local/bin:$PATH
ENV DBUS_SYSTEM_BUS_ADDRESS=unix:path=/host/run/dbus/system_bus_socket
# Start application
CMD ["bash", "start.sh"]

And the docker-compose:

version: '2.1'
services:
  archerytimer-remote:
    build: .
    network_mode: "host"
    privileged: true

start.sh

#!/bin/bash

# Run one process loop
python src/process.py

# Start resin-wifi-connect
export DBUS_SYSTEM_BUS_ADDRESS=unix:path=/host/run/dbus/system_bus_socket

# 4. Is there an active WiFi connection?
iwgetid -r

if [ $? -eq 0 ]; then
    printf 'Skipping WiFi Connect\n'
else
    printf 'Starting WiFi Connect\n'
    ./wifi-connect
fi


# At this point the WiFi connection has been configured and the device has
# internet - unless the configured WiFi connection is no longer available.

# Start the main application
python src/main.py

and main.py

#!/usr/bin/env python

import time
import subprocess
import process
import requests

from button_controller import ButtonController

timer_ip = None
timer_port = "8090"


class MyListener:
    def __init__(self):
        print("init Zeroconf Listener")

    def remove_service(self, zeroconf, type, name):
        global timer_ip
        print("Service %s removed" % (name,))
        timer_ip = None

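    def add_service(self, zeroconf, type, name):
        global timer_ip
        import ipaddress
        info = zeroconf.get_service_info(type, name)
        if info is None or not info.addresses:
            # The service was announced but could not be resolved in time
            print("Service %s added, but it could not be resolved" % (name,))
            return
        timer_ip = str(ipaddress.IPv4Address(info.addresses[0]))
        print("Service %s added, service info: %s" % (name, info))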

def main():

    from zeroconf import ServiceBrowser, Zeroconf
    zero = Zeroconf()
    listener = MyListener()
    browser = ServiceBrowser(zero, "_archerytimer._tcp.local.", listener)  

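    while True:
        # Run one process loop
        try:
            process.main()
        except Exception as exc:
            # Report the error instead of silently swallowing it
            print("process.main() failed: %s" % (exc,))
        # Sleep to avoid 100% CPU usage, even after a failure
        time.sleep(10)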



if __name__ == "__main__":
    main()

The process.py is the same as the wifi connect:

#!/usr/bin/env python

import subprocess

def main():
    # Get the current SSID
    SSID = None
    try:
        SSID = subprocess.check_output(["iwgetid", "-r"]).strip()
    except subprocess.CalledProcessError:
        # If there is no connection subprocess throws a 'CalledProcessError'
        pass

    # Show status on the LCD display
    if SSID is None:
        print("Not connected")
    else:
        print("SSID: " + str(SSID))

if __name__ == "__main__":
    main()

The requirements.txt:

RPi.GPIO
gpiozero
requests
requests-jwt
zeroconf
numpy
pillow

I hope this helps.

Nothing stands out in what you’ve posted. Let us know if the examples we provided don’t work for you, which might point to issues with your setup or local network.

Hello,
I have found the problem. I was using WiFi Connect to test at home, where I have a “guest” WiFi network that doesn’t have access to the internal network, so the device doesn’t receive multicast packets from the local network. I feel stupid and I apologize for wasting your precious time.
I didn’t realize it before because, to be able to deploy the balena container, I attached a USB-Ethernet adapter that provides a wired interface, so I was able to access…
That doesn’t answer why the multicast wasn’t received on the wired interface…
Again, sorry for the inconvenience, and thank you for the amazing project and support.
As soon as I publish my projects (they will be under the GPL) I will provide some links.
Regards

@gpulido, no problem, thank you for sharing your findings. Yes, do share your project links when you have them! :)

That doesn’t answer why the multicast wasn’t received on the wired interface…

Hmm, just a thought: multicast packets may not be bridged between Ethernet and WiFi, especially if the Ethernet cable was straight from the device to a laptop. And even if the Ethernet cable goes to a WiFi router, it depends on router configuration.
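
On a device with both a WiFi and a wired interface, it may also help to check which local address actually shares a network with the publisher, and to tell python-zeroconf explicitly which interface to use. This is a sketch under assumptions: the helper names are made up, the addresses are examples, and it relies on the `interfaces` argument of python-zeroconf’s `Zeroconf` constructor, which accepts a list of local IP addresses. Being on the same subnet is necessary but not sufficient, since a guest network can still isolate clients from each other.

```python
import ipaddress


def on_same_network(local_cidr, remote_ip):
    """True if remote_ip lies inside the network of local_cidr, e.g. '192.168.1.60/24'."""
    network = ipaddress.ip_interface(local_cidr).network
    return ipaddress.ip_address(remote_ip) in network


def browse_on(interface_ips, service_type, listener):
    # Imported lazily; requires the python-zeroconf package.
    from zeroconf import ServiceBrowser, Zeroconf

    # Restrict mDNS traffic to the given local addresses instead of the default.
    zc = Zeroconf(interfaces=list(interface_ips))
    browser = ServiceBrowser(zc, service_type, listener)
    return zc, browser
```

For example, `on_same_network("192.168.1.60/24", "192.168.1.175")` is true, so a wired interface with that address would be the one to pass to `browse_on`.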