Unreliable Networking on Raspberry Pi W

Hi all,

I’ve been using balenaOS for some time now and for the most part it’s been great, but I’ve had nothing but issues with networking. Originally, I had each Raspberry Pi Zero W connected over Wi-Fi, and I would experience random connectivity problems where I was unable to reach the devices via the balena dashboard. They would show as offline, heartbeat only, no VPN connection, et cetera. Thinking that this was a Wi-Fi problem, I purchased some micro USB to Ethernet adapters, but the same problems ensued. There are 4 devices total at this location, all Raspberry Pi Zero Ws, all connected via Ethernet to an unmanaged network switch, then from the switch to the modem/router. This is on a business class internet connection that very frequently has outages on other devices, and there are no outbound firewall rules or similar devices between the Pis and the outside internet. I’ve ensured that all ports and hosts from the “Network Requirements” article in the balenaOS documentation are reachable. I’ve been able to connect to the devices for short amounts of time (<1 minute) before they disconnect. Sometimes shorter, sometimes longer; I haven’t noticed any patterns. Any help or pointers would be greatly appreciated.

Also, I originally contacted balena about this issue, and they directed me to this forum. They mentioned that including a traceroute could be useful in solving this problem, so here are the results from before it disconnected:

root@f399e03:~# traceroute api.balena-cloud[.]com
traceroute to api.balena-cloud[.]com (52.202.238.221), 30 hops max, 38 byte packets
1 _gateway (192.168.1.1) 1.191 ms 0.740 ms 0.972 ms
2 * * *
3 B4305.BFLONY-LCR-21.verizon-gni[.]net (100.41.223.230) 17.155 ms 16.015 ms B4305.BFLONY-LCR-22.verizon-gni[.]net (100.41.5.244) 12.687 ms
4 * * *
5 * * *
6 0.ae16.GW14.IAD8.ALTER[.]NET (140.222.226.37) 19.864 ms 20.501 ms 0.ae15.GW14.IAD8.ALTER[.]NET (140.222.226.31) 23.352 ms
7 204.148.170.66 (204.148.170.66) 23.905 ms 21.319 ms 21.450 ms
8 * * *
9 * * *
10 52.93.28.98 (52.93.28.98) 24.710 ms 25.448 ms 52.93.28.102 (52.93.28.102) 25.046 ms
11 * * *
12 * * *
13 * * *
14 * * *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * *

Also, please note in the hostnames above, I have added the [.] myself. I tried posting it and it said new users could only post 5 links, so I’ve put the brackets to allow me to post this still.

Hi

Welcome to the forums!

  • Can you enable persistent logging on the devices, so that you can access the logs later and across boots?
  • Do you see anything suspicious under dmesg -wH?
  • Also take a look at our HostOS masterclass, and especially the section on NetworkManager logs. If you can share verbose logs, it would help immensely to figure out what’s going on.
  • Also, as I understand from your message, this is all at one site. You haven’t been able to recreate these issues locally, right? The fact that you are seeing issues with Wi-Fi as well as wired is strange. Can you tell us what the power situation is like? What power supply are you using on all the devices? Is it the same for the devices that are showing issues? Do you have more hardware connected to the Pi Zeros that might be consuming current?
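
As a starting point for the kernel-log check above, a small filter can pull out the lines that tend to matter most on a Pi Zero W: under-voltage warnings and Wi-Fi driver (brcmfmac) or link messages. This is only a sketch. The function name suspicious_kernel_lines and the pattern list are assumptions made for illustration, not something taken from the balena documentation.

```shell
#!/bin/sh
# Hypothetical helper (not part of balenaOS): keep only kernel log lines
# read from stdin that hint at power or link problems.
# The pattern list is an assumption; extend it as needed.
suspicious_kernel_lines() {
    grep -i -E 'under-voltage|brcmfmac|link is not ready|link is down'
}

# Usage on the device (host OS):
#   dmesg | suspicious_kernel_lines
```

If under-voltage lines show up, the power supply or cabling would be the first thing to look at; if only brcmfmac lines appear while on Ethernet, they may be unrelated noise from the unused Wi-Fi interface.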

Yes, this was already enabled as part of my own troubleshooting.

Yes, it is this message repeated about every 5 minutes.

   [Feb22 14:14] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
   [  +0.000245] brcmfmac: brcmf_cfg80211_set_power_mgmt: power save disable

I’ve turned this on using nmcli general logging level DEBUG domain ALL, but I’m confused on how to actually read that log now. Is it journalctl or is that from the container only?

I haven’t tried to recreate it locally yet (I can try this soon on Wi-Fi, but I don’t have another spare Ethernet adapter). They’re plugged into a Raspberry Pi power supply that’s 5V 2.5A which should be plenty. Besides power, Ethernet is plugged in as well as the HDMI port to a TV.

Hello,

  • Can you please confirm the current balenaOS version you are using?
  • I’ve turned this on using nmcli general logging level DEBUG domain ALL, but I’m confused on how to actually read that log now. Is it journalctl or is that from the container only?

You should now have more verbose output when using journalctl. Can you please check whether there is anything peculiar?
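
To narrow the journal down to NetworkManager, something along these lines could help. It is a sketch under two assumptions: that the systemd unit on the host OS is named NetworkManager, and that NetworkManager tags its lines with <warn> and <error>. The helper name nm_problems is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep only NetworkManager warnings and errors
# from journal text supplied on stdin.
nm_problems() {
    grep -E '<(warn|error)>'
}

# Usage on the host OS (current boot only, no pager):
#   journalctl -u NetworkManager -b --no-pager | nm_problems
```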

  • Can you please provide more details about those usb ethernet adapters?

Thanks

Yes, it’s balenaOS 2.54.2+rev1.

I was only able to connect for a few seconds before it quit, but taking a scroll through it, it seems to all be openvpn and kernel logs, no warnings or errors. I can keep trying, but I think the comment below helps more towards solving the situation.

Yes, here’s the link to it. Upon taking another look at the Amazon page, I saw some reviews mentioning duplicate MAC addresses. Checking the balena dashboard confirms that this is the case here, which is perhaps causing interference. Is there a way to manually set the MAC address in balenaOS, and would something like that help in this situation?
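
For what it’s worth, here is a rough sketch of how the duplicates could be confirmed and how a distinct address per device might be derived. Everything in it is illustrative: the helper names, the "name MAC" input format, and the idea of hashing the device UUID are assumptions. NetworkManager does support a cloned-mac-address setting in the [ethernet] section of a connection profile, but whether that is the right fix on balenaOS has not been confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read "name MAC" pairs on stdin and print each MAC
# that appears more than once (compared case-insensitively).
duplicate_macs() {
    awk '{ print tolower($2) }' | sort | uniq -d
}

# Hypothetical helper: derive a stable MAC from an identifier such as the
# device UUID. The leading 02 octet marks a locally administered unicast
# address. Assumes md5sum is available.
mac_from_id() {
    hash=$(printf '%s' "$1" | md5sum | cut -c1-10)
    printf '02:%s\n' "$(printf '%s' "$hash" | sed 's/../&:/g; s/:$//')"
}

# Profile fragment the result could feed (NetworkManager keyfile format):
#   [ethernet]
#   cloned-mac-address=02:xx:xx:xx:xx:xx
```

Hashing the UUID keeps the address stable across reboots without having to track assignments by hand, at the cost of a tiny collision risk.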

From your original post:

This is on a business class internet connection that very frequently has outages

Did you mean it very infrequently has outages?

Sorry to have to ask this, but I just want to make sure.

Yes, sorry that was a typo on my end. Very infrequent outages, yes.

Ok, I mean, I was 99% sure that’s what you meant, but just wanted to verify.

Can you tunnel into the device locally and post the logs from journalctl? You should be able to tunnel with balena ssh [uuid].local, where uuid is the short UUID (7 characters).

Yes, I was able to, but the log is very lengthy, so I’ve posted it here.

I looked in the logs you posted and there’s nothing obvious to me. However, it doesn’t look like NetworkManager is in debug mode either. Could you try running nmcli general logging level DEBUG domain ALL in the host OS, wait a bit, and get the logs again?
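
One quick way to tell whether a captured log was taken with debug logging active is to look for NetworkManager lines tagged <debug>. A sketch, assuming the journal has been saved to a file; the helper name has_nm_debug and the file name nm.log are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given log file contains at least one
# NetworkManager line logged at debug level.
has_nm_debug() {
    grep -q 'NetworkManager.*<debug>' "$1"
}

# Usage on the host OS:
#   nmcli general logging        # prints the current level and domains
#   journalctl -u NetworkManager -b --no-pager > nm.log
#   has_nm_debug nm.log && echo "debug logging is active"
```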

From your messages, it looks like you might be running NetworkManager inside a container. Is this the case?