I have a small fleet (18 RPi Zero W) connected to a wifi network (actually there are 4 different login/PSK pairs).
Some RPis lose their connection and cannot reconnect, even though an RPi nearby is connected just fine. Usually, if I add a hotspot with some other wifi that was not showing before, the device does connect.
What I want to do is:
- Check the status of the wifi (connected or not);
- Somehow reset the wifi if it is not connected (so it starts looking for connections afresh, similar to a complete reboot).
Seems not. As I understood, NetworkManager automatically scans already.
But I'm in a situation where the device is sometimes disconnected even when a perfectly good wifi connection is available and the device has the credentials to connect.
In these cases a reboot always resolves the issue, but I want to "reboot" only the connection… (The reason this problem occurs is not clear yet.)
Hi,
What balenaOS version are you using on those RPis? Are you using any external WiFi dongle?
You should be able to restart the NetworkManager as a whole with the following dbus command:
In my application, I spin off a Python thread that is entirely devoted to waiting a set interval, checking for internet (below) and, on failure, running the above command (which, if run with os.system, blocks until the command finishes), then checking again to confirm it worked and logging the result for statistics.
My interval is 20 seconds, and my devices will "soft-reset" their network manager anywhere from 20 to several hundred times a day. I suspect this has something to do with the adapter I am using.
It seems to remind NetworkManager that it can and should connect to a connection it had stopped connecting to. Note that why it stops in the first place remains unknown to me.
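The watchdog thread described above could look roughly like the sketch below. It is not the production code mentioned in this thread: `monitor`, `check` and `reset` are hypothetical names, and both callables are supplied by the caller and assumed to return True on success.

```python
import logging
import threading


def monitor(check, reset, interval=20.0, stop_event=None):
    """Poll connectivity until stop_event is set.

    check: callable returning True when the device is online (assumed).
    reset: callable that attempts a soft reset, True on success (assumed).
    Returns the number of resets that were attempted.
    """
    if stop_event is None:
        stop_event = threading.Event()
    resets = 0
    while not stop_event.is_set():
        if not check():
            resets += 1
            # Reset, then check again to confirm the reset actually helped.
            recovered = bool(reset()) and check()
            logging.info("soft reset #%d %s", resets,
                         "ok" if recovered else "failed")
        # wait() doubles as the sleep and returns early once stop is requested.
        stop_event.wait(interval)
    return resets
```

It could be started with something like `threading.Thread(target=monitor, args=(my_check, my_reset), daemon=True).start()`, where `my_check` and `my_reset` are your own functions.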
Restarting the entire network manager service was never an option for me because it seems to forget that unmanaged devices exist after such a restart.
To access nmcli in a container, you have to follow the instructions in balena's networking documentation.
Good luck, and if you ever find out more about what's going on, let me know.
Since I have several possible connections, I don't want to manually set one of them (and check whether it is working, etc.).
What problems do you see with "it seems to forget that unmanaged devices exist after such a restart"?
I'm using logic similar to yours to detect whether the device is connected, but thought about using nmcli radio wifi off && sleep 5 && nmcli radio wifi on to reset the connections…
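If the radio toggle is driven from Python rather than a shell, a minimal sketch might look like this. `toggle_wifi_radio` is a hypothetical helper name; the `run` and `sleep` parameters exist only so the function can be exercised without nmcli installed.

```python
import subprocess
import time


def toggle_wifi_radio(pause=5.0, run=subprocess.run, sleep=time.sleep):
    """Mirror `nmcli radio wifi off && sleep 5 && nmcli radio wifi on`.

    Returns True only if both nmcli calls exit with status 0.
    """
    off = run(["nmcli", "radio", "wifi", "off"], check=False)
    if off.returncode != 0:
        # Like `&&` in the shell, stop at the first failing step.
        return False
    sleep(pause)
    on = run(["nmcli", "radio", "wifi", "on"], check=False)
    return on.returncode == 0
```

Whether toggling the radio actually clears the stuck state is untested in this thread, so treat it as something to try rather than a known fix.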
It makes sense that you don't want to mess with connections if you don't know which one is active. You could, as a last resort, try to bring each one up in turn and test between each. In addition, nmcli reports success or failure, which you could parse.
As for the problems with restarting it: if you un-manage a device (nmcli dev set wlan9 managed no), NetworkManager no longer touches that device. If you then restart NetworkManager, that device disappears from nmcli d s and can't be managed again. This is only relevant if you want to do other things with your adapters, like monitor mode.
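The "try each connection in turn" idea could be sketched as follows. `bring_up_first_working` and `is_online` are hypothetical names, and the connection names are whatever profiles exist on your device. The sketch relies only on nmcli's exit status (0 on success) instead of parsing its text output.

```python
import subprocess


def bring_up_first_working(names, is_online, run=subprocess.run):
    """Activate each named connection in turn and test after each attempt.

    Returns the first name for which nmcli exits 0 and is_online() is True,
    or None if no connection works.
    """
    for name in names:
        result = run(["nmcli", "connection", "up", "id", name],
                     capture_output=True, text=True, check=False)
        if result.returncode != 0:
            # nmcli explains the failure on stderr; move on to the next one.
            continue
        if is_online():
            return name
    return None
```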
nmcli radio wifi off sounds like it would work to me, but I have never tried it.
It says that nmcli connection up ifname "$DEVICE" is a valid command and would avoid you having to choose which connection you want to use. You just have to choose which adapter to use, which in your case is probably just wlan0.
Also if you just have one adapter, you could always just un-manage and manage it again. That would have to reset network manager.
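A sketch of the un-manage/manage cycle, followed by the `nmcli connection up ifname` call mentioned above, might look like this. `reset_by_unmanaging` is a hypothetical name, `wlan0` is assumed as the default adapter, and the 2-second pause is a guess rather than a measured value.

```python
import subprocess
import time


def reset_by_unmanaging(device="wlan0", pause=2.0,
                        run=subprocess.run, sleep=time.sleep):
    """Un-manage and re-manage a device, then let NM pick a connection.

    Returns True if every nmcli step exits with status 0.
    """
    steps = [
        ["nmcli", "device", "set", device, "managed", "no"],
        ["nmcli", "device", "set", device, "managed", "yes"],
        # With only ifname given, NetworkManager chooses the connection.
        ["nmcli", "connection", "up", "ifname", device],
    ]
    for index, command in enumerate(steps):
        if run(command, check=False).returncode != 0:
            return False
        if index < len(steps) - 1:
            sleep(pause)  # give NetworkManager time to settle between steps
    return True
```

Note that if the second step fails, the device is left un-managed, so a caller may want to retry that step before giving up.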
Let me know what ends up working for you. I would love to learn more in case we end up deploying on RPi ZWs in the future. (They were second on our short list.)
Hi!
I'm really interested in running this Python thread, but I wonder where to put the code and how to run it. I'm new to balenaCloud and have issues with wifi connectivity on four different Pis.
Hey @henrik
Are you referring to my mention of this thread here?
If so, that is part of the closed source production code base of the company I work for. I would have to seek permission to spin the WiFi connectivity checking and control thread off into a separate open source project. To be clear, I think I could get permission to do so, but I want to make sure that it will serve your need before doing so.
I think I can describe its function, as it isn't any secret.
This thread is designed to be spun off the main process and report and maintain the connectivity status.
It does several things to accomplish that.
First it writes a new network connection with nmcli that is adapter specific and sets it up:
As you can see, it relies on an object framework that fills in the important settings.
Then it can use that connection to accomplish some of the more useful debugging steps below.
The connectivity loop is probably more what you're after:
Over and over it checks for internet with
    def soft_reset(env):
        print_command('ip route flush 0/0')
        if not env.current_connection:
            if env['in_the_lab']:
                print_command('nmcli c up dm_debug')
            else:
                print_command(f'nmcli c up {env["primary_con_name"]}')
        else:
            env.current_connection.up()
        status = internet()
        if status:
            logging.info(f"Soft-reset complete, you should be able to see this")
        else:
            logging.info(f"Soft-reset failed")
        return status
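The soft_reset snippet calls two helpers, print_command and internet, whose bodies are not shown in this thread. The versions below are guesses at what such helpers could do, not the original implementations. In particular, probing 1.1.1.1 on port 53 is an assumption; any host you trust to be reachable would work.

```python
import logging
import shlex
import socket
import subprocess


def print_command(command, timeout=60):
    """Run a command without a shell, log its output, return its exit code."""
    result = subprocess.run(shlex.split(command), capture_output=True,
                            text=True, timeout=timeout, check=False)
    output = (result.stdout + result.stderr).strip()
    logging.info("%s -> exit %d: %s", command, result.returncode, output)
    return result.returncode


def internet(host="1.1.1.1", port=53, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False
```

A TCP probe only shows that one host answers. A captive portal or a DNS failure could still fool it.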
That solves the bulk of solvable network issues in my experience. However, we ship with multiple adapters, and if there is anything NetworkManager sucks at, it is having multiple wifi adapters and only one set of connection settings.
So the next step would be to rotate adapters by managing and un-managing them in NM.
This is probably not relevant to most people.
@henrik
What are your needs from this python thread?
What kind of problems are you looking to solve?
What would be the most useful way for me to share this code? My preference would be a simple Python module that you can pull in from GitHub and launch with a simple Thread() call.
As additional information, NetworkManager itself has a built-in connectivity check, and it is really easy to get the overall connectivity state of the device by using it.
You can check nmcli gen, and to see the precise value, nmcli -t -f CONNECTIVITY gen.
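Reading that value from Python could be sketched as below. `nm_connectivity` and `parse_connectivity` are hypothetical helper names. The state names are the ones NetworkManager reports (unknown, none, portal, limited, full), and the sketch assumes nmcli is installed in the container.

```python
import subprocess

KNOWN_STATES = {"unknown", "none", "portal", "limited", "full"}


def parse_connectivity(output):
    """Normalise the terse nmcli output to one of the known state names."""
    state = output.strip().lower()
    return state if state in KNOWN_STATES else "unknown"


def nm_connectivity(run=subprocess.run):
    """Ask NetworkManager for its own connectivity verdict via nmcli."""
    result = run(["nmcli", "-t", "-f", "CONNECTIVITY", "general"],
                 capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return "unknown"
    return parse_connectivity(result.stdout)
```

A watchdog could then treat anything other than "full" as a reason to attempt a reset.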
Alternatively the same could also be done using the D-Bus API (e.g. with a library like python-networkmanager for example). My personal preferred way is by using the API, since I do not have to install nmcli in the container, but both are perfectly fine.
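As a rough sketch of the D-Bus route: this uses the dbus-python package rather than python-networkmanager, which is a swap made for illustration. It assumes dbus-python is installed and the system bus is reachable from the container. On balena that typically means enabling the D-Bus feature and pointing DBUS_SYSTEM_BUS_ADDRESS at the host socket. The helper names are hypothetical.

```python
# NMConnectivityState values from the NetworkManager D-Bus API.
CONNECTIVITY_NAMES = {
    0: "unknown",
    1: "none",
    2: "portal",
    3: "limited",
    4: "full",
}


def connectivity_name(value):
    """Translate the numeric Connectivity property into its name."""
    return CONNECTIVITY_NAMES.get(int(value), "unknown")


def read_connectivity():
    """Read NetworkManager's Connectivity property over the system bus."""
    import dbus  # third-party dbus-python, imported lazily (assumption)

    bus = dbus.SystemBus()
    proxy = bus.get_object("org.freedesktop.NetworkManager",
                           "/org/freedesktop/NetworkManager")
    props = dbus.Interface(proxy, "org.freedesktop.DBus.Properties")
    value = props.Get("org.freedesktop.NetworkManager", "Connectivity")
    return connectivity_name(value)
```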
@henrik
I would be interested in your experience solving these issues because, while our fleet is much healthier than it used to be, there is still a percentage of devices that only connect during the night due to WiFi network congestion (as best we can tell).
Anything you learn while solving this might help me one day, so I would love to hear how you're doing.
@majorz
The D-Bus API seems the way to go if you really want to unlock the full configuration options of NetworkManager. But I picked nmcli because I could run and test the commands by hand, and I had never used D-Bus for anything and didn't have time to learn how.
One thing I notice is that nmcli has poor error handling, and using it from a Python script means that, at best, errors get sent to the logs un-parsed. If I were to do the project again, I would use D-Bus.
I didn't know the nmcli gen commands. I will incorporate them into my stuff the next time I take a stab at further solving this issue on our fleet.