Network Manager Device Issues

System Details

Problem

We are using a Raspberry Pi 3 and a USB WiFi adapter to connect our devices to the internet as well as to some internal networks. Here is an interface breakdown:

wlan0 (internal to the RPI3): hosts a hotspot
wlan1 (WiFi adapter): connects to the internet
eth0 (internal to the RPI3): connects to an internal network

We use a Docker Compose solution. When one of our processes starts up, the first thing it does is create the hotspot on wlan0. The hotspot is tied to wlan0 via the connection configuration (see below). The process will enter a loop of starting the hotspot if it detects the hotspot failed to activate. This is why in one of the network manager logs below you will see a lot of activation attempts for the hotspot. When the USB WiFi adapter is not plugged in, everything works as expected. However, when the WiFi adapter is plugged in before boot, we get into situations where wlan0 becomes “unusable”.

Meaning of unusable:

  • When using nmcli to attach a connection to it (“nmcli c u hotspot”):
    • Error: Connection activation failed: No suitable device found for this connection.

  • It shows up as “disconnected” via “nmcli d”
  • When I issue “nmcli d c” to “connect” the wlan0 interface:
    • Error: Failed to add/activate new connection: A ‘wireless’ setting is required if no AP path was given.

    • If there is another connection available beside the hotspot, it will choose the other connection and not the hotspot. It works fine in this context.
  • The logs keep showing this error only when I reproduce the problem - otherwise this doesn’t show:
    • device (wlan0): set-hw-addr: new MAC address C2:33:52:A8:5A:C8 not successfully set (scanning)

If the WiFi adapter is plugged in after the RPI3 boots, things work fine. This is only a problem if the WiFi adapter is plugged in when the RPI3 boots.

We’ve been using this solution for months and never noticed any issues. Based on my debugging, I think these are the important factors to consider:

  • The WiFi adapter being plugged in at device boot seems to somehow cause this problem. This is the only way I’ve been able to reproduce the issue.
  • The issue seems specific to wlan0 and the hotspot connection. In the error state, other connections work on wlan0 - only the hotspot fails.

I’m looking for any insight into this problem. We want things to work the way we have them now and want to avoid changing which interface we attach our connections to if possible. I may attempt to set an autoconnect priority as well as define a set number of retries on the hotspot connection, but based on my debugging I don’t have faith that those changes will get to the root of the problem.

If I can provide any more details, please let me know.

Relevant Data / Logs

Hotspot Configuration

	hotspotConnection := map[string]dbus.Variant{
		"id":             dbus.MakeVariant("hotspot"),
		"uuid":           dbus.MakeVariant("some_uuid"),
		"type":           dbus.MakeVariant("802-11-wireless"),
		"autoconnect":    dbus.MakeVariant(true),
		"interface-name": dbus.MakeVariant("wlan0"),
	}

	hotspotWifi := map[string]dbus.Variant{
		"band": dbus.MakeVariant("bg"),
		"mode": dbus.MakeVariant("ap"),
		"ssid": dbus.MakeVariant([]byte("some_ssid")),
	}

	hotspotWifiSecurity := map[string]dbus.Variant{
		"key-mgmt": dbus.MakeVariant("wpa-psk"),
		"psk":      dbus.MakeVariant("some_psk"),
	}

	hotspotIPV4 := map[string]dbus.Variant{
		"method": dbus.MakeVariant("shared"),
		"address-data": dbus.MakeVariant([]map[string]dbus.Variant{
			{
				"address": dbus.MakeVariant(settings.ip),
				"prefix":  dbus.MakeVariant(settings.ipPrefix),
			},
		}),
	}

	hotspotIPV6 := map[string]dbus.Variant{
		"method": dbus.MakeVariant("auto"),
	}

	return networkmanager.ConnectionConfiguration{
		"connection":               hotspotConnection,
		"802-11-wireless":          hotspotWifi,
		"802-11-wireless-security": hotspotWifiSecurity,
		"ipv4":                     hotspotIPV4,
		"ipv6":                     hotspotIPV6,
	}

journalctl -u NetworkManager Logs

These were a bit annoying to grab off the device. Hopefully they help.
nmoutput-boot (2).log (30.0 KB)
nmoutput (2).log (614.4 KB)

Misc Logs

error dmesg | grep wlan0

root@eeed17e:~# dmesg | grep wlan0
[    9.223128] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[    9.825740] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[    9.932582] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   11.569888] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   12.925996] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[  357.116488] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[  672.113766] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready

success dmesg | grep wlan0

[    9.471633] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[    9.514793] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   10.039566] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   10.382145] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   11.442040] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   12.320801] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[   36.724186] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   37.315847] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   37.643061] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready

working ifconfig dump of wlan0

root@9885f88:~# ifconfig wlan0
wlan0     Link encap:Ethernet  HWaddr B8:27:EB:44:AE:69
          inet addr:10.0.0.1  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::e6c:fd74:3c42:626f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:7 overruns:0 frame:0
          TX packets:140 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:24067 (23.5 KiB)

error state wlan0

wlan0     Link encap:Ethernet  HWaddr 50:3E:AA:31:61:4C
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

driver

root@eeed17e:~# readlink /sys/class/net/wlan0/device/driver
../../../../../../../../bus/sdio/drivers/brcmfmac

After speaking with some folks at Resin, it seems the issue is the renaming of the devices by Linux. The interface names wlan0 and wlan1 are not assigned consistently across boots, which is ultimately causing the problems I am seeing.

Apparently this is the fix for the issue which will allow developers to ensure naming consistency for devices.

One thing that isn’t clear is why this only happens in Resin OS versions 2.15 and above. I’ve tested every OS version from 2.12 up to 2.15 - everything works as you would expect, consistently, until you get to 2.15. At that point, things fail consistently.

The tentative fix I’ve employed is to simply make sure I don’t have my wlan1 device plugged into the RPI3 at boot. This ensures wlan0 is properly named. After the device boots, I can plug the wlan1 device into the RPI3 and things will work as expected (until a reboot).

I think you can do the following:

Instead of tying the hotspot to a device with connection.interface-name, you may use the persistent MAC address of the device - the 802-11-wireless.mac-address property of the connection profile.

You can iterate over the devices and see what driver the device is using and target this particular device. If you are using the AddAndActivateConnection method of the NetworkManager D-Bus API, you will be passing the device as an argument, and then the mac-address will be automatically filled in, so you do not have to specify it manually. It will not specify the interface-name.
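
A minimal sketch of that selection logic in Go, under some assumptions: the Device struct and both helper functions are hypothetical names, the device list is hard-coded rather than read over D-Bus (in practice it would come from GetDevices plus the Interface, Driver and HwAddress properties), and "rtl8192cu" is only a placeholder driver name for the USB adapter. The MAC addresses are the ones from the ifconfig dumps above.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// Device holds the few NetworkManager device properties this sketch needs.
// Hypothetical type: a real program would fill it from the D-Bus API.
type Device struct {
	Interface string // kernel name such as "wlan0"; may differ between boots
	Driver    string // "brcmfmac" is the Pi 3's built-in radio (see readlink output)
	HwAddress string // colon-separated MAC address
}

// findByDriver returns the first device bound to the given driver.
func findByDriver(devices []Device, driver string) (Device, error) {
	for _, d := range devices {
		if d.Driver == driver {
			return d, nil
		}
	}
	return Device{}, errors.New("no device uses driver " + driver)
}

// hotspotWifiSettings builds the 802-11-wireless section keyed on the MAC
// address instead of connection.interface-name. The MAC is stored as raw
// bytes, matching how the ssid is stored in the original configuration.
func hotspotWifiSettings(d Device, ssid string) (map[string]interface{}, error) {
	mac, err := net.ParseMAC(d.HwAddress)
	if err != nil {
		return nil, err
	}
	return map[string]interface{}{
		"band":        "bg",
		"mode":        "ap",
		"ssid":        []byte(ssid),
		"mac-address": []byte(mac),
	}, nil
}

func main() {
	// Error-state naming: the USB adapter took wlan0, the built-in radio wlan1.
	devices := []Device{
		{Interface: "wlan0", Driver: "rtl8192cu", HwAddress: "50:3E:AA:31:61:4C"}, // placeholder driver
		{Interface: "wlan1", Driver: "brcmfmac", HwAddress: "B8:27:EB:44:AE:69"},
	}
	builtin, err := findByDriver(devices, "brcmfmac")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	wifi, err := hotspotWifiSettings(builtin, "some_ssid")
	if err != nil {
		fmt.Println("bad MAC address:", err)
		return
	}
	fmt.Printf("hotspot goes on %s with settings %v\n", builtin.Interface, wifi)
}
```

Since NetworkManager may temporarily change a Wi-Fi device's MAC while scanning (the set-hw-addr log line above hints at this), reading the permanent address rather than the current one is probably the safer input here.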

Please let me know if you need more information on this.

Thanks - I’m currently going to make sure that I can swap the interfaces for the connections (WiFi and hotspot) in the error scenario. If I can, then an approach like what you described should work. As long as we have two functioning interfaces to work with, we should be able to put something together. However, if the issue I’m seeing is making one of the interfaces totally useless, I will not be able to make a workaround as both interfaces are required for my use-cases.

I’ll update this thread with my findings.

I have verified that swapping the interfaces the connections are associated with solves the problem. My hotspot can run on wlan1 in the error case and the WiFi network can run on wlan0. This makes sense considering that the connections are still running on the same hardware; only the interface name differs from the OS’s perspective.

Would you recommend I just execute the readlink command on the relevant device paths to determine the driver? I am indeed using the Network Manager DBUS API. I would execute some command to identify which piece of hardware each wlan* interface name refers to. Once I have this understanding, I can associate the connections with the proper interface name.

You can use the Driver property of the Device interface instead of using the lower level tools: https://developer.gnome.org/NetworkManager/stable/gdbus-org.freedesktop.NetworkManager.Device.html#gdbus-property-org-freedesktop-NetworkManager-Device.Driver

You may retrieve the device list through the GetDevices method: https://developer.gnome.org/NetworkManager/stable/gdbus-org.freedesktop.NetworkManager.html#gdbus-method-org-freedesktop-NetworkManager.GetDevices

All that you need can be achieved only through the NetworkManager D-Bus API. Please let me know if you encounter any difficulties or have any questions regarding the API.

Nice - I’m very familiar with using the DBUS API for NM, so I should be good. Thanks for the insight!

I just wanted to update the thread with the final resolution. We ended up waiting until Balena supported udev rules and then implemented them accordingly. Since then, we’ve had no issues of this nature. The decision to wait for Balena came down to whether we wanted to implement a “hacky” solution to solve the problem or just wait until a cleaner path became available. Our fix in the meantime was to control the order in which the devices were plugged in. We’d let the device boot and then after a few seconds plug in the USB adapters.

"SUBSYSTEMS==\"usb\", SUBSYSTEM==\"net\", ACTION==\"add\", ATTRS{removable}==\"removable\", ATTRS{manufacturer}==\"Realtek\", ATTRS{product}==\"802.11n NIC\", NAME=\"usb-wireless\"\n"

A similar rule was created for the “usb-ethernet” device.