Reliable and updatable access point

Hi Zahari,

Thank you for your response.
I saw that you’ve tried the AP+STA mode before in another topic. I was happy to find that out, but after some investigation I saw that it has some downsides, like only being able to connect to Wi-Fi networks on the same channel as the AP. And I’m not really sure if it’s reliable enough.

I’ve put some work into creating a native dbus library in Node.js, which works great. But thanks for the heads-up about the libnm library.

If you’ve some suggestions, I’m happy to hear them!
The next challenge is determining if the device is online or not in a reliable way, so that the Access Point can be started when the device is offline.

I’ll keep posting updates about my progress.

Awesome, good luck!

Thanks!

Because I don’t want to be misinformed: is it true that using AP+STA can cause problems? I’ve read that it can, for example when the AP and the target network are on different channels. I don’t know how thoroughly you’ve tested it, but I’d like to know if it’s stable and can be used in any situation, so whether it can still connect to any 2.4GHz and 5GHz network. Because it’s obviously the best solution to have AP+STA on the onboard Wi-Fi chip all of the time!

I have not done much testing yet. I initially ignored the AP+STA mode mainly because it was a known issue that it cannot work with NetworkManager, until I was able to run it successfully recently.

From what I have read as well, AP+STA could indeed have the channel issues. I have not looked at the actual implementation of the mode inside the kernel. My guess is that this depends on the capabilities of the particular chip/driver/firmware. Since the RPi 4 has a relatively new chip, it would probably behave better and may not have those limitations. With the fairly recent Atheros chip in my laptop that I was testing with, I did not run into the channel issue at all.

Hereby an update with my progress:

Device online/offline
I’ve written a function that checks if the device is online or not. It executes the following steps:

  1. Check if device is in AP mode. If true, device is offline. If false, continue
  2. Check device state, if disconnected, device is offline. If not, continue
  3. If device state is not ‘connected’, wait for this state
  4. Ping a couple of hosts (like Google’s DNS / CloudFlare’s DNS, both IPv4 and IPv6). If one ping succeeded, device is online. If not, continue
  5. If no ping succeeded (which could be a firewall issue), execute an HTTP(S) request to our own server. If it completes, the device is online. Otherwise, the device is offline.

If this check determines that the device is offline, it scans all Wi-Fi networks and saves the results in a variable. Then it starts the AP.
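The steps above could be sketched in Node.js roughly like this. It is only a minimal sketch, not the actual implementation: the probe functions (`isApMode`, `getState`, `waitForConnected`, `pingAny`, `httpPing`) are hypothetical names that are passed in, so the decision logic stays separate from the D-Bus and network code. The listed hosts are the public Google and Cloudflare DNS addresses.

```javascript
// Hypothetical sketch of the five-step online check.
// Every probe is injected; none of these names come from a real library.
const PING_HOSTS = [
  '8.8.8.8',              // Google DNS, IPv4
  '1.1.1.1',              // Cloudflare DNS, IPv4
  '2001:4860:4860::8888', // Google DNS, IPv6
  '2606:4700:4700::1111', // Cloudflare DNS, IPv6
];

async function isOnline({ isApMode, getState, waitForConnected, pingAny, httpPing }) {
  // Step 1: a device that is serving the access point counts as offline.
  if (await isApMode()) return false;

  // Step 2: a disconnected device is offline.
  const state = await getState();
  if (state === 'disconnected') return false;

  // Step 3: any other non-connected state means we wait for 'connected'.
  // Assumption: waitForConnected() resolves to false if that never happens.
  if (state !== 'connected' && !(await waitForConnected())) return false;

  // Step 4: a single successful ping is enough.
  if (await pingAny(PING_HOSTS)) return true;

  // Step 5: pings may be blocked by a firewall, so fall back to HTTP(S).
  return Boolean(await httpPing());
}
```

Injecting the probes also makes it easy to exercise each branch with stubs before wiring in the real NetworkManager calls.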

Connect to configured access points
I’ve also created a function that tries to connect to networks that are already configured. This function does the following:

  1. Check if device is online (see above). If it is, return that it’s connected. If not, continue
  2. Get all configured connections, filtered by Wi-Fi connections. If none found, stop. If not, continue
  3. Scan Access Points
  4. If an access point is found that has a configured connection (matched by SSID) OR if a config is found with hidden: true, continue. Else, stop
  5. Try connecting to configured access points that are found (or hidden). If succeeded, awesome. If not, return that it’s not awesome.
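As a rough illustration of steps 2 to 5, here is a minimal Node.js sketch. The profile shape (`type`, `mode`, `ssid`, `hidden`) and the `activate` callback are hypothetical and not NetworkManager types. Skipping the AP profile itself is my own assumption, added so the device never tries to "connect" to its own access point configuration.

```javascript
// Picks the saved profiles worth trying: Wi-Fi profiles that are either
// visible in the scan results or marked as hidden.
// Assumption: the device's own AP profile (mode 'ap') is skipped.
function selectCandidates(profiles, scannedSsids) {
  const visible = new Set(scannedSsids);
  return profiles.filter(
    (p) => p.type === 'wifi' && p.mode !== 'ap' && (p.hidden === true || visible.has(p.ssid)),
  );
}

// Tries each candidate in order and stops at the first one that connects.
// `activate` is a hypothetical function that resolves once the connection
// is up and rejects when activation fails.
async function connectToConfigured(profiles, scannedSsids, activate) {
  for (const profile of selectCandidates(profiles, scannedSsids)) {
    try {
      await activate(profile);
      return { connected: true, ssid: profile.ssid };
    } catch (err) {
      // This candidate failed; fall through to the next one.
    }
  }
  return { connected: false };
}
```

The caller can then start the access point again whenever the result has `connected: false`.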

After this function, based on the response, the access point can be started again.
I had to create this function, because the built-in function of NetworkManager didn’t work as expected. If I activate the connection "/" on a specific device, according to the documentation, it should automatically connect to the best configuration found. But, according to NetworkManager, the best configuration is the AP configuration :sweat_smile:, even with autoconnect: false.

I’ve also improved my NetworkManager library. The scan function now handles errors properly. And when activating a connection, the library waits for the device state to change to connected, or throws an error if the state changes to failed. So after activating a network, you can be certain that it’s really connected and didn’t fail along the way.
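The "wait for connected or failed" behaviour can be sketched as a promise around state-change events. This is a sketch under assumptions, not the library's code: `device` is assumed to be an `EventEmitter` that emits a `'state'` event with a string such as `'connected'` or `'failed'`, and the timeout is my addition so a stuck activation cannot hang forever.

```javascript
const { EventEmitter } = require('node:events');

// Resolves when `device` reports 'connected', rejects on 'failed' or after
// `timeoutMs`. The 'state' event name and its string values are hypothetical.
function waitForConnected(device, timeoutMs = 30000) {
  return new Promise((resolve, reject) => {
    const onState = (state) => {
      if (state === 'connected') finish(resolve);
      else if (state === 'failed') finish(reject, new Error('activation failed'));
      // Any other state (e.g. still configuring) keeps waiting.
    };
    const timer = setTimeout(
      () => finish(reject, new Error('timed out waiting for connection')),
      timeoutMs,
    );
    // Always clean up the timer and the listener before settling.
    function finish(settle, value) {
      clearTimeout(timer);
      device.off('state', onState);
      settle(value);
    }
    device.on('state', onState);
  });
}

// Usage with a stand-in emitter:
const fakeDevice = new EventEmitter();
waitForConnected(fakeDevice, 1000).then(() => console.log('connected'));
fakeDevice.emit('state', 'connected');
```

Removing the listener in every exit path matters here, because the same device object is reused for each activation attempt.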

If someone has any suggestions about the functions I’ve created, please let me know!
I’m planning on publishing the NetworkManager library (when it’s mature enough), so more people can work on it and can improve it.

Has anyone tried to create an access point with NetworkManager and, when you connect to the AP, a captive portal shows up? It’d be really nice if no Wi-Fi is set-up on the device yet, and you connect to the AP, a captive portal would show up with my custom UI to connect to a Wi-Fi network. But after some googling, I really don’t know where to start, because most use hostapd with nodogsplash. But in the case of balenaOS, the situation is really different. So I’d like to know if anyone has tried it and if it’s a success or not.

Added question
Also, is it possible to change the DNS settings while it’s in AP mode, so that every request redirects to the router IP, so the client will always get the Wi-Fi connection UI?

@bversluijs the workflow you describe sounds exactly like what wifi-connect does, but I know that has been mentioned a few times in this thread, so just trying to understand what wifi-connect is missing in that regard?

For the question on DNS, I believe you should be able to define DNS settings on the AP connection file when you create it, not sure if you can change that on the fly though. I think @majorz might be able to give you more insight on that front

You need to start dnsmasq separately. It will take care of both the DHCP assignment and the DNS redirection that makes the captive portal appear. Here are the arguments needed: https://github.com/balena-io/wifi-connect/blob/master/src/dnsmasq.rs.

Please note that this won’t work if you have enabled Internet sharing on the access point through NetworkManager, because NetworkManager then starts its own dnsmasq instance and adds iptables rules (this is the method=shared setting I am talking about here). For the captive portal functionality to work, you need to create the access point with the manual method and assign a gateway IP in the connection profile, e.g. the 192.168.42.1 address.

You can start WiFi Connect and while its access point is running you can check the created NetworkManager profile to see how it is configured.

Hi,

Thanks for your detailed explanation. A lot of my inspiration comes from wifi-connect, so it’s fantastic that the project exists. I’m not using it, because I’m not familiar with Rust and I’m not planning to get familiar with it at the moment. And because I’m programming most of my code in Node.js and Golang, it’s always fun and interesting to create those projects in another language, in this case Node.js.
Also, some of my requirements are probably different from wifi-connect’s, and I can fix bugs myself because I understand the language :slight_smile:. And the rest of my code runs in the same Node.js project (which works together with the AP), so it saves me from adding another container. But, like I said, wifi-connect is really helpful and I don’t have any issues with it!

Regarding the dnsmasq. Thanks for leading me to the right configuration. I’ve searched in the wifi-connect project, but I couldn’t find where to start, because I don’t understand Rust. So it’s really helpful!

If I understand correctly, I’ve to have a container with dnsmasq running, which is configured with those arguments, and set the NetworkManager configuration file to ipv4.method = manual and the gateway IP to the same IP as the dnsmasq and off we go? I’ll try that asap!

It’d take me days to figure this out, so it’s really helpful of you guys to respond this quickly on the forum!

Right, so when you create the AP profile use the manual method and set a static IP address (e.g. 192.168.42.1). No need to define a gateway in the NM profile as the AP itself will be the gateway - just set a normal static IP (e.g. address1=192.168.42.1/24, and NOT address1=192.168.42.1/24,192.168.42.1). Then you use the same gateway address again twice in the command line parameters of dnsmasq (in the source code I referred to you can replace the {} with the actual values - Rust string formatting is similar to the Python one).
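For Node.js users, the command line described above could be assembled roughly like this. It is a sketch under assumptions: the flags are standard dnsmasq options as I understand them from the dnsmasq documentation, and the exact set should be checked against the linked dnsmasq.rs. The function name, the `wlan0` interface, and the DHCP range are only illustrative. Note how the gateway address appears twice, once as the answer to every DNS query and once as the DHCP router option.

```javascript
// Builds dnsmasq arguments for a captive-portal access point.
// Assumption: the AP profile uses a static address such as 192.168.42.1/24.
function buildDnsmasqArgs({ gateway, dhcpRange, iface }) {
  return [
    `--address=/#/${gateway}`,                // answer every DNS query with the AP address
    `--dhcp-range=${dhcpRange}`,              // pool handed out to connecting clients
    `--dhcp-option=option:router,${gateway}`, // advertise the AP as the default gateway
    `--interface=${iface}`,                   // serve only on the AP interface
    '--bind-interfaces',
    '--except-interface=lo',
    '--keep-in-foreground',                   // stay attached so the parent process can supervise it
    '--no-hosts',                             // ignore /etc/hosts
  ];
}

const args = buildDnsmasqArgs({
  gateway: '192.168.42.1',
  dhcpRange: '192.168.42.2,192.168.42.254',
  iface: 'wlan0',
});
console.log(['dnsmasq', ...args].join(' '));
```

The resulting array can be passed to `child_process.spawn('dnsmasq', args)` once the access point is up.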

Thanks!

And I have to set network_mode: host in my docker-compose.yml file for that container, right? Otherwise wlan0 probably isn’t found inside the container.

Hi @bversluijs – yes, that should be the case; that’s how we’re setting network_mode in balena-dash, which uses wifi-connect to accomplish the same thing you’re building.

All the best,
Hugh

To have dnsmasq work in the container you will also need:

        cap_add:
            - NET_ADMIN

I’ll try that out, thanks guys!

I haven’t tried the dnsmasq out yet, but I’ve another question.
Obviously, a container only needs the minimum privileges it requires to run.

I have been busy with checking which clients are connected to the Access Point, but to find that out I have to use iw or arp, because NetworkManager doesn’t support it. And for these commands to work, I need access to the real wlan0 inside the container. This can be done by using:

network_mode: host

But does this come with some security vulnerabilities? Or some other side-effects? Because I want everything to be as secure as possible. Another alternative for me is checking all connected sockets to my webserver, but this has some downsides as well.

Anybody that has any suggestions?
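For reference, the iw approach mentioned above could look roughly like this in Node.js. It is a sketch under assumptions: it relies on `iw dev <iface> station dump` printing one `Station <mac> (on <iface>)` line per connected client, and the function names are hypothetical. It also needs the `iw` binary in the container and access to the real interface.

```javascript
const { execFile } = require('node:child_process');

// Extracts client MAC addresses from `iw dev <iface> station dump` output.
// Assumption: each client block starts with "Station <mac> (on <iface>)".
function parseStations(output) {
  const macs = [];
  for (const line of output.split('\n')) {
    const match = /^Station ([0-9a-f]{2}(?::[0-9a-f]{2}){5}) /i.exec(line.trim());
    if (match) macs.push(match[1].toLowerCase());
  }
  return macs;
}

// Runs iw and resolves with the list of connected client MAC addresses.
function listClients(iface = 'wlan0') {
  return new Promise((resolve, reject) => {
    execFile('iw', ['dev', iface, 'station', 'dump'], (err, stdout) => {
      if (err) reject(err);
      else resolve(parseStations(stdout));
    });
  });
}
```

Keeping the parser separate from the `execFile` call means it can be tested with captured output on a machine that has no Wi-Fi hardware.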

Hi,

There are no particular security vulnerabilities; it just gives you more control. You need to be aware that when you use network_mode: host, your application behaves as if it were running on the host device. This means your container will have access to all the host’s network interfaces and, in combination with the NET_ADMIN cap, it will be able to manipulate them. Just be sure to only operate on your AP interface and not tamper with the others, as you will have the power to cut yourself off from the device.

Ah thanks for the information!
My application only talks to wlan0 and gets information for eth0, all via the NetworkManager D-Bus API. So I won’t tamper with the VPN or balenaEngine interfaces. But it’s definitely something to keep in mind. At first, I won’t add the NET_ADMIN cap.

There isn’t a way to only add wlan0 to the container, is there? I couldn’t figure it out myself so far.

No, that isn’t something that’s exposed by the engine unfortunately.

Another simple question:
Is it possible to change some DNS settings, like resolving a domain name?
We’d like to make it as easy as possible for our customers to connect to the Access Point. It would be nice if the user could connect to the Access Point and go to, for example, superaccesspoint.com, and have it resolve to the Access Point IP (10.42.0.1). This way, once connected, you go to a URL instead of an IP address. When you’re not connected to the AP and you go to this URL, it can show some instructions, for example on how to connect.

Thanks!