Problem resolving DNS when device is offline

Hello,
We’re having an issue related to long DNS lookups when our devices are offline.
We have a mosquitto broker that connects to a cloud broker via a bridge. When the devices are offline, the connection takes a long time to time out, which causes mosquitto to drop messages. This is a problem with mosquitto (see https://github.com/eclipse/mosquitto/issues/1530), but we’re still wondering whether there is a way at the OS level to specify a shorter DNS lookup timeout?
The temporary fix in our case is to use the IP address of our cloud broker instead of its hostname.
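
For application code that performs its own lookups, the same idea (fall back to the known static IP when DNS is slow) can be sketched roughly as follows. This is only an illustration under stated assumptions: it does not change how mosquitto resolves its bridge address, and the function name `resolve_with_timeout`, its `resolver` parameter, the hostname `broker.example.com`, and the address `203.0.113.10` are all hypothetical placeholders.

```python
import socket
import threading


def resolve_with_timeout(hostname, timeout_s, fallback_ip,
                         resolver=socket.gethostbyname):
    """Return an IPv4 address for hostname, or fallback_ip if the lookup
    fails or does not finish within timeout_s seconds.

    `resolver` is injectable so the behaviour can be checked without a
    network; by default the standard blocking lookup is used.
    """
    result = []

    def worker():
        try:
            result.append(resolver(hostname))
        except OSError:
            # socket.gaierror is a subclass of OSError: the lookup failed
            # (for example because the device is offline).
            pass

    # Daemon thread: a lookup that is still blocked cannot keep the
    # interpreter alive at exit.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)  # wait at most timeout_s seconds

    return result[0] if result else fallback_ip


if __name__ == "__main__":
    # Hypothetical hostname and static IP, for illustration only.
    print(resolve_with_timeout("broker.example.com", 2.0, "203.0.113.10"))
```

Note that the blocked lookup itself is not cancelled; the caller simply stops waiting for it, so this bounds the wait in the application rather than shortening the OS resolver timeout.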

Thanks!


Hello @natcl,

Just wanted to let you know that we’ve forwarded this to our OS team and they’re taking a look at it. We’ll get back to you as soon as we have an update.

Cheers!


Hi, thanks for the detailed explanation. I agree that reducing the application's DNS timeout would probably help work around the bug in mosquitto. Unfortunately, the BalenaOS default settings are designed to cover a wide range of applications. In this particular case, an increased local timeout helps reduce bandwidth usage on slow cellular networks while generally having no adverse effect in other use cases. Your case is clearly an exception.
Also, configurability is an architectural topic that has often been discussed inside Balena. The consensus is that we cannot provide settings for everything, as that would fragment the product, making it more difficult to test, maintain, and support, and would also increase its complexity for our target user profile.
As I see it, there are three options:

  1. Keep avoiding the use of DNS by using the IP address, as you have described.
  2. Build your own custom BalenaOS and modify it with your custom changes. Disadvantages of this approach include having to maintain your changes as BalenaOS versions are released, as well as not being able to update BalenaOS from the dashboard. Manual fleet update mechanisms exist for custom OSes, but they are cumbersome to use.
  3. In the near future, BalenaOS will introduce the concept of host extensions. These are basically containers that get overlaid on the filesystem at boot. Such a container would allow you to configure the DNS timeout to a specific value. The responsibility to test and maintain changes introduced in this way will be with you, the user, and BalenaOS will not be tested with the change. Support agents may also ask you to reproduce any reported problem without the host extension.

I have opened a product ticket at https://github.com/balena-io/balena-io/issues/2305 and we will update it once a solution via host extensions is released.

Hello,

Thanks a lot for taking the time to look at this.
In the meantime I will stick with option 1, which seems to work. Luckily it’s a static IP that never changes, so that works for now. I’m also following up on the issue in the mosquitto repository, since the current behavior is less than ideal and should be fixed.

Glad to see we’ll be able to override more things at the OS level; it will surely come in handy for edge cases like this.