EADDRNOTAVAIL when trying to bind socket after device turns back on

One of the functions of our managed Pis is to receive GPS information regularly, for which we are using Node’s dgram module.

We are using Pis in an environment which sees them power cycled several times throughout a normal day. When a device is turned back on, its logs show that the ‘main’ service is already running:

Service is already running 'main sha256:de3cf67b9a9e5bcf71c6fae7341b731fd1bacdc3c3418b0d48ff6220995360cd'

From there, it appears that our container application is started. When it attempts to bind a socket to a static IP, we receive this error from main: Error: bind EADDRNOTAVAIL, most likely because the address was already bound by the ‘main’ that was already running.

How can I overcome this? Normally I’d register a socket.close() call to run on process exit, but when power is removed from the Pis it never gets called. On restart, it seems that balenaOS isn’t restarting ‘main’ so much as reconnecting to it. Using the balenaCloud interface, a simple restart of the ‘main’ service fixes this problem completely, but this isn’t a practical solution given how many devices we have and how often they will be power cycled.

Is there a way to force a restart of main every time the device turns back on? Or is there another solution completely?

Hi Michael,

Thank you for raising this issue. To clarify, the message “Service is already running” is not an error. It is simply an info message saying that the balena supervisor has skipped the step of starting the ‘main’ container/service because balenaEngine had already started it on boot.

By the way, to confirm, ‘main’ is the name of your application’s container/service, and the log messages starting with ‘main’ are log messages printed by your application’s container/service. There is only ever one instance of the ‘main’ container running at a time.

I believe that the “Error: bind EADDRNOTAVAIL” message is an application issue. As you’ve pointed out that manually restarting the ‘main’ container/service fixes the problem, consider the possibility of a race condition: immediately after boot, a resource like the IP address or port number being bound to may not be available yet (perhaps a local application server hasn’t started yet?). You could investigate this possibility by adding a few seconds’ sleep before the bind call, or perhaps by using socket flags like SO_REUSEADDR and SO_REUSEPORT. I also suggest having a look at an article I found in a web search: https://idea.popcount.org/2014-04-03-bind-before-connect/
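
To illustrate the "sleep before the bind call" idea, here is a minimal sketch in Node.js that retries the bind a few times instead of sleeping for a fixed period. The helper name bindWithRetry, the retry count and the delay are made up for illustration and are not part of dgram. It assumes the address simply becomes available a little while after boot, which may or may not be what is happening on your devices.

```javascript
const dgram = require('dgram');

// Hypothetical helper (not part of dgram): try to bind a UDP socket and
// retry after a delay if the address is not available yet.
function bindWithRetry(port, address, retries = 10, delayMs = 1000) {
  return new Promise((resolve, reject) => {
    const attempt = (remaining) => {
      const socket = dgram.createSocket('udp4');

      const onError = (err) => {
        // Discard the failed socket; a fresh one is created per attempt.
        try { socket.close(); } catch (closeErr) { /* already closed */ }
        if (err.code === 'EADDRNOTAVAIL' && remaining > 0) {
          setTimeout(() => attempt(remaining - 1), delayMs);
        } else {
          reject(err);
        }
      };

      socket.once('error', onError);
      socket.bind(port, address, () => {
        // Bound successfully: hand error handling back to the caller.
        socket.removeListener('error', onError);
        resolve(socket);
      });
    };
    attempt(retries);
  });
}
```

If the bind only succeeds after a retry or two, that would point towards a timing issue at boot rather than a problem with balenaOS or the supervisor.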

Let us know what you find / think given the above. If you still think that there is an issue with the balenaOS or supervisor, please get back to us and we will investigate it further.

Kind regards,
Paulo

Thank you Paulo, that information is very helpful. Those socket flags were indeed what was missing.

For reference, when using dgram in Node.js, the relevant socket option is reuseAddr:

const dgram = require('dgram')
const socket = dgram.createSocket({type: 'udp4', reuseAddr: true})
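
For anyone landing here later, a slightly fuller sketch of how that option might fit into a UDP listener is below. The helper name createGpsListener and its parameters are placeholders of mine, not our actual code or anything provided by dgram or balena. Only createSocket, bind, and the 'message' and 'error' events come from the dgram API.

```javascript
const dgram = require('dgram');

// Hypothetical helper; the name and parameters are for illustration only.
function createGpsListener(port, address, onMessage) {
  // reuseAddr: true asks Node to set SO_REUSEADDR on the underlying socket.
  const socket = dgram.createSocket({ type: 'udp4', reuseAddr: true });

  // Log bind or runtime errors and release the socket.
  socket.on('error', (err) => {
    console.error(`socket error: ${err.code}`);
    socket.close();
  });

  // Pass each incoming datagram to the caller's handler.
  socket.on('message', (msg, rinfo) => onMessage(msg, rinfo));

  socket.bind(port, address);
  return socket;
}
```

It could then be called with something like createGpsListener(5000, '192.168.1.50', (msg) => console.log(msg.toString())), where the port and static IP are made-up values to be replaced with your own.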