Balena Device only showing up on one router

I have 3 routers. When I plug my Intel NUC into Router #1, it works fine. When I plug it into router #2 or #3, however, it doesn’t work. I have other Intel NUCs where routers 2 and 3 work fine, so this doesn’t appear to be an issue with my routers.

Is there anything I can do to get to the bottom of this?

Hey @bgold

What have you tried already? What are the symptoms you’re seeing when you say it doesn’t work? There are a lot of questions, so if you can give any more information that would be helpful. I’d be checking things like: is the device getting a network link at the appropriate speed, am I using a known working patch cable, is the device getting assigned an IP address if set to DHCP, if set to static is that IP valid in the range of the other routers, if the device is getting an IP is it able to ping the router, does DNS work, does pinging an IP on the internet work? Etc. etc.

Hopefully this gives you some ideas of things to try. You can check the network requirements of balenaCloud here: https://www.balena.io/docs/faq/troubleshooting/faq/#what-network-ports-are-required
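The checks listed above can be run in order from the device shell. What follows is only a rough sketch: the function name, the default interface `eth0`, and the test targets `1.1.1.1` and `balena.io` are placeholders chosen for illustration, not values from this thread. It assumes `ip`, `ping`, `grep` and `awk` are available on the device.

```shell
#!/bin/sh
# Walk the checklist in order and stop at the first failing step.
# All defaults below are illustrative placeholders.
check_network() {
    iface="${1:-eth0}"           # pass the real interface name
    test_ip="${2:-1.1.1.1}"      # any public IP that answers ping
    test_name="${3:-balena.io}"  # any hostname that should resolve

    # 1. Link: the interface must report "state UP".
    ip link show "$iface" | grep -q 'state UP' \
        || { echo "FAIL: no link on $iface"; return 1; }

    # 2. Address: an IPv4 address must be assigned (DHCP or static).
    ip -4 addr show "$iface" | grep -q 'inet ' \
        || { echo "FAIL: no IPv4 address on $iface"; return 2; }

    # 3. Router: find the default gateway and ping it.
    gw=$(ip route show default | awk '/^default/ {print $3; exit}')
    [ -n "$gw" ] || { echo "FAIL: no default route"; return 3; }
    ping -c 1 -W 2 "$gw" >/dev/null 2>&1 \
        || { echo "FAIL: gateway $gw unreachable"; return 4; }

    # 4. Internet by IP address (no DNS involved).
    ping -c 1 -W 2 "$test_ip" >/dev/null 2>&1 \
        || { echo "FAIL: cannot reach $test_ip"; return 5; }

    # 5. DNS: step 4 passed, so a failure here points at name resolution.
    ping -c 1 -W 2 "$test_name" >/dev/null 2>&1 \
        || { echo "FAIL: cannot resolve or reach $test_name"; return 6; }

    echo "OK: link, address, gateway, internet and DNS all pass on $iface"
}
```

Calling `check_network eno1` (or whatever the interface is named on the device) prints the first step that fails, which narrows down where to look next.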

I have the following:

  • Modem
  • Balena Intel NUC 1
  • Balena Intel NUC 2
  • Router 1
  • Router 2
  • Router 3

Things I have tried (Note they are using the same cables and all routers were reset to factory settings so no static settings, etc.):

  • Balena Intel NUC 1 + router 1 + modem = I see it online on Balena cloud and in my router’s DHCP list

  • Balena Intel NUC 1 + router 2 + modem = I see it online on Balena cloud and in my router’s DHCP list

  • Balena Intel NUC 1 + router 3 + modem = I see it online on Balena cloud and in my router’s DHCP list

  • Balena Intel NUC 2 + router 1 + modem = I see it online on Balena cloud and obviously in my router’s DHCP list

  • Balena Intel NUC 2 + router 2 + modem = Not online and doesn’t appear in DHCP list

  • Balena Intel NUC 2 + router 3 + modem = Not online and doesn’t appear in DHCP list

It’s weird that one Intel NUC box would show up on the router and one wouldn’t, seeing as they are running the exact same image.

The following can’t be answered because the device never shows up in DHCP:

  • if the device is getting an IP is it able to ping the router
  • does DNS work
  • does pinging an IP on the internet work

Hi there,

Just wanted to check if you verified if lights on the RJ45 connectors indicate that the link is up between the NUC and the router.

Also, if you can access the device shell (if it’s a dev OS image, you should be able to connect a keyboard and monitor to the NUC and log in as root), it might be helpful to check the NetworkManager logs with journalctl:

journalctl -u NetworkManager

It might display information about connection status.

From what you describe, it sounds like something might be wrong with the actual physical link between the router and NUC.
Can you also check the output of the

ip link show

to verify the status of the ethernet interface on your NUC?
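As a small sketch of what to look for in that output: the `state` field and the `LOWER_UP` flag are the two parts that indicate whether the kernel sees a working link. The helper names below are made up for illustration, and the interface name is whatever the NUC reports.

```shell
# Print the operational state (UP, DOWN, ...) that `ip link show` reports
# for one interface.
link_state() {
    ip link show "$1" \
        | awk '{ for (i = 1; i < NF; i++) if ($i == "state") { print $(i + 1); exit } }'
}

# Succeed only if the kernel sees a carrier (cable plugged in, link negotiated).
has_carrier() {
    ip link show "$1" | grep -q 'LOWER_UP'
}
```

For example, `link_state eno1` printing `DOWN` together with a failing `has_carrier eno1` would point at the cable or port rather than at DHCP.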

Hi @roman-mazur @chrisys ,

The lights on the RJ45 connectors indicate that the link is up. I will have to create a dev OS image to check the terminal commands. This is the last thing that shows up on the monitor:

EDIT:

After getting a dev image up, here are the following commands:

journalctl -u NetworkManager

and

ip link show

Hi,
Thanks for providing the logs.
From the ip link show output it’s clear that the interface is up, so everything seems to be good with the connector.
And in NetworkManager logs we can see a line

dhcp4 (eno1): activation: beginning transaction (no timeout)

So it looks like the device is behaving as expected: it starts DHCP negotiation. However, there are no further log records of DHCP activity. Maybe there is some configuration on the router that prevents it from issuing an IP address for a particular MAC address. That is the best guess I can make at the moment.

If it’s possible, I would check some debug-level logs on the router side to see if the initial DHCP request is delivered to the router (and it should be delivered, from the device logs).

What specific Intel NUC and router models are you using?

Hi @roman-mazur,

Maybe there is some configuration on the router that prevents it from issuing an IP address for a particular MAC address

This is unfortunately not the case. Every router works if it is the first router that the Intel NUC connects to on first provision. I re-flashed one directly connected to a router that didn’t work and it shows up.

I am using an Intel NUC7i5BNK Mini PC NUC Kit (INBNUC7I5BNK)

I should note that this is a new issue. This was not always the case with these routers. I’ve successfully used them before. This might be an issue with an OS update.

Steps that re-create issue:

  • Flash Intel NUC on a router successfully
  • Pull power plug directly
  • Plug in Intel NUC to a different router.
  • It doesn’t show up

I even brought a 4th router in to test this today and the same thing happened.

Steps that work:

  • Flash Intel NUC on a router that didn’t work
  • It shows up

Hi there,

Thanks for the information. We’ll follow those steps and try to reproduce the issue using a few NUCs and two routers.
We’ll share an update once we’ve had the chance to test it.

@bgold, I just wanted to explore a bit further any log messages in NetworkManager related to DHCP. Roman had pointed out that:

And in NetworkManager logs we can see a line
dhcp4 (eno1): activation: beginning transaction (no timeout)

So it looks like the device is behaving as expected: it starts DHCP negotiation. However, there are no further log records of DHCP activity.

But the logs were a screenshot, at the bottom of which it says “lines 28-93/93”, so I wonder if there were additional DHCP log messages in the following screens. Also, it might be worth it to check whether your routers have any logs that could be inspected as well, perhaps searching for the MAC address of the NUC’s network interface (looks like 94:c6:91:a0:82:46)…
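One way to avoid missing lines hidden behind the pager is to disable it and keep only the DHCP-related entries. This is a sketch, assuming `journalctl` and `grep` are available on the dev image; `dhcp_only` is a made-up helper name.

```shell
# Keep only DHCP-related lines from a NetworkManager log read on stdin.
dhcp_only() {
    grep -i 'dhcp'
}

# On the device:
#   journalctl -u NetworkManager --no-pager | dhcp_only
#   journalctl -u NetworkManager --no-pager -b | dhcp_only   # current boot only
```

The resulting text can be pasted into the thread directly instead of a screenshot.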

If we are assuming it is a DHCP issue, and we found that the NUC initiates a DHCP request but fails to complete it, it would be interesting to sniff the network and check where DHCP handshake packets are missing – if the router fails to reply, or if the NUC fails to process a reply: http://www.eventhelix.com/Realtimemantra/Networking/dhcp-flow/dhcp-sequence-diagram.pdf

Sadly, the sniffing setup may not be trivial in this case…

Actually, another thought. The NUC that fails with a couple of routers – chances are it would fail with other routers as well… In which case, instead of plugging it into a router, what about plugging it into a laptop set up with internet connection sharing? (docs for Windows, docs for Mac). That way a network sniffer could be run on the laptop – something like Wireshark or ngrep – to inspect the DHCP handshake.

It’s just a thought – depending on your inclination and availability to dig into it. :slight_smile:
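If the laptop has tcpdump, a capture limited to the DHCP ports is enough to see which side of the handshake goes quiet. The sketch below is a stand-in for Wireshark or ngrep, not something used in this thread. It counts message types in verbose tcpdump text; the exact wording of that output varies between tcpdump versions, so the matching is an assumption, and `dhcp_summary` is a made-up helper name.

```shell
# Capture DHCP traffic on the sharing interface (run as root on the laptop,
# replacing <iface> with the real interface name):
#   tcpdump -l -i <iface> -n -v 'udp port 67 or udp port 68' | tee dhcp.txt

# Count each DHCP message type seen in verbose tcpdump text read from stdin.
# Assumes lines such as "DHCP-Message ..., length 1: Discover".
dhcp_summary() {
    awk '
        /DHCP-Message/ { seen[$NF]++ }
        END {
            n = split("Discover Offer Request ACK NACK", types, " ")
            for (i = 1; i <= n; i++) printf "%s=%d\n", types[i], seen[types[i]]
        }'
}
```

Running `dhcp_summary < dhcp.txt` and seeing Discover counts with no Offer would suggest the DHCP server is not replying (or the reply is not reaching the wire), while Offers with no following Request would suggest the NUC is not processing the reply.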

I’d like to see if you guys can replicate this on your end first before I spend time on Wireshark etc. I have a suspicion this might be from an OS version update. These issues are all stemming from balenaOS 2.45.1+rev2. But I have boxes on balenaOS 2.44.0+rev1 that seem to work as expected.

I’d like to force flash a balenaOS 2.44.0+rev1 on a box to see if I can replicate the problem. Could someone point me to how I can do this?
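When comparing boxes, it can help to confirm the OS version from the device shell as well as from the dashboard. A minimal sketch, assuming the image ships a standard `/etc/os-release` file with a `PRETTY_NAME` entry; `os_pretty_name` is a made-up helper name.

```shell
# Print the OS name and version recorded in an os-release file.
# Defaults to /etc/os-release; pass another path to inspect a copy.
os_pretty_name() {
    # Sourcing in a subshell keeps the file's variables out of the caller's shell.
    ( . "${1:-/etc/os-release}" && printf '%s\n' "$PRETTY_NAME" )
}
```

Running `os_pretty_name` on a working box and on a failing box gives a quick side-by-side of the versions involved.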

These issues are all stemming from balenaOS 2.45.1+rev2. But I have boxes on balenaOS 2.44.0+rev1 that seem to work as expected.

This sounds relevant indeed. We’ll let you know if we can replicate it.

I’d like to force flash a balenaOS 2.44.0+rev1 on a box to see if I can replicate the problem. Could someone point me to how I can do this?

It’s not possible to downgrade the OS via the web dashboard, but you could “reflash” it via the usual process, e.g. using balenaEtcher: Get Started with balenaCloud using Intel NUC and Node.js - Balena Documentation

@bgold as there seems to be some issue with DHCP after the device has been connected to the first router, why not try configuring the device with a static IP as a test? If the device works fine with a static IP configuration we know where to concentrate our debugging efforts.

balenaOS 2.44.0+rev1 is actually no longer available, but you could still flash the device again with that version if you still have it downloaded. Otherwise you can go back to the next available previous version (2.41.1+rev1) to see if that exhibits the issue.

One further question; I see that you gave the NUC model above as the NUC7i5BNK. Are both the NUCs this exact same model? By both I mean that in post 4 above you mention NUC 1 works fine and NUC 2 exhibits the issue, you confirmed they are both running the same image but is the hardware the same also?

Hi @chrisys

Configuring the static IP as a test is what I was actually trying to do when I found this issue. Are you saying I should try to configure a static IP on router 1 or on router 2? After moving the device off the initial router, it does not work on the new routers, either with a static assignment by MAC address or with regular DHCP.

And yes, they are all the same NUC model. We have 10 NUCs in total, split between the office and the field, and they are all the exact same model. To reiterate, the hardware is exactly the same.

I will try to flash it with 2.41.1+rev1 this week to see if the same issue comes up. Bummer that 2.44.0+rev1 is not available anymore. I do not have it downloaded.

I mean configure a static IP address on the device itself. If you do it on the router, we’re still going to be using DHCP, so it wouldn’t take DHCP out of the equation. Thanks for clarifying that the models are the same :+1:

I mean configure a static IP address on the device itself

@chrisys Unfortunately I don’t know how to do this. While I know this is for debugging purposes, when we deploy them in the field we also have no knowledge of the local network these will be running on so configuring them on the device wouldn’t solve our issue.

PS I really appreciate the quick responses. Would really like to get this resolved as I know you guys would too!

Yes this would only be for debugging purposes to work out if the issue lies with the DHCP process. We have documentation on setting a static IP address on a device available here: https://www.balena.io/docs/reference/OS/network/2.x/#setting-a-static-ip
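For reference, a static address is set through a NetworkManager connection profile. The sketch below generates such a profile; the function name, the profile id, and every address in the example are placeholders, and the linked documentation remains the authority on the exact fields and on where the file goes (the `system-connections` directory on the boot partition, per that page).

```shell
# Write a NetworkManager keyfile with a manual (static) IPv4 address.
# Arguments: output file, interface, address/prefix, gateway, DNS server.
write_static_profile() {
    cat > "$1" <<EOF
[connection]
id=static-ethernet
type=ethernet
interface-name=$2

[ipv4]
method=manual
address1=$3,$4
dns=$5;

[ipv6]
method=auto
EOF
    chmod 600 "$1"
}

# Example with placeholder values (adjust to the router's subnet):
#   write_static_profile static-ethernet eno1 192.168.1.127/24 192.168.1.1 8.8.8.8
```

If the device comes online with a profile like this on a router where DHCP fails, that would confirm the problem is confined to the DHCP exchange.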

@chrisys @pdcastro @ntzovanis @roman-mazur

Hey guys, just following up to see if anyone was able to reproduce this issue.

Hey @bgold ! Did you manage to reproduce the issue using a static IP?

No, because @ntzovanis said above that you guys would try to replicate it first. I’d much rather see if you can replicate it before I spend time I don’t have debugging a bug.