USB controller problems at higher temperatures inside a control box

Hello,
We are using balenaFins for an IoT project and have some issues with the network connection and the USB controller.

Description of environment:
The balenaFins are placed in stainless steel control boxes and work as controller and logging units for sensors. The control boxes also contain power supplies and electrical wiring.
The balenaFin case has an open lid and is mounted on a DIN rail. It is powered by a 24V power supply.

OS:

  • Raspbian (image from the balena website)

Hardware:

  • balenaFin version 1.1
  • RPi compute module 3+ lite

Connections:

  • 3 DI inputs for power supply status
  • 8 DO relay controls
  • 2x USB serial converters
  • 1x network interface

Problem description
At higher temperatures we are losing the network connection to the device. At the moment we have no temperature probe inside the control box, but when I checked the temperature inside the box yesterday, it was about 42°C (outside temperature 29°C).

After I opened the control box and it cooled down to the outside temperature, the balenaFin started to work again. Closing the control box caused problems again. Later in the evening, the balenaFin was running fine until late the next morning (09:58 am). Since then, it has not been responding. The last reported temperature was 60°C.

Troubleshooting steps done

  • Replaced the balenaFin board and the compute module -> same results

I have saved the system logs from the balenaFin. It looks like the USB controller stopped working and the devices were reattached.
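
A quick way to check saved logs for this pattern is to count USB disconnect messages per port. The following is a minimal sketch, not a balena tool: it assumes the kernel log lines have the usual Linux shape "usb <port>: USB disconnect, device number <n>", and the function name is hypothetical.

```python
import re
from collections import Counter

# Assumption: disconnects appear in the kernel log as
# "usb <port>: USB disconnect, device number <n>".
DISCONNECT = re.compile(r"usb (\S+): USB disconnect, device number (\d+)")


def count_usb_disconnects(lines):
    """Return a Counter mapping USB port path -> number of disconnect messages."""
    counts = Counter()
    for line in lines:
        match = DISCONNECT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It accepts any iterable of text lines, so an open file handle for a saved log can be passed in directly. A port that shows many disconnects within a short time would be consistent with devices being reattached repeatedly.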

Extra Info
For testing, we have also installed a “normal” RPi 3B+. It is working without any problems and reports temperatures above 66°C. It performs the same function as the balenaFin and is located in the same control box.

Questions
Has anyone had the same problems as we have?
Are there any recommendations for what we should do?
Does anyone else have problems with the USB controller?

Thank you all in advance for any kind of input.

Hi @Frorh , thanks for bringing this up. I have a couple of questions that would help me understand better what’s happening on your devices:

  1. Are you using our maintained version of Raspbian for the balenaFin? ( https://www.balena.io/fin/1.1/docs/downloads/ )
  2. Are you using an external antenna, since you are enclosing the device in a metal cage? This might explain why it starts working again when you open the lid.
  3. Can you please share the system logs with us? If you are worried they might contain sensitive information, you can also share them privately by sending an email to fin@balena.io and mentioning this forum thread.

Best regards,

Carlo

Hello @curcuz, thank you very much.

Please find my answers below:

  1. Yes, we are using your maintained version for the balenaFin.
  2. The balenaFin is connected to a cell router via ethernet.
  3. I will mail the system log.

Today I will go to the location of the control box and install a temperature probe inside to compare the inside and outside temperatures. I hope to get more information about the behaviour.

Also, I have discovered that the balenaFin started responding to ICMP requests late yesterday afternoon. Unfortunately, the device is in an in-between state and I am not able to access it via SSH (I will also collect these logs and send them to you).

Best Regards,

Frorh

Hi @Frorh , thanks, looking forward to the logs!

Hello @curcuz, I have sent the logs. This morning I installed a temperature sensor inside the control box. At the moment the temperature inside the panel is 47°C and the outside temperature is 27°C.
The balenaFin was able to start (I saw the DHCP request), but after that I was not able to access it.
I brought the balenaFin inside, where it powered up and I was able to get the system logs. I will do some further testing with the balenaFins and will keep you informed.

I am looking forward to your feedback about my logs.

Best Regards,

Frorh

Hey,

We had thermal problems with USB. In our case, I can’t be sure it ever affected the controller itself.

The main problem for us was heat transferring back through the ports to some of the USB power supply chips on the other side, which would get too hot, throw an over amp code back to the USB controller / OS, and cause a sudden reset. This would happen so often that the devices would take themselves offline. The annoying part was that this only occurred when I put the lid on the cases.

We were using the onboard wifi as a connection, but when I tried to give them direct Ethernet access, that did not bring them back online.

See my post here for how we resolved our issue. I am not sure it would work in your case or if we are even having the same problem.

Good luck.
-Thomas

Hello,
thank you @tacLog, your input is much appreciated. I think we might have a similar problem.

Test description
We did some further testing yesterday and I would like to share our experience. Our goal was to find the temperature at which the problem starts. To do this, we installed two temperature probes, one outside and one inside the control box.

Test

  • During the test the outside temperature stayed between 27 and 28.5°C.
  • The control box temperature was between 42°C (control box lid open) and 46°C (control box lid closed).
  • The device was connected via ethernet.

With the control box open, the balenaFin was working. With the lid closed, problems started at 42-43°C, and above 44°C the balenaFin stopped responding. Once the lid was opened again, the balenaFin immediately started responding to ICMP packets again and we could access it via SSH. We repeated this test multiple times and the behaviour was always the same (with and without USB devices connected).
When we left the device outside of the control box for some time to let it cool down, it worked for approximately 11 minutes after we placed it back inside the closed control box. After that, we saw the same on/off behaviour again when opening and closing the panel.

The device did not reboot itself during the tests. We only lose the connection to the USB devices and the Ethernet interface at ambient temperatures over 42°C.
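
A test like this can be automated from a second machine in the box by pinging the device at each temperature reading and recording the lowest temperature at which it stopped answering. This is only a sketch under assumptions: it relies on the Linux iputils `ping` flags `-c` and `-W`, the function names are hypothetical, and the temperature values have to come from whatever probe is in use.

```python
import subprocess


def is_reachable(host, timeout_s=1):
    """Send one ICMP echo request using the system ping (Linux iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def first_failure_temp(samples):
    """Return the lowest temperature at which the device was unreachable.

    samples: iterable of (temp_c, reachable) pairs. Returns None if the
    device answered in every sample.
    """
    failed = [temp_c for temp_c, reachable in samples if not reachable]
    return min(failed) if failed else None
```

Each sample would pair one probe reading with the result of `is_reachable` for the device's address. Running it over several open/close cycles should show whether the failure threshold stays in the same range.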

I have attached a picture of part of the control box. The Raspberry Pi 3B+ on the topmost layer took the temperature measurements during our tests.

@curcuz, have you received our logs? Have they been helpful in tracking down the root cause?

Hello @Frorh,

Thank you for the extra information. We received the logs and are currently analyzing them. We believe the problem might be related to the one Thomas shared above. I will take a closer look at the logs and get back to you with my findings.

Cheers,
Nico.