High temperature issue on Fin V1.1.

Hi,

I have been using the same Fin board with Raspbian for the last 5 months with no issues. Today, when starting it up, I noticed the thermometer icon that indicates a high CPU temperature after only 1 minute of running, without starting any script. After the second minute, the red thermometer fills completely and the indicated temperature is 102 degrees Celsius.

[Screenshot: 2020-02-24_1016]

What do you recommend I do?

Thanks

Hello, that does indeed seem very unusual. As first steps:

  • Could you please confirm the temperature with the following (in the host OS): cat /sys/class/thermal/thermal_zone0/temp
  • Could you also run top/htop etc. to confirm that there are no processes running with very high CPU demand?

Thank you
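
For anyone who wants to log that reading rather than check it once, here is a minimal Python sketch. It assumes the usual Raspberry Pi behaviour, where this sysfs file holds the temperature in millidegrees Celsius (so 102000 means 102 °C). The function names are hypothetical and only for illustration.

```python
from pathlib import Path

# Path taken from the command above; the file holds an integer in
# millidegrees Celsius on Raspberry Pi kernels (assumption stated in the text).
THERMAL_ZONE = "/sys/class/thermal/thermal_zone0/temp"


def millidegrees_to_celsius(raw: str) -> float:
    """Convert the raw sysfs text (e.g. '102000\n') to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_cpu_temp(path: str = THERMAL_ZONE) -> float:
    """Read the current CPU temperature in degrees Celsius from sysfs."""
    return millidegrees_to_celsius(Path(path).read_text())
```

On the device, calling read_cpu_temp() in a loop with a short sleep would show how quickly the temperature climbs after boot.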

Hi @srlowe,

Thanks for the fast response. I have attached some screenshots:

Thanks for the extra info confirming the situation. I’ll discuss this with my colleagues and get back to you shortly.

Here are some other suggestions. Apologies if some are obvious, but better to be thorough!

  • Has an enclosure been added to the device recently?
  • Could you check that nothing has got inside the enclosure that may be hindering airflow?
  • Do you know the approx ambient temperature where the device is located?
  • Are you able to confirm that the device is actually hot? (to rule out sensor fault)

Another line of thought is Raspbian (e.g. perhaps this is related to a recent update). Is there any way that you are able to temporarily boot into a different OS? (e.g. balenaOS)

  • Has an enclosure been added to the device recently?
    No. It has been in the same enclosure the whole time: the original one the Fin comes with. Moreover, I have left the top lid off.
  • Could you check that nothing has got inside the enclosure that may be hindering airflow?
    Nothing has got inside, for sure. It was working perfectly last time.
  • Do you know the approx ambient temperature where the device is located?
    10-20 degrees Celsius.
  • Are you able to confirm that the device is actually hot? (to rule out sensor fault)
    Yes, the device gets hot, especially the Broadcom chip on the CM3.

Another line of thought is Raspbian (e.g. perhaps this is related to a recent update). Is there any way that you are able to temporarily boot into a different OS? (e.g. balenaOS)

  • I can’t boot into another OS since I only have Raspbian, but I could flash balenaOS onto it. I am afraid that keeping it running for such a long time (as long as the flashing lasts) will burn it out completely.

With the basic troubleshooting out of the way, it certainly seems possible that something has happened to the board, and we are happy to swap it for a new one. Look out for an RMA form sent to your email address. Could you please also include the Compute Module with the return, so that we can investigate which part may have failed? Thanks!

Hi @dtischler,

Sounds great!

Thanks!

Hello @dtischler , @srlowe

Thanks for replacing the carrier board. But it seems the problem won’t go away. This time, I cut power to the device as soon as I see the temperature indicator come up. Even so, the CM3L modules still become unusable after that point.

Now that this issue has repeated itself, and given all the background in this discussion, I want to add something new to the story. This is something that I noticed along the way, after losing 5 compute modules.

We are using 8 loop-closing magnets that run on an external 5V power supply. Each one closes the electrical circuit when it comes into contact with its pair magnet; let’s call it the “brother magnet”.

So far we have 8 magnets powered with 5V and 8 “brother magnets”, which are simply magnets with no wires and no power applied to them.

Imagine that the magnet that is powered with 5V is in a fixed position and its “brother magnet” is on a sliding device. Whenever a powered magnet and its “brother magnet” meet, the electrical circuit is closed and the GPIO pin connected to the wired magnet receives 5V. Now imagine 8 such pairs, all giving 5V to their corresponding GPIOs at the same time.

Here is the mapping of the GPIOs that receive 5V:

  • PIN 8: GPIO14 --> Magnet
  • PIN 11: GPIO17 --> Magnet
  • PIN 12: GPIO18 --> Magnet
  • PIN 13: GPIO27 --> Magnet
  • PIN 15: GPIO22 --> Magnet
  • PIN 16: GPIO23 --> Magnet
  • PIN 26: GPIO7 --> Magnet
  • PIN 40: GPIO21 --> Magnet

Note that the Fin is NOT powered from its original power supply, but from a 24V 10A power supply.

Can applying 5V to the pins above damage the CM3L?

Or maybe the repeated switching between 5V and 0V, depending on whether the magnets are meeting their “brother magnet” or not?

Today I noticed that the incoming signal from 2 magnets was not received in the application before the temperature indicator showed up. This makes me believe that the signal from these magnets is causing some damage.

Replacing the magnets is not an option due to the existing hardware configuration.

Any other thoughts or ideas on what I could try in order to debug this issue of the CM3L heating up? Or maybe a GPIO pin monitoring solution?

Thanks and best regards!

Hello @sicabboy,

I’m not sure I fully understand your setup. It would help if you could provide some pictures; you can send them to fin@balena.io if you’re not comfortable sharing them publicly.

Having said that, if you’re applying 5V directly to the CM3L GPIO, you’re definitely damaging it. The CM3L (and CM3+L) only support up to 3.3V logic.

Before anything else, I’d try adding level shifting circuitry to adapt the 5V to the 3.3V logic of the compute module. Let us know if you need any help with that.
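
As a rough illustration of what level shifting has to achieve, the sketch below checks a simple two-resistor voltage divider, which is one common way to bring a one-directional 5V signal under 3.3V. The divider itself and the 1.8 kΩ / 3.3 kΩ values are assumptions for illustration only, not a recommendation for this specific magnet circuit; a dedicated level shifter or optocoupler may suit it better.

```python
GPIO_MAX_V = 3.3  # maximum logic level supported by the compute module GPIO


def divider_output(v_in: float, r_top: float, r_bottom: float) -> float:
    """Output voltage of a resistor divider, measured across r_bottom."""
    return v_in * r_bottom / (r_top + r_bottom)


# Illustrative resistor values (assumed, not taken from the thread):
# 1.8 kOhm between the 5V signal and the GPIO, 3.3 kOhm from the GPIO to ground.
v_gpio = divider_output(5.0, 1800.0, 3300.0)
safe = v_gpio <= GPIO_MAX_V

print(f"GPIO sees {v_gpio:.2f} V, within limit: {safe}")
```

With these values the pin sees roughly 3.24V. Resistor tolerances and supply variation shift that figure, so it is worth measuring the real voltage before connecting the pin.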

Cheers,
Nico.

Hello,

Then it means I wasn’t paying enough attention to the 3.3V logic part of this answer here. It mentions a 5V maximum, therefore I used 5V as an input.

Lesson learned.

Thank you