IoT2040 running Balena OS 2.12.7+rev1 stops working after 2 days

My IoT2040 device running Balena OS 2.12.7+rev1 stops working after around 2 days.
It disconnects from the Balena Dashboard, but it comes back after I reset the device.

This happens on all 3 installed devices.

Is anyone else facing a problem like this?

Hi, do you have any error logs or diagnostics you can share with us?

Hi,
All devices are now installed on site, so it’s a little hard to get any logs. When a device stops working, all I can do is ask someone there to reset it, and then all the logs are gone.

I’m waiting to get another IoT2040 to test locally. Then I will come back to you with the logs.

Best regards,

Hi Lucianbuzzo,

Would it be possible for you to remote into my device and see what is happening there?
This is urgent for us now, and any charge is acceptable.
Once all the problems in my system are solved, this system will be rolled out much more widely.

Best regards,

This is the link to my device. I have already granted access.

https://dashboard.balena-cloud.com/devices/5535031758354b8023db7effb5d4cc1a/summary

Hi, I can’t access the device, can you make sure that support access is enabled?

I have already granted it.

Can you try again with this link?
https://dashboard.balena-cloud.com/devices/5535031758354b8023db7effb5d4cc1a/summary

I can see it now, thanks. I will check it out.

You can stop the application if you need to.
Thank you so much.

Hi there, I couldn’t find anything indicating what might have caused the device to go offline. I would recommend enabling persistent logging and letting us know if this happens again, so we can see more information on what happened before the device went offline.

Hi Sradevski,

Could you tell me how to enable persistent logging?

Best regards

Yes, to enable it for the entire fleet in an application, you can go to “Fleet configuration” -> “Enable persistent logging”. If you want to do it for a single device, you can navigate to the device, and then “Device configuration” -> “Enable persistent logging”.

Hi Sradevski,

Please see my answers to your questions below.

What exactly do you have to do in order to “reset” a device?

  • Disconnect the power and reconnect it.

Are those devices pingable on their local IP when they go silent?

  • I cannot check, because the devices are installed on site and the people who help me reset them do not have the skills to check this. I will check it myself after I set up a test bench at home.

What do you see in the dashboard for those devices when this happens?

  • The device status shows “Offline” and “last online xxx hours ago”.

Do you have a screenshot of the device status in the dashboard from that time?

Hi, we have a scheduled discussion on upgrading balenaOS to the latest version for this device type, since it has not received any updates for a long time now. We will update you as soon as we have an answer on how we are going to proceed with it.
Thanks,
Zahari

Hi Zahari,

Thank you so much for your answer and the update.

In the update, could you please work on the “switchserialmode” command, as I asked in another post? I think one of the main reasons people use the IoT2040 is that they need to run it in industrial settings and connect to machines using MODBUS.

Could you also build in some of the popular USB-to-serial drivers, such as FTDI, CP210x, and CH341? I would prefer to use the image provided by Balena instead of building my own image.

Best regards,

Hi,
I have one more piece of information for you.
The time that passes before the devices stop working is nearly the same at 2 different sites.
I rebooted these 2 devices from the Dashboard at the same time (14/11/2019 6:04pm +0800). I have just checked and found that they stopped working at the same time.
The time of the check in the screenshot is 16/11/2019 11:30am +0800.
Please see the picture below.

Dear Balena team,

I have more information to share.
Today I sent a technician to check the system after it went offline.
The status of the IoT2040 was as follows.

  • Power and USB indicator both on.
  • Indicator on the Ethernet port in use (X1P1LAN): Green solid on, Orange off.
  • The application was still running, as confirmed by MODBUS polling still working.
  • Ping to IoT2040 failed.

Then I had him unplug the LAN cable and plug it back in.
The result:

  • Indicator on X1P1LAN : Green solid on, Orange always off.
  • Still failed to ping to IoT2040.

Then I had him reset the 4G router that the IoT2040 is connected to (while the IoT2040 was still running) by unplugging the router’s power cable and plugging it back in.
The result:

  • Indicator on X1P1LAN : Green solid on, Orange always off.
  • Still failed to ping to IoT2040.

Then I had him unplug the LAN cable from X1P1LAN and connect it to the other port, X2P1LAN, of the IoT2040.
The result:

  • Indicator on X2P1LAN : Green solid on, Orange blinking.
  • Successful ping to IoT2040.
  • The device came back Online (as seen on the dashboard).

Then I had him unplug the LAN cable from X2P1LAN and connect it back to port X1P1LAN of the IoT2040.
The result:

  • Indicator on X1P1LAN : Green solid on, Orange blinking.
  • Successful ping to IoT2040.
  • The device stayed Online (as seen on the dashboard).

Then I connected to the device via SSH from the dashboard.
I found that:

  • The application was still running well.
  • The device had been running without any reset since the last manual reset (checked with the uptime command).

I have uploaded all the logs that I know how to get with this post.
Please check them.

boiler002_uptime_2019-11-18_1252 0700.log (108 Bytes) boiler002_balena_logs_2019-11-18_1250 0700.log (137.5 KB) boiler002_dmesg_log_2019-11-18_1246 0700.log (39.0 KB)

Hi,

Thank you very much for the logs, this is very useful. It appears from the kernel log that the X1P1LAN link physically went down, although obviously we don’t know why. Given that this is happening consistently for you on multiple devices after 2 days, it sounds like this may potentially be an OS issue. Unfortunately, as you know, you’re on the latest version of the OS.
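
A link drop like the one mentioned above can be looked for in a saved kernel log with a short script. The following is only a minimal sketch: it assumes the Ethernet driver writes lines containing "Link is Up" or "Link is Down" (the exact wording depends on the driver and is not confirmed for this device), and the function name find_link_events is hypothetical.

```python
import re
import sys

# Assumed wording of link-state messages; adjust it to what your dmesg shows.
LINK_EVENT = re.compile(r"link is (up|down)", re.IGNORECASE)


def find_link_events(dmesg_text):
    """Return (state, line) pairs for every line that reports a link change."""
    events = []
    for line in dmesg_text.splitlines():
        match = LINK_EVENT.search(line)
        if match:
            events.append((match.group(1).lower(), line.strip()))
    return events


# Usage: python find_link_events.py <saved dmesg log>
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for state, line in find_link_events(log_file.read()):
            print(state, "|", line)
```

If the script prints a "down" line with no later "up" line, that would be consistent with the link staying down until the cable was moved to the other port.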

I’ve looked into the device (https://dashboard.balena-cloud.com/devices/5535031758354b8023db7effb5d4cc1a/summary) this morning, but it appears to have been rebooted 4 hours ago so I can’t actually glean much from the logs. Would you mind enabling persistent logs on this device ASAP, so that should it go offline again logs are stored that we could potentially look at later?

We’re due to have a discussion this week about IoT20xx support and we will be raising the issue of moving to a newer OS. We’ll obviously let you know as soon as we have more information.

Thanks and best regards,

Heds

Hi Heds,

I have enabled persistent logging on this device.
But it seems the supervisor doesn’t support persistent logging yet.
The supervisor version on the device is 7.4.3.

Best regards,
Burin Sapsiri

Hi Heds,

Are there any logs that you need? Please tell me how to get them.
When the problem occurs next time, I will get those logs for you.

Best regards,
Burin Sapsiri
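
One way to capture the link state around the next failure is to record it periodically to a file that can be read after the device is reachable again. This is a rough sketch, not an official Balena tool: it assumes the Linux sysfs entries /sys/class/net/<iface>/carrier and operstate are visible from where the script runs (inside a container this may require host networking), and the interface name eth0, the log path and the 60-second interval are all placeholders.

```python
import time
from pathlib import Path


def read_link_state(iface, sysfs_root="/sys/class/net"):
    """Read carrier and operstate for one interface from sysfs."""
    base = Path(sysfs_root) / iface
    state = {}
    for name in ("carrier", "operstate"):
        try:
            state[name] = (base / name).read_text().strip()
        except OSError:
            # Reading carrier can fail, e.g. when the interface is set down.
            state[name] = "unreadable"
    return state


def format_record(iface, state, now=None):
    """Build one timestamped log line; now=None means the current time."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S%z", time.localtime(now))
    return f"{stamp} {iface} carrier={state['carrier']} operstate={state['operstate']}"


def log_forever(iface="eth0", log_path="/data/link_state.log", interval_s=60):
    """Append one record per interval; iface and log_path are placeholders."""
    while True:
        with open(log_path, "a", encoding="utf-8") as log:
            log.write(format_record(iface, read_link_state(iface)) + "\n")
        time.sleep(interval_s)
```

Calling log_forever() from a small service would give a timeline showing when carrier changed from 1 to 0, which could then be compared with the kernel log.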