After update, access to I2C interface is lost

Hi there

I just updated an application yesterday, but after the update, I2C stopped working on one of the devices. Every time a service tries to access I2C, the following error message is produced:
No such file or directory: ‘/dev/i2c-1’

This error message is produced whether my Python service tries to access the I2C interface or I access it manually from the command line with i2c-tools:
i2cdetect -y 1

It runs on a Raspberry Pi 3, and was working fine before the update.
However, I have another device running the same release on the exact same hardware, and there everything still works fine after the update.

I tried rebooting, and running:
modprobe i2c_dev
modprobe i2c_bcm2835

but still no luck. lsmod|grep i2c produces the following:
i2c_bcm2835 16384 0
i2c_bcm2708 16384 0
i2c_dev 16384 0
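
The symptom described here (I2C modules loaded, but no /dev/i2c-1 node) can also be checked programmatically. The following is a minimal Python sketch, not part of the original service: the function names are hypothetical, and it assumes a module listing in the lsmod or /proc/modules format shown above, where the first column is the module name.

```python
import glob
import os


def loaded_i2c_modules(modules_text):
    """Return the names of listed kernel modules that start with 'i2c'."""
    names = []
    for line in modules_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("i2c"):
            names.append(fields[0])
    return names


def i2c_device_nodes(dev_dir="/dev"):
    """Return the sorted list of i2c-* entries found under dev_dir."""
    return sorted(glob.glob(os.path.join(dev_dir, "i2c-*")))


def diagnose(modules_text, dev_dir="/dev"):
    """Summarise the I2C state as a short string."""
    if i2c_device_nodes(dev_dir):
        return "ok"
    if loaded_i2c_modules(modules_text):
        return "modules loaded, no device node"
    return "no modules, no device node"
```

On the device, calling diagnose(open("/proc/modules").read()) should report "modules loaded, no device node" for the state described in this post. Loading the modules alone may not create the node if the bus itself is not enabled in the boot configuration.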

I’m stuck here, so what would be the next step to fix the problem?
I really hope you can help!

Best regards
Johan

Setup info:
Hardware: Raspberry Pi 3 Model B
OS version: balenaOS 2.38.0+rev1
Supervisor version: 9.15.7

Hi, what kind of update did you perform? Did you update balenaOS or the application code itself?
Did you already try switching the hardware between the two devices?

Hi @thundron, and thanks for the reply!
I updated the application code, and I also moved the device from one application to the current one (at about the same time as I pushed the update).
After being moved and updated, everything other than I2C seems to be working fine.

The device is deployed in the field, so I cannot get physical access to the device to try to switch hardware.

Hi @JohanEThomsen, could you please enable support access and share the device URL?

If you’d like to keep the device URL private, please send it via private message.

It’s strange that the same application code runs fine on another device with the exact same hardware. The issue might be related to the application migration. It’d be good to check the device logs and see if they provide more context.

Hi again

Yes I just enabled it now. I sent you the URL in a private message.
It’s much appreciated!

Hello again @JohanEThomsen, checking the device’s kernel logs, I saw under-voltage errors happening frequently:

[47439.673827] brcmfmac: power management disabled
[47505.405136] Under-voltage detected! (0x00050005)
[47509.565160] Voltage normalised (0x00000000)
[47694.685710] Under-voltage detected! (0x00050005)
[47698.845715] Voltage normalised (0x00000000)
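
To see how often the supply dips and for how long, the log lines above can be paired up. This is a minimal Python sketch, assuming the kernel log format shown in the excerpt (a bracketed timestamp in seconds followed by the message); the function name is hypothetical.

```python
import re

# Matches lines such as "[47505.405136] Under-voltage detected! (0x00050005)"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def under_voltage_episodes(dmesg_text):
    """Pair each 'Under-voltage detected!' with the next 'Voltage normalised'.

    Returns a list of (start_seconds, duration_seconds) tuples.
    """
    episodes = []
    start = None
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp = float(match.group(1))
        message = match.group(2)
        if message.startswith("Under-voltage detected!"):
            start = timestamp
        elif message.startswith("Voltage normalised") and start is not None:
            episodes.append((start, round(timestamp - start, 6)))
            start = None
    return episodes
```

Applied to the excerpt above, this yields two episodes of roughly 4.16 seconds each, about three minutes apart.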

Then when I was running a test to detect possible SD card corruption, your device went offline. Unfortunately I cannot access it anymore. Do you have physical access to the device? Could you please power cycle it?

My theory here is that this device was underpowered for a while. The under-voltage caused SD card corruption, which in turn corrupted the I2C-related libraries. If this is true, flashing a new SD card and powering the device at the correct voltage would fix the issue. It’d be great if you could try it out.

If the device comes back online, we could have a further look at the logs and try other ways to detect SD card corruption.

Hi again

Ah okay, that might be part of the issue.
I just checked, and it should be online again now.

No, the thing is that the device is deployed at a remote site, so I don’t have physical access to it.
How do you check for SD card corruption, and is there any way to fix it remotely?

Best regards

Hello, we tried running the corruption check a couple more times, but each time it either closes the SSH connection or crashes the device. As Firat mentioned, the root cause might be the under-voltage, which may have corrupted the SD card. If that is the case, there is not much that can be done remotely: the SD card needs to be replaced, and it’s probably also worth checking that the power supply meets the required specs. On the other hand, I don’t see the main service erroring on I2C anymore. Did you change something in the code for that device?

Okay now I tried moving the device again to another application, and now everything is working again!

It would make sense that voltage errors could result in corrupted files, but my guess is that this was not the case here, as an application move fixed the issue.
Do you have an idea of what happens under the hood when a device is moved that could fix this issue? Just to get an idea of how to counteract it in the future.

The main service stopped erroring because I changed the application code so that the container no longer keeps restarting, which had been preventing me from getting terminal access to the service. I did this simply by putting an infinite loop in the startup script.

If you check the two applications’ Fleet Configuration pages, and the “Define DT parameters [RESIN_HOST_CONFIG_dtparam]” section, what’s the value set there? Is it the same for both? Is there any other Fleet configuration that differs between the two applications?

The I2C settings on the Raspberry Pi are enabled through those configurations, and that would be my suspect: one application may have the value empty, while the other has a value such as "i2c_arm=on","spi=on","audio=on", which is a quite common setting. The first of those entries (or maybe the first two) is what enables I2C functionality properly.

So I would recommend looking at the Fleet and Device configuration pages and comparing them; maybe that will give a hint. If not, we can try to look further.
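
If it helps, the comparison can be done mechanically once the two values are copied out of the dashboard. Below is a minimal Python sketch, assuming the value has the quoted, comma-separated form shown above; the function names are hypothetical, and it only looks at the i2c_arm entry, not at any aliases.

```python
def parse_dtparams(value):
    """Parse a value such as '"i2c_arm=on","spi=on"' into a dict."""
    params = {}
    for item in value.split(","):
        item = item.strip().strip('"')
        if not item:
            continue
        key, _, setting = item.partition("=")
        params[key] = setting
    return params


def i2c_arm_enabled(value):
    """Return True if the value contains i2c_arm=on."""
    return parse_dtparams(value).get("i2c_arm") == "on"


def diff_dtparams(value_a, value_b):
    """Return {key: (setting_in_a, setting_in_b)} for entries that differ."""
    a = parse_dtparams(value_a)
    b = parse_dtparams(value_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }
```

For example, diff_dtparams applied to the working fleet’s value and an empty value lists every entry as present on one side and None on the other, and i2c_arm_enabled returns False for the empty one.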