Fold on Pi 4 keeps restarting the UI

Hi Everyone, I wonder if someone can offer me some advice.

I have just set up the Fold client on a 4GB Pi 4, having deployed it via my dashboard as per https://github.com/balenalabs/rosetta-at-home. The only change I made was adding my own Rosetta account ID using the “ACCOUNT_KEY” environment variable before booting the Pi. Everything seems to be going OK in the computational sense, but the UI is broken. I cannot connect to the unit’s UI directly or via the public-facing IP, which reports that a “tunneling socket could not be established: 500”.

Looking at the dashboard I see the UI repeatedly alternating between “Running” and “Downloaded” and the terminal repeatedly showing restarts and disconnects.

25.11.20 09:43:35 (+0000)
25.11.20 09:43:35 (+0000) > foldforcovid-dashboard@1.0.0 start /usr/app/dashboard
25.11.20 09:43:35 (+0000) > node server.js
25.11.20 09:43:35 (+0000)
25.11.20 09:43:37 (+0000) Service exited ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:37 (+0000) Killed service ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:38 (+0000) Installing service ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:39 (+0000) Installed service ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:39 (+0000) Starting service ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:42 (+0000) Started service ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:42 (+0000) 2020/11/25 09:43:42 Permitting clients to write input to the PTY.
25.11.20 09:43:42 (+0000) 2020/11/25 09:43:42 Server is starting with command: boinctui
25.11.20 09:43:42 (+0000) 2020/11/25 09:43:42 URL: http://127.0.0.1:8080/
25.11.20 09:43:42 (+0000) 2020/11/25 09:43:42 URL: (redacted)/
25.11.20 09:43:42 (+0000) 2020/11/25 09:43:42 URL: (redacted)/
25.11.20 09:43:43 (+0000) Killing service ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:44 (+0000)
25.11.20 09:43:44 (+0000) > foldforcovid-dashboard@1.0.0 start /usr/app/dashboard
25.11.20 09:43:44 (+0000) > node server.js
25.11.20 09:43:44 (+0000)
25.11.20 09:43:46 (+0000) Service exited ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:46 (+0000) Killed service ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:48 (+0000) Installing service ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:48 (+0000) Installed service ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:48 (+0000) Starting service ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:50 (+0000) Started service ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:50 (+0000) 2020/11/25 09:43:50 Permitting clients to write input to the PTY.
25.11.20 09:43:50 (+0000) 2020/11/25 09:43:50 Server is starting with command: boinctui
25.11.20 09:43:50 (+0000) 2020/11/25 09:43:50 URL: http://127.0.0.1:8080/
25.11.20 09:43:50 (+0000) 2020/11/25 09:43:50 URL: (redacted)/
25.11.20 09:43:50 (+0000) 2020/11/25 09:43:50 URL: (redacted)/
25.11.20 09:43:52 (+0000) Killing service ‘ui sha256:490(redacted, always the same)174d8’
25.11.20 09:43:52 (+0000)

… over and over again. Trying to stop and then restart the UI service from the dashboard leaves the service completely inoperable until a reboot, at which point the whole process starts again.

My Fold instance resulted in a host version of balenaOS 2.60.1+rev5, Supervisor 11.14.0, target release f79ec6d. There are no errors or warnings in my health checks or device diagnostics. I am seeing credits appearing against my Rosetta account so the number crunching seems to be OK, it’s just the UI that appears to be broken out of the box.

CPU is at approx 79%, temperature is OK at 69 degrees, memory is 3.1 / 3.8GB and storage 2.6 GB / 27.8 GB.
Nothing else is running on this Pi at all.

Please, any thoughts or suggestions would be welcomed.

Hi @GOTO_GOSUB, thanks for setting up the Fold client.

From your description, it does sound like the issue is limited to the UI part of the application. It looks like other users are experiencing this too, as I found it reported in the project’s repository: https://github.com/balenalabs/rosetta-at-home/issues/55

On the device dashboard, can you go to the Diagnostics page and click Run checks?
It would be good to know if a check is failing.

Furthermore, can you open SSH access to HostOS and run the following command to retrieve the supervisor logs?
journalctl -fn 100 -u resin-supervisor

Hi @gelbal, thank you for the prompt response.

The diagnostics page does not indicate any errors under “Health Checks”, and the output of the “Device Diagnostics” is huge. I have therefore saved the results into text files and placed them on my Google Drive, here:

Also in that folder you will find the results of the command line instruction you requested.

I hope you find something useful in those files - please let me know if I can do anything else to help you identify the problem.

Thank you for sharing the log files - they were useful :+1:
A colleague (@pipex) was able to reproduce the issue, and the balena team is investigating the root cause in order to fix it.

Meanwhile, in case you are interested in a workaround, we found that the problem is avoided by adding cgroup_enable=memory to the cmdline.txt file. This text file can be found in the resin-boot partition of the SD card after it is flashed.

Take the SD card out of the Pi, connect it to a laptop or desktop computer, and edit the cmdline.txt file in the resin-boot partition. (This is easy on macOS, as the Finder app automatically mounts the resin-boot partition. On Linux and Windows, additional steps are required - let us know if additional help is needed.) Then insert the SD card into the Pi 4 again.
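
For anyone who would rather script the edit than make it by hand, here is a minimal Python sketch of the same change. It assumes cmdline.txt holds all kernel arguments on a single line separated by spaces, and that none of them contain quoted spaces. The function names and the example path are only illustrative, not part of any balena tooling.

```python
from pathlib import Path


def add_kernel_arg(cmdline: str, arg: str = "cgroup_enable=memory") -> str:
    """Return the cmdline.txt contents with `arg` appended if it is missing."""
    # Assumption: arguments are whitespace-separated on one line.
    args = cmdline.split()
    if arg not in args:
        args.append(arg)
    # Write back as a single line, as the kernel command line expects.
    return " ".join(args) + "\n"


def patch_cmdline_file(path: str) -> None:
    """Rewrite the given cmdline.txt in place (hypothetical helper)."""
    p = Path(path)
    p.write_text(add_kernel_arg(p.read_text()))
```

On macOS the call might look like `patch_cmdline_file("/Volumes/resin-boot/cmdline.txt")`, assuming Finder mounted the partition under that name. Running it twice is harmless, because the argument is only added when it is not already present.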

Hi @pdcastro, thank you for taking a look at my log files.

I reckon I can make the required change to the cmdline.txt file so I will give that a go and report back with my findings shortly.

Thank you @pdcastro, your workaround fix is working for me. :clap:

Hi, just to let you know that we have opened a ticket for this issue (https://github.com/balena-os/balena-raspberrypi/pull/575) and we will notify this thread once it is fixed. The fix simply adds the cgroup_enable=memory kernel command line argument by default, which is the same setting you are now using as a workaround.
Thanks for letting us know about this problem.
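
For anyone who wants to confirm that the argument took effect after a reboot, one option is to look at /proc/cgroups on the host OS, where the last column of the "memory" row should be 1 when the memory controller is enabled. The sketch below parses that file's contents in Python. The function name is made up for illustration, and it assumes the usual four-column layout (subsys_name, hierarchy, num_cgroups, enabled).

```python
def memory_cgroup_enabled(proc_cgroups: str) -> bool:
    """Return True if the 'memory' row in /proc/cgroups text is enabled."""
    for line in proc_cgroups.splitlines():
        if line.startswith("#"):
            continue  # skip the header row
        fields = line.split()
        # Assumed columns: subsys_name, hierarchy, num_cgroups, enabled
        if len(fields) == 4 and fields[0] == "memory":
            return fields[3] == "1"
    # No 'memory' row at all: treat as not enabled.
    return False
```

On the device this could be called as `memory_cgroup_enabled(open("/proc/cgroups").read())`. Checking that /proc/cmdline contains cgroup_enable=memory is a simpler alternative if Python is not available on the host.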

Thank you for the update, @alexgg. It looks like work is happening on this today.