Raspberry Pi 4 - Keeps restarting upload

Deploys to my Raspberry Pi 4 (device type: “Raspberry Pi 4 (using 64bit OS) (BETA)”) keep restarting…

If I deploy one image, all is OK. But when I try to deploy multiple images (including one large image), the upload of the large image never completes; it just goes round and round in a loop, uploading the image over and over.

Hi there, that sounds strange indeed. Can you enable support access and share the UUID of the device with us? You can do it either in a PM or just paste it here. We will have a look at the logs and see what might be the cause of this.

Thanks - I will get a device set up that is reloading constantly for you (currently I have had to deploy by adding one image at a time) and then send you the UUID…

Much appreciated

Hi

The UUID of the device is:
4f61308f597ec8a9a736dd634e49fbbe

It's in restarting mode at the moment - it gets halfway through, then restarts all the containers at the same time…

Hi @walpoletim, we’ve been taking a look at your device, but it is incredibly slow to navigate and read log files on. Do you have another SD card you could try?

We’ve run the diagnostic checks on your device and the check_write_latency check is failing, showing slow disk writes. Additionally, dmesg shows the following errors:

[167355.950212] INFO: task kworker/2:3:6610 blocked for more than 120 seconds.
[167355.957749]       Tainted: G         C        4.19.66 #1
[167355.964054] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[167355.972792] kworker/2:3     D    0  6610      2 0x00000028
[167355.978810] Workqueue: events_freezable mmc_rescan
[167355.984161] Call trace:
[167355.987126]  __switch_to+0xa8/0xe8
[167355.991862]  __schedule+0x254/0x850
[167355.995572]  schedule+0x38/0x98
[167355.998878]  __mmc_claim_host+0xb8/0x200
[167356.002980]  mmc_get_card+0x38/0x48
[167356.006632]  mmc_sd_detect+0x24/0x90
[167356.010374]  mmc_rescan+0xd0/0x370
[167356.013916]  process_one_work+0x1ec/0x458
[167356.018115]  worker_thread+0x48/0x430
[167356.021927]  kthread+0x130/0x138
[167356.025351]  ret_from_fork+0x10/0x1c

Give another SD card a try and let us know what happens. I hope this helps!
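In the meantime, if you want to sanity-check a card yourself, a rough sequential write test from the host OS shell is usually enough to spot a struggling card. This is just a sketch - the file path is an example (adjust it to wherever your data partition is mounted), and dd's summary output varies between GNU and BusyBox builds:

# Quick-and-dirty sequential write test; writes ~100 MB with a
# final sync, then cleans up. A healthy card should report well
# over 10 MB/s; one that takes minutes points at the storage.
dd if=/dev/zero of=/mnt/data/ddtest bs=1M count=100 conv=fsync
rm /mnt/data/ddtest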

Amazing, thanks - I will get the card swapped and let you know…

Tim

Hi

I have swapped to a new card, and it's still the same issue…

This is the new UUID:
137e61a08fcb56db13a8fa1f9e5c72d2

The network is fine - we are getting a constant 100 Mb down on fibre…

The Pi is directly connected to the Ethernet switch.

Hi,

Can you please share which SD card make/model you are using? We recommend the SanDisk Extreme Pro SD cards.

I took a look at the device.
The balenaEngine healthcheck times out (after 6 minutes). This restarts the engine and the supervisor, which in turn restarts the download.

The reason the balenaEngine healthcheck times out feels like a slow SD card. For example:

On a balenaFin (which is based on the Pi CM3):

root@7ac2b83:~# time bash -x /usr/lib/balena/balena-healthcheck 
+ set -o errexit
+ balena info
+ balena ps
+ balena run --rm --log-driver none --network none hello-world

real    0m3.959s
user    0m0.643s
sys     0m0.122s
root@7ac2b83:~# 

The healthcheck is consistent and completes in under 5 seconds…

But on your device, the time varies and can go up quite a bit:

root@137e61a:~# time bash -x /usr/lib/balena/balena-healthcheck 
+ set -o errexit
+ balena info
+ balena ps
+ balena run --rm --log-driver none --network none hello-world

real    0m30.664s
user    0m0.233s
sys     0m0.108s
root@137e61a:~# time bash -x /usr/lib/balena/balena-healthcheck 
+ set -o errexit
+ balena info
+ balena ps
+ balena run --rm --log-driver none --network none hello-world

real    0m1.853s
user    0m0.247s
sys     0m0.082s
root@137e61a:~# time bash -x /usr/lib/balena/balena-healthcheck 
+ set -o errexit
+ balena info
+ balena ps
+ balena run --rm --log-driver none --network none hello-world

real    0m1.870s
user    0m0.225s
sys     0m0.131s
root@137e61a:~# time bash -x /usr/lib/balena/balena-healthcheck 
+ set -o errexit
+ balena info
+ balena ps
+ balena run --rm --log-driver none --network none hello-world

real    0m1.898s
user    0m0.261s
sys     0m0.088s
root@137e61a:~# time bash -x /usr/lib/balena/balena-healthcheck 
+ set -o errexit
+ balena info
+ balena ps
+ balena run --rm --log-driver none --network none hello-world

real    0m1.868s
user    0m0.256s
sys     0m0.080s
root@137e61a:~# time bash -x /usr/lib/balena/balena-healthcheck 
+ set -o errexit
+ balena info
+ balena ps
+ balena run --rm --log-driver none --network none hello-world

real    0m12.315s
user    0m0.235s
sys     0m0.119s
root@137e61a:~# time bash -x /usr/lib/balena/balena-healthcheck 
+ set -o errexit
+ balena info
+ balena ps
+ balena run --rm --log-driver none --network none hello-world

real    0m14.328s
user    0m0.220s
sys     0m0.114s
root@137e61a:~# time bash -x /usr/lib/balena/balena-healthcheck 
+ set -o errexit
+ balena info
+ balena ps
+ balena run --rm --log-driver none --network none hello-world

real    0m8.249s
user    0m0.236s
sys     0m0.108s
root@137e61a:~# 
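If you want to quantify that variance, a simple loop like this (just a sketch; it assumes the host OS shell is bash, which the traces above suggest) collects a batch of timings in one go:

# Run the healthcheck 20 times so slow outliers like the 30 s
# run above stand out immediately.
for i in $(seq 1 20); do
    echo "run $i:"
    time bash /usr/lib/balena/balena-healthcheck >/dev/null
done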

The Pi 4 is pretty new and still going through various improvements, so it could be that a firmware bump fixes some SD card read/write behaviour. Or it could be the make/model of your SD cards.
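For what it's worth, you can check which firmware builds a device is running from the host OS shell (this assumes the Raspberry Pi userland tools are present, which they normally are on Pi images):

# Pi 4 EEPROM bootloader build - the component that has been
# receiving frequent fixes since launch:
vcgencmd bootloader_version
# VideoCore firmware build shipped with the OS image:
vcgencmd version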

Regards
ZubairLK

Hi

The cards are SanDisk Extreme (34 GB)

Tim

Hmm. This is strange…

Any thoughts would be good…

Is there any way of increasing the timeouts?

Having to deploy container by container is a real pain.

Thanks

Tim

Increasing the timeout at runtime is on the feature/issue list, but it is not implemented yet.
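That said, if you are comfortable experimenting, the healthcheck timeout should live in the engine's systemd unit, so in principle a drop-in override can raise it. This is an unsupported sketch - the unit name balena.service and the WatchdogSec mechanism are my assumptions here, and the change may not survive a host OS update:

# Unsupported sketch: raise the engine watchdog timeout via a
# systemd drop-in (assumes the 6-minute timeout is WatchdogSec
# on balena.service; verify with `systemctl cat balena.service`).
mkdir -p /etc/systemd/system/balena.service.d
cat > /etc/systemd/system/balena.service.d/watchdog.conf <<'EOF'
[Service]
WatchdogSec=900
EOF
systemctl daemon-reload
systemctl restart balena.service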

I’ve stopped the supervisor and stopped the mongo container.
Our diagnostics still report slow write latency:

check_write_latency	Failed	Slow disk writes detected: mmcblk0: 4235.17 ms/write (sample size 160009), mmcblk0p5: 3593.36 ms/write (sample size 85), mmcblk0p6: 4236.77 ms/write (sample size 159875)
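(For anyone following along: you can derive comparable numbers yourself from /proc/diskstats, though note this gives the average since boot rather than a sampled window like the diagnostic uses.)

# Average ms per completed write for each mmcblk device, using
# /proc/diskstats (field 8 = writes completed, field 11 = ms
# spent writing, both cumulative since boot).
awk '$3 ~ /^mmcblk/ && $8 > 0 {
    printf "%s: %.2f ms/write over %d writes\n", $3, $11/$8, $8
}' /proc/diskstats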

It still feels like something in this Pi 4 firmware/kernel/SD card combination is off.

Could it be a fake SD card?

Hi

Unlikely - it was purchased in the UK from a camera shop…

As I said, I have used this card in the Pi 3 and had no issues…

Would it be worth looking at the Pi 3's logs to see if it is showing the same issue?

I have put the old card from the Pi 4 into the Pi 3, so you should be able to see how those images came down this morning…

This is the Pi 3's UUID:
c57355a26f73cb150eb3e9001690bb20

Can you push the same mongo application to this Pi 3?

At this point, I suspect it could be a Pi 4 firmware issue that I'm unaware of. Large network downloads were something that was being fixed, but I can't remember when that fix went in.

Ahhh - that makes sense. It fails more with large containers (one is 2 GB, as it has Rust, Python, SciPy, NumPy and snips_nlp)…

What options do we have with the firmware?

Thanks for your help!

Pushing Mongo now…

Hi there,

Please let us know the results of your test on the Pi 3 so we can continue helping you debug this issue.