Both my RPi4 2GBs keep resetting their calculations; the 4GB runs fine

@vinntec I rebooted mine and it came back nicely, calculating 4 tasks.
I suspended the last one, and also the 2 waiting tasks that were started automatically, so it is back to running 3 tasks.

OK the ones running only three I will leave as is for now. The others I have reduced to three tasks as you suggested. Let’s see if that makes the 4GB ones more stable. Thanks

Rather bizarrely, since the last reboot the RPi4 4GB with the monitor attached is now showing offline on the monitor - but running 4 jobs with 5 in the queue on the web interface!

I can’t see anything to say why they might reboot from time to time. The rack is not getting hot, although I only have the fans at a relatively slow speed (because full speed sounds like a helicopter taking off!).

I also see lots of messages of tasks being completed and results uploaded, but the task then disappears and doesn’t appear to get counted. If this is just the interface that is wrong, I can live with that and leave things running.

@vinntec Mine dropped offline too, both the local 7" display and the browser (connection not allowed), while running 3 tasks with 4 suspended tasks lined up. Progress shown before the browser screen refresh and on the local display was less than 1 percent for all tasks. A reboot brought it back online and showed some 20% progress on all the tasks it had been working on, so it HAD been calculating its 3 tasks while not connected to the outside world.

The RPi4 with the monitor finally caught a case where the web interface was disabled: at the same time, the monitor showed the system as offline with a load of jobs running, but no progress being shown.

So I have now turned the fans in the rack onto full (only two settings) in case it is heat related, although there was no evidence of hot RPi. Since then the web interface is going offline much more often but the display restores fairly quickly. It also appears that they are not [yet] rebooting so it will be interesting to see what happens in the long run.

The trouble is that the fans now sound like a hovercraft taking off!

@vinntec don’t you have larger fans, like Intel CPU cooling fans, spare somewhere? They produce a lot of wind with little noise if you put a resistor in series (trial and error for the size).

@pe0ter I can put up with the noise.

The interesting thing is that since I turned the fans up, the 4 x RPi4 4GB are not rebooting anymore. They drop the web interface and sometimes turn the network adapter off but then continue quite happily.

The one RPi3 (of three) which has a job reboots every 5 mins or so, but is doing its best to finish the task. This is probably memory size related.

Good morning @vinntec, this morning my RPi4-4G reboots itself about every 5 min. Before it does, its screen becomes like in The Matrix movies: vertical blur, then black, then reboot. It has decided it needs to run 4 tasks with 6 more waiting.
I don’t understand why, when a task takes some 7 hours to calculate, a follow-up task needs to wait in line. Downloading tasks takes about 1 min and the waiting tasks claim resources.
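To put rough numbers on that queue, here is a minimal sketch of how long the backlog lasts. It uses only the figures from this post (4 running, 6 waiting, about 7 hours per task); the function name `backlog_hours` is made up for illustration, and it assumes every task takes the same time and that the running tasks have only just started.

```python
import math


def backlog_hours(running, waiting, hours_per_task, slots):
    """Rough hours until the last queued task finishes.

    Assumes equal task length and that tasks run in full batches,
    one task per slot. This is an estimate, not how the project
    scheduler actually plans work.
    """
    batches = math.ceil((running + waiting) / slots)
    return batches * hours_per_task


# 4 running + 6 waiting, ~7 hours each, 4 tasks at a time
print(backlog_hours(running=4, waiting=6, hours_per_task=7, slots=4))  # 21
```

So the waiting tasks represent most of a day of work, against a download time of about a minute.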

@chrisys @vinntec @dtischler I have taken my RPi4-4GB off the Rosetta project; it is impossible to keep it running properly without continuous intervention. My 3 RPi4-2GB units perform flawlessly now and will keep on running.

@pe0ter we merged improvements earlier today which will probably be deployed to the project later today or tomorrow which will address the UI stability issues mentioned further up in the thread.

I’m interested to know what you mean about the waiting tasks consuming resources? It’s my understanding that if the host is seen to be performing well then the project will allocate more tasks and queue them up so that host is never left out of work (not 100% on where I read this, so may not be accurate).

I’ve just made another PR to reduce the setting for Pi 4 4GB devices to 3 tasks as well. https://github.com/balenalabs/rosetta-at-home/pull/48

This is likely to be merged and deployed tomorrow :+1:

@chrisys And how DOES the server see that the host is performing well and shovel more tasks to it, when I see it is very unstable and keeps on rebooting? One time it had 4 tasks running and 6 waiting. What is the use of so many tasks waiting when calculations take 7 hours… The thing is dead slow at calculating compared with my i7 with a GTX1060 graphics card running Folding@home. And Folding@home does not pre-deploy tasks…

From Adafruit on Pi heat sinks: “Note that you do not need a heatsink for the Pi 3 or 4, it will automatically adjust its speed to avoid overheating!” and “we ran ‘stress -c 4’ on the Pi 3 to run all 4 CPUs, a heat-sink-less Pi quickly dropped down to 0.96 GHz to maintain thermal equilibrium. With this heat sink installed, it could run at 1.16GHz (so, 20% higher performance)” So what is happening is that, with better cooling, your Pi is speeding up and able to service the web interface more quickly (probably). https://www.adafruit.com/product/3082
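If anyone wants to check whether their Pi really is throttling, here is a small sketch. It does two things: works out the gain from the two Adafruit clock figures, and decodes the text that `vcgencmd get_throttled` prints on Raspberry Pi OS (for example `throttled=0x50005`). The helper names are made up for illustration, the bit meanings are as I understand them from the Raspberry Pi documentation, and I am assuming you can run `vcgencmd` on the host and paste its output in; it may not be available inside the project's container.

```python
# Bit positions reported by `vcgencmd get_throttled`
# (assumed from the Raspberry Pi documentation; worth double-checking).
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output):
    """Turn a line like 'throttled=0x50005' into a list of flag names."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [name for bit, name in sorted(THROTTLE_FLAGS.items())
            if value & (1 << bit)]


def relative_gain(base_ghz, improved_ghz):
    """Fractional clock-speed gain of improved_ghz over base_ghz."""
    return improved_ghz / base_ghz - 1.0


# The two clock speeds from the Adafruit quote
print(f"{relative_gain(0.96, 1.16):.1%}")  # 20.8%

# Example value only, not taken from any Pi in this thread
print(decode_throttled("throttled=0x50005"))
```

A result of `throttled=0x0` decodes to an empty list, which would mean no throttling or under-voltage has been recorded since boot.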