Trouble with openBalena after internet outage

OK,
The development vs. production image concept has been cleared up - we will stay on production images :slight_smile:

Also, glad that

> balena ssh uuid

now works (which probably makes this explanation no longer relevant: HowTo: SSH into host device).

The openBalena server has been deployed with docker-compose…
Here are the stats: http://prntscr.com/q323xn - they seem pretty normal/sane to me …
After watching the stats for several minutes, the only spike I saw was at:
70efdf44e81c openbalena_s3_1 ~3 to 5 %

In the docker logs, most containers had nothing special - the only ones with something informative are:

DB:
2019-11-27 17:46:44.466 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2019-11-27 17:46:44.466 UTC [1] LOG: listening on IPv6 address "::", port 5432
2019-11-27 17:46:44.590 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2019-11-27 17:46:44.842 UTC [23] LOG: database system was interrupted; last known up at 2019-11-26 01:14:29 UTC
2019-11-27 17:46:45.394 UTC [23] LOG: database system was not properly shut down; automatic recovery in progress
2019-11-27 17:46:45.509 UTC [23] LOG: redo starts at 0/132FAFD0
2019-11-27 17:46:45.644 UTC [23] LOG: invalid record length at 0/132FC8F8: wanted 24, got 0
2019-11-27 17:46:45.644 UTC [23] LOG: redo done at 0/132FC8C0
2019-11-27 17:46:45.644 UTC [23] LOG: last completed transaction was at log time 2019-11-26 01:18:13.78291+00
2019-11-27 17:46:49.101 UTC [1] LOG: database system is ready to accept connections
2019-11-27 17:47:07.200 UTC [30] ERROR: relation "uniq_model_model_type_vocab" already exists
2019-11-27 17:47:07.200 UTC [30] STATEMENT: CREATE UNIQUE INDEX "uniq_model_model_type_vocab" ON "model" ("is of-vocabulary", "model type");

cert-provider:
[Error] ACTIVE variable is not enabled. Value should be "true" or "yes" to continue.
[Error] Unable to continue due to misconfiguration. See errors above. [Stopping]
[Error] ACTIVE variable is not enabled. Value should be "true" or "yes" to continue.
[Error] Unable to continue due to misconfiguration. See errors above. [Stopping]
[Error] ACTIVE variable is not enabled. Value should be "true" or "yes" to continue.
[Error] Unable to continue due to misconfiguration. See errors above. [Stopping]
[Error] ACTIVE variable is not enabled. Value should be "true" or "yes" to continue.
[Error] Unable to continue due to misconfiguration. See errors above. [Stopping]

haproxy:
Building certificate from environment variables…
Setting up watches. Beware: since -r was given, this may take a while!
Watches established.
[NOTICE] 330/174655 (15) : New worker #1 (17) forked
[WARNING] 330/174655 (17) : Server backend_api/balena_api_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 330/174655 (17) : backend 'backend_api' has no server available!
[WARNING] 330/174656 (17) : Server backend_registry/balena_registry_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 330/174656 (17) : backend 'backend_registry' has no server available!
[WARNING] 330/174656 (17) : Server backend_vpn/balena_vpn_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 330/174656 (17) : backend 'backend_vpn' has no server available!
[WARNING] 330/174656 (17) : Server backend_s3/balena_s3_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 330/174656 (17) : backend 'backend_s3' has no server available!
[WARNING] 330/174657 (17) : Server vpn-tunnel/balena_vpn is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[ALERT] 330/174657 (17) : proxy 'vpn-tunnel' has no server available!
[WARNING] 330/174700 (17) : Server backend_vpn/balena_vpn_1 is UP, reason: Layer4 check passed, check duration: 0ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
[WARNING] 330/174703 (17) : Server vpn-tunnel/balena_vpn is UP, reason: Layer4 check passed, check duration: 0ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
[WARNING] 330/174704 (17) : Server backend_registry/balena_registry_1 is UP, reason: Layer4 check passed, check duration: 0ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
[WARNING] 330/174713 (17) : Server backend_api/balena_api_1 is UP, reason: Layer4 check passed, check duration: 0ms. 1 active and 0 backup servers online. 0 sessions requeued, 0 total in queue.
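
To make the haproxy output easier to read, here is a rough Python sketch that keeps only the last UP/DOWN status per server and lists whatever is still DOWN. It is not part of openBalena; the function names `last_known_state` and `still_down` are made up for illustration, and the regex assumes the "Server <backend>/<name> is UP|DOWN" wording seen in the log lines above.

```python
import re

# Matches HAProxy health-check lines such as:
#   [WARNING] 330/174656 (17) : Server backend_s3/balena_s3_1 is DOWN, reason: ...
# Assumption: the wording matches the log excerpt pasted above.
STATUS_RE = re.compile(r"Server (\S+) is (UP|DOWN)\b")


def last_known_state(log_text):
    """Return {server: "UP" or "DOWN"} from the last status line seen per server."""
    state = {}
    for line in log_text.splitlines():
        match = STATUS_RE.search(line)
        if match:
            server, status = match.groups()
            state[server] = status  # later lines overwrite earlier ones
    return state


def still_down(log_text):
    """Return a sorted list of servers whose most recent status is DOWN."""
    return sorted(
        server
        for server, status in last_known_state(log_text).items()
        if status == "DOWN"
    )
```

Saving the haproxy container log to a file and passing its contents to `still_down(...)` would return the backends with no later UP line. In the excerpt pasted above that would be backend_s3/balena_s3_1, although it may simply have come up after the point where the excerpt ends.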

Judging by these, most of the entries are related to the reboot of the openBalena VM after it ran out of memory…
Not sure if this sheds any light on the matter…