[Troubleshooting] HAProxy backend servers down

'lo all:

I’ve been banging my head against the wall for the past couple of days and was hoping one of you could help.

I’ve been attempting to follow the openBalena Getting Started guide on a VPS. I keep running into issues with haproxy, and I haven’t been able to find much through Googling.

The immediate issue appears to be a misconfiguration of haproxy, with output such as the following from ./scripts/compose up:

haproxy_1_73e64c768a62 | [WARNING] 133/204330 (17) : Server backend_api/resin_api_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_73e64c768a62 | [ALERT] 133/204330 (17) : backend 'backend_api' has no server available!

I suspect the issue may actually lie elsewhere, though, since the api container doesn’t appear to be doing anything. Running ./scripts/compose exec api journalctl --no-pager (as per this forum post) results in No journal files were found.

I’m pretty sure my DNS is configured correctly; I’ve set all the relevant subdomain CNAMEs and nslookup shows them as having properly propagated.

The only configuration change I’ve made from the defaults is setting ACTIVE: "true" in ./compose/services.yml, as cert-provider will complain and exit otherwise. I’ve tested with and without this change though, and haproxy doesn’t seem to care in either case.
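For reference, the change amounts to something like this in ./compose/services.yml (a sketch from memory; the surrounding keys may differ slightly in your checkout):

```yaml
cert-provider:
  environment:
    ACTIVE: "true"   # enables cert-provider (Let's Encrypt issuance); defaults to "false"
```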

I’ve posted the outputs of a couple of commands below in hopes that they’ll be useful. Any help would be greatly appreciated.

Thanks!


balena@vps:~/open-balena$ dig api.<MYDOMAIN>

; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> api.<MYDOMAIN>
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11557
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;api.<MYDOMAIN>.       IN      A

;; ANSWER SECTION:
api.<MYDOMAIN>. 108    IN      CNAME   <MYDOMAIN>.
<MYDOMAIN>.    108     IN      A       <MYIP>

;; Query time: 6 msec
;; SERVER: 8.8.4.4#53(8.8.4.4)
;; WHEN: Tue May 14 21:19:33 UTC 2019
;; MSG SIZE  rcvd: 82


The following is the full output of ./scripts/compose up:

Starting openbalena_db_1_cd9595acbb38            ... done
Starting openbalena_cert-provider_1_b28198981d70 ... done
Starting openbalena_redis_1_a3ba71d90e7b         ... done
Starting openbalena_s3_1_608b450642cc            ... done
Starting openbalena_api_1_67b575c88613           ... done
Starting openbalena_vpn_1_f83383c7ec5b           ... done
Starting openbalena_registry_1_e3faba8d4340      ... done
Starting openbalena_haproxy_1_73e64c768a62       ... done
Attaching to openbalena_db_1_cd9595acbb38, openbalena_s3_1_608b450642cc, openbalena_cert-provider_1_b28198981d70, openbalena_redis_1_a3ba71d90e7b, openbalena_api_1_67b575c88613, openbalena_registry_1_e3faba8d4340, openbalena_vpn_1_f83383c7ec5b, openbalena_haproxy_1_73e64c768a62
db_1_cd9595acbb38 | 2019-05-14 20:43:17.977 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
db_1_cd9595acbb38 | 2019-05-14 20:43:17.978 UTC [1] LOG:  listening on IPv6 address "::", port 5432
db_1_cd9595acbb38 | 2019-05-14 20:43:17.996 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
db_1_cd9595acbb38 | 2019-05-14 20:43:18.901 UTC [22] LOG:  database system was interrupted; last known up at 2019-05-14 20:42:03 UTC
db_1_cd9595acbb38 | 2019-05-14 20:43:19.013 UTC [22] LOG:  database system was not properly shut down; automatic recovery in progress
db_1_cd9595acbb38 | 2019-05-14 20:43:19.016 UTC [22] LOG:  redo starts at 0/1638190
db_1_cd9595acbb38 | 2019-05-14 20:43:19.017 UTC [22] LOG:  invalid record length at 0/16381C8: wanted 24, got 0
db_1_cd9595acbb38 | 2019-05-14 20:43:19.017 UTC [22] LOG:  redo done at 0/1638190
db_1_cd9595acbb38 | 2019-05-14 20:43:19.492 UTC [1] LOG:  database system is ready to accept connections
s3_1_608b450642cc | Systemd init system enabled.
cert-provider_1_b28198981d70 | [Info] VALIDATION not set. Using default: http-01
cert-provider_1_b28198981d70 | [Info] Waiting for api.<MYDOMAIN> to be available via HTTP...
cert-provider_1_b28198981d70 | (1/6) Retrying in 5 seconds...
cert-provider_1_b28198981d70 | (2/6) Retrying in 5 seconds...
redis_1_a3ba71d90e7b | 1:C 14 May 2019 20:43:20.535 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis_1_a3ba71d90e7b | 1:C 14 May 2019 20:43:20.535 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=1, just started
redis_1_a3ba71d90e7b | 1:C 14 May 2019 20:43:20.535 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
redis_1_a3ba71d90e7b | 1:M 14 May 2019 20:43:20.596 * Running mode=standalone, port=6379.
redis_1_a3ba71d90e7b | 1:M 14 May 2019 20:43:20.596 # Server initialized
redis_1_a3ba71d90e7b | 1:M 14 May 2019 20:43:20.596 * DB loaded from disk: 0.000 seconds
redis_1_a3ba71d90e7b | 1:M 14 May 2019 20:43:20.596 * Ready to accept connections
api_1_67b575c88613 | Systemd init system enabled.
registry_1_e3faba8d4340 | Systemd init system enabled.
vpn_1_f83383c7ec5b | Systemd init system enabled.
haproxy_1_73e64c768a62 | Building certificate from environment variables...
haproxy_1_73e64c768a62 | Setting up watches.  Beware: since -r was given, this may take a while!
haproxy_1_73e64c768a62 | Watches established.
haproxy_1_73e64c768a62 | [NOTICE] 133/204330 (15) : New worker #1 (17) forked
haproxy_1_73e64c768a62 | [WARNING] 133/204330 (17) : Server backend_api/resin_api_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_73e64c768a62 | [ALERT] 133/204330 (17) : backend 'backend_api' has no server available!
haproxy_1_73e64c768a62 | [WARNING] 133/204330 (17) : Server backend_registry/resin_registry_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_73e64c768a62 | [ALERT] 133/204330 (17) : backend 'backend_registry' has no server available!
haproxy_1_73e64c768a62 | [WARNING] 133/204330 (17) : Server backend_vpn/resin_vpn_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_73e64c768a62 | [ALERT] 133/204330 (17) : backend 'backend_vpn' has no server available!
cert-provider_1_b28198981d70 | (3/6) Retrying in 5 seconds...
haproxy_1_73e64c768a62 | [WARNING] 133/204331 (17) : Server vpn-tunnel/balena_vpn is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_73e64c768a62 | [ALERT] 133/204331 (17) : proxy 'vpn-tunnel' has no server available!
cert-provider_1_b28198981d70 | (4/6) Retrying in 5 seconds...
cert-provider_1_b28198981d70 | (5/6) Retrying in 5 seconds...
cert-provider_1_b28198981d70 | (6/6) Retrying in 5 seconds...
cert-provider_1_b28198981d70 | [Error] Unable to access api.<MYDOMAIN> on port 80. This is needed for certificate validation. [Stopping]

Hey, which provider’s VPS are you trying to deploy this on?

Hey, you only need to set ACTIVE = true if you plan to use a cert provider such as Let’s Encrypt. The logs you see that suggest an error condition when that variable is “false” are innocuous and can be ignored. Can you try again with ACTIVE = false?

@imrehg VPS is through Hostinger, DNS through Dynadot.

@dfunckt Ah gotcha. Setting ACTIVE = false results in much the same output though, save cert-provider’s complaints:

Starting openbalena_s3_1_608b450642cc            ... done
Starting openbalena_db_1_cd9595acbb38            ... done
Starting openbalena_redis_1_a3ba71d90e7b         ... done
Starting openbalena_cert-provider_1_b28198981d70 ... done
Starting openbalena_api_1_67b575c88613           ... done
Starting openbalena_registry_1_e3faba8d4340      ... done
Starting openbalena_vpn_1_f83383c7ec5b           ... done
Starting openbalena_haproxy_1_73e64c768a62       ... done
Attaching to openbalena_db_1_cd9595acbb38, openbalena_redis_1_a3ba71d90e7b, openbalena_s3_1_608b450642cc, openbalena_cert-provider_1_b28198981d70, openbalena_api_1_67b575c88613, openbalena_registry_1_e3faba8d4340, openbalena_vpn_1_f83383c7ec5b, openbalena_haproxy_1_73e64c768a62
db_1_cd9595acbb38 | 2019-05-15 20:22:22.795 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
db_1_cd9595acbb38 | 2019-05-15 20:22:22.795 UTC [1] LOG:  listening on IPv6 address "::", port 5432
db_1_cd9595acbb38 | 2019-05-15 20:22:22.798 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
db_1_cd9595acbb38 | 2019-05-15 20:22:22.997 UTC [22] LOG:  database system was shut down at 2019-05-15 20:21:56 UTC
db_1_cd9595acbb38 | 2019-05-15 20:22:23.618 UTC [1] LOG:  database system is ready to accept connections
redis_1_a3ba71d90e7b | 1:C 15 May 2019 20:22:22.619 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis_1_a3ba71d90e7b | 1:C 15 May 2019 20:22:22.619 # Redis version=5.0.4, bits=64, commit=00000000, modified=0, pid=1, just started
redis_1_a3ba71d90e7b | 1:C 15 May 2019 20:22:22.619 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
redis_1_a3ba71d90e7b | 1:M 15 May 2019 20:22:22.637 * Running mode=standalone, port=6379.
redis_1_a3ba71d90e7b | 1:M 15 May 2019 20:22:22.637 # Server initialized
redis_1_a3ba71d90e7b | 1:M 15 May 2019 20:22:22.692 * DB loaded from disk: 0.054 seconds
redis_1_a3ba71d90e7b | 1:M 15 May 2019 20:22:22.692 * Ready to accept connections
s3_1_608b450642cc | Systemd init system enabled.
cert-provider_1_b28198981d70 | [Error] ACTIVE variable is not enabled. Value should be "true" or "yes" to continue.
cert-provider_1_b28198981d70 | [Error] Unable to continue due to misconfiguration. See errors above. [Stopping]
api_1_67b575c88613 | Systemd init system enabled.
registry_1_e3faba8d4340 | Systemd init system enabled.
vpn_1_f83383c7ec5b | Systemd init system enabled.
haproxy_1_73e64c768a62 | Building certificate from environment variables...
haproxy_1_73e64c768a62 | Setting up watches.  Beware: since -r was given, this may take a while!
haproxy_1_73e64c768a62 | Watches established.
haproxy_1_73e64c768a62 | [NOTICE] 134/202232 (15) : New worker #1 (17) forked
haproxy_1_73e64c768a62 | [WARNING] 134/202232 (17) : Server backend_api/resin_api_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_73e64c768a62 | [ALERT] 134/202232 (17) : backend 'backend_api' has no server available!
haproxy_1_73e64c768a62 | [WARNING] 134/202233 (17) : Server backend_registry/resin_registry_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_73e64c768a62 | [ALERT] 134/202233 (17) : backend 'backend_registry' has no server available!
haproxy_1_73e64c768a62 | [WARNING] 134/202233 (17) : Server backend_vpn/resin_vpn_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_73e64c768a62 | [ALERT] 134/202233 (17) : backend 'backend_vpn' has no server available!
haproxy_1_73e64c768a62 | [WARNING] 134/202234 (17) : Server vpn-tunnel/balena_vpn is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_73e64c768a62 | [ALERT] 134/202234 (17) : proxy 'vpn-tunnel' has no server available!

And of course, balena login errors out with:

BalenaRequestError: Request error: <html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
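That 503 page is haproxy’s own error response, which at least confirms DNS and the frontend are reachable; the failure is between haproxy and the api container. A quick check from the VPS itself shows the same thing (the /ping path is an assumption about the API’s health endpoint; any path returning 503 tells the same story):

```shell
# Hit haproxy directly; a 503 here means haproxy answered but has no healthy backend.
curl -sk -o /dev/null -w '%{http_code}\n' https://api.<MYDOMAIN>/ping
```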

Hi,

Ok, so it looks like the main services are starting, but there is possibly an issue in the API container. Could you try looking at the journalctl logs for the open-balena-api service in the API container? This can be done by executing the journalctl command in the API container, which is aliased for you in the scripts as logs:

logs api -fn100

This will take the last 100 lines and then tail the output. If you could post any errors etc here, that would be great.

Actually, that only applies in the Vagrant VM - you will need to do the following:

./scripts/compose exec api journalctl -u open-balena-api -fn100

Thanks for the note!

open-balena-api doesn’t seem to run at all:

balena@vps:~/open-balena$ ./scripts/compose exec api journalctl -u open-balena-api -fn100
No journal files were found.

Querying for any units also fails:

balena@vps43862673:~/open-balena$ ./scripts/compose exec api journalctl
No journal files were found.
-- No entries --

It seems like journald never actually starts:

balena@vps:~/open-balena$ ./scripts/compose exec api ps -A
  PID TTY          TIME CMD
    1 ?        00:00:00 systemd
    6 ?        00:00:00 sleep
   14 pts/0    00:00:00 ps
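Since systemd is PID 1 but journald (and everything else) is missing, it may be worth asking systemd directly what it thinks happened. These are standard systemctl invocations, though I haven’t verified how they behave inside this particular container:

```shell
# List any units systemd tried and failed to start.
./scripts/compose exec api systemctl list-units --state=failed
# Check the state of journald itself.
./scripts/compose exec api systemctl status systemd-journald
```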

@sinclair something seems to be really broken in your setup. Could you confirm that you have pulled the latest version of the master branch of openBalena, then delete the ./config directory and run the ./scripts/quickstart ... process again? Please keep a log of everything and post it here; hopefully we can see from that what is happening.

@richbayliss

For the sake of fully redoing the process, I reset the server to standard Ubuntu 18.04 and proceeded from there. Long story short, I’m ending up in exactly the same spot, with the api container just not doing anything. I did run into a couple of bumps along the way, but I don’t see anything in the logs that would indicate an insurmountable problem.

root@vps:~# apt-get update && apt-get install -y build-essential git
root@vps:~# adduser balena
root@vps:~# usermod -aG sudo balena
root@vps:~# apt-get install docker.io
root@vps:~# usermod -aG docker balena
root@vps:~# curl -L https://github.com/docker/compose/releases/download/1.23.0-rc3/docker-compose-Linux-x86_64 -o /usr/local/bin/docker-compose
root@vps:~# chmod +x /usr/local/bin/docker-compose
root@vps:~# apt-get install libssl-dev
root@vps:~# apt-get install nodejs
root@vps:~# apt-get install npm

No issues (output cut for brevity).

root@vps:~# su balena
balena@vps:~$ cd ~/
balena@vps:~$ git clone https://github.com/balena-io/open-balena.git
balena@vps:~$ cd open-balena
balena@vps:~/open-balena$ git log | head -n 5
commit 3bf14a2140ac30f801c877c74a199640649e54b6
Author: Resin CI <34882892+balena-ci@users.noreply.github.com>
Date:   Fri May 10 18:22:12 2019 +0300

    v1.1.1
balena@vps:~/open-balena$ ./scripts/quickstart -U <MYEMAIL> -P <MYPASSWORD> -d <MYDOMAIN>
==> Creating new configuration at: /home/balena/open-balena/config
==> Bootstrapping easy-rsa...
  - Downloading easy-rsa...
==> Generating root CA cert...
==> Generating root cert chain for haproxy...
==> Generating token auth cert...
==> Generating VPN CA, cert and dhparam (this may take a while)...
==> Setting up environment...
==> Adding default compose file...
==> Success!
  - Start the instance with: ./scripts/compose up -d
  - Stop the instance with: ./scripts/compose stop
  - Use the following certificate with Balena CLI: /home/balena/open-balena/config/certs/root/ca.crt

No troubles so far.

balena@vps:~/open-balena$ ./scripts/compose up
Creating network "openbalena_default" with the default driver
Creating volume "openbalena_certs" with default driver
Creating volume "openbalena_cert-provider" with default driver
Creating volume "openbalena_db" with default driver
Creating volume "openbalena_redis" with default driver
Creating volume "openbalena_registry" with default driver
Creating volume "openbalena_s3" with default driver
Pulling redis (redis:alpine)...
alpine: Pulling from library/redis
e7c96db7181b: Pull complete
b8af38234db8: Pull complete
c7b5e96879e0: Pull complete
979331f0d18b: Pull complete
e3d24058052e: Pull complete
b65b87029df3: Pull complete
Pulling db (balena/open-balena-db:v2.0.3)...
v2.0.3: Pulling from balena/open-balena-db
f7e2b70d04ae: Pull complete
027ad848ac9c: Pull complete
7c040ef66643: Pull complete
b891079ad2eb: Pull complete
cb64a97e42d9: Pull complete
1b88625f7d89: Pull complete
a6ac0b663e77: Pull complete
594497f0a694: Pull complete
dad6671cf9e4: Pull complete
031e395c48c5: Pull complete
94bb003366d6: Pull complete
9f5ee56c6727: Pull complete
9f7973e52e7c: Pull complete
d2a24cf07a9a: Pull complete
3ce003e0ceba: Pull complete
Building cert-provider
Step 1/11 : FROM alpine
latest: Pulling from library/alpine
e7c96db7181b: Already exists
Digest: sha256:769fddc7cc2f0a1c35abb2f91432e8beecf83916c421420e6a6da9f8975464b6
Status: Downloaded newer image for alpine:latest
 ---> 055936d39205
Step 2/11 : EXPOSE 80
 ---> Running in 6ef94966b23b
Removing intermediate container 6ef94966b23b
 ---> 3a93b72d8c93
Step 3/11 : WORKDIR /usr/src/app
 ---> Running in 2a91794d8001
Removing intermediate container 2a91794d8001
 ---> 44019fb2f30f
Step 4/11 : VOLUME [ "/usr/src/app/certs" ]
 ---> Running in 8b6f91ed47ba
Removing intermediate container 8b6f91ed47ba
 ---> 49674b319bc7
Step 5/11 : RUN apk add --update bash curl git openssl ncurses socat
 ---> Running in e2df205ec7eb
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/community/x86_64/APKINDEX.tar.gz
(1/16) Installing ncurses-terminfo-base (6.1_p20190105-r0)
(2/16) Installing ncurses-terminfo (6.1_p20190105-r0)
(3/16) Installing ncurses-libs (6.1_p20190105-r0)
(4/16) Installing readline (7.0.003-r1)
(5/16) Installing bash (4.4.19-r1)
Executing bash-4.4.19-r1.post-install
(6/16) Installing ca-certificates (20190108-r0)
(7/16) Installing nghttp2-libs (1.35.1-r0)
(8/16) Installing libssh2 (1.8.2-r0)
(9/16) Installing libcurl (7.64.0-r1)
(10/16) Installing curl (7.64.0-r1)
(11/16) Installing expat (2.2.6-r0)
(12/16) Installing pcre2 (10.32-r1)
(13/16) Installing git (2.20.1-r0)
(14/16) Installing ncurses (6.1_p20190105-r0)
(15/16) Installing openssl (1.1.1b-r1)
(16/16) Installing socat (1.7.3.2-r5)
Executing busybox-1.29.3-r10.trigger
Executing ca-certificates-20190108-r0.trigger
OK: 31 MiB in 30 packages
Removing intermediate container e2df205ec7eb
 ---> 0d06ab0d2bc5
Step 6/11 : RUN git clone https://github.com/Neilpang/acme.sh.git &&     cd acme.sh &&     git checkout 08357e3cb0d80c84bdaf3e42ce0e439665387f57 . &&     ./acme.sh --install      --cert-home /usr/src/app/certs
 ---> Running in ab8a34a6b624
Cloning into 'acme.sh'...
[Sun May 19 02:16:25 UTC 2019] Installing to /root/.acme.sh
[Sun May 19 02:16:25 UTC 2019] Installed to /root/.acme.sh/acme.sh
[Sun May 19 02:16:25 UTC 2019] No profile is found, you will need to go into /root/.acme.sh to use acme.sh
[Sun May 19 02:16:26 UTC 2019] Installing cron job
[Sun May 19 02:16:26 UTC 2019] Good, bash is found, so change the shebang to use bash as preferred.
[Sun May 19 02:16:28 UTC 2019] OK
Removing intermediate container ab8a34a6b624
 ---> 88bf17e30b40
Step 7/11 : COPY entry.sh /entry.sh
 ---> 648a56ce8ccc
Step 8/11 : COPY cert-provider.sh ./cert-provider.sh
 ---> 1cbe2b3c2b9e
Step 9/11 : COPY fake-le-bundle.pem ./
 ---> 31196814bb21
Step 10/11 : ENTRYPOINT [ "/entry.sh" ]
 ---> Running in d2ff4f9eee46
Removing intermediate container d2ff4f9eee46
 ---> adbdf9025e60
Step 11/11 : CMD [ "/usr/src/app/cert-provider.sh" ]
 ---> Running in 62dc76c49c7f
Removing intermediate container 62dc76c49c7f
 ---> 6fd7245189c0
Successfully built 6fd7245189c0
Successfully tagged openbalena_cert-provider:latest
WARNING: Image for service cert-provider was built because it did not already exist. To rebuild this image you must use `docker-compose build` or `docker-compose up --build`.
Pulling s3 (balena/open-balena-s3:v2.6.2)...
v2.6.2: Pulling from balena/open-balena-s3
22dbe790f715: Pull complete
417fb282227e: Pull complete
0b592aa7ebd4: Pull complete
b92ebe9e7cc4: Pull complete
aa08e70f73ba: Pull complete
577c6a896d0e: Pull complete
dc3c3434bf41: Pull complete
c5f438b6290f: Pull complete
86840dc98271: Pull complete
5a82baa7b904: Pull complete
7bc36acb5127: Pull complete
9ff553a54788: Pull complete
3ae20e6a3dcb: Pull complete
e1e775bc6b19: Pull complete
2822433bcfd3: Pull complete
e3269537733a: Pull complete
bcce7b5c2b5b: Pull complete
0317148ae6f3: Pull complete
67ec0dd97efb: Pull complete
a2562b82dc10: Pull complete
c01b68bd8038: Pull complete
08d34353a24f: Pull complete
f30cbdac80aa: Pull complete
e9b28f006ab2: Pull complete
Pulling api (balena/open-balena-api:v0.11.8)...
v0.11.8: Pulling from balena/open-balena-api
741437d97401: Pull complete
85537557ba17: Pull complete
803afbe1ea06: Pull complete
4afcb8500079: Pull complete
98bb1bf8a697: Pull complete
b5b32132995a: Pull complete
122c79f0d959: Pull complete
bf3cb4acc7ea: Pull complete
a74d03bf9d43: Pull complete
a95f15685efb: Pull complete
1e57eeb0bada: Pull complete
cb28591468c2: Pull complete
aa26ebf99b85: Pull complete
3d1d6453186a: Pull complete
45f5aec8ac2c: Pull complete
a4db7fce896f: Pull complete
b3233e34b264: Pull complete
bda131880261: Pull complete
a71feb60926f: Pull complete
c6884aca18b1: Pull complete
1094d3c50fac: Pull complete
f4448983199b: Pull complete
c1615c21cb84: Pull complete
2309145d3574: Pull complete
380beb8581b5: Pull complete
Pulling vpn (balena/open-balena-vpn:v8.10.0)...
v8.10.0: Pulling from balena/open-balena-vpn
22dbe790f715: Already exists
417fb282227e: Already exists
0b592aa7ebd4: Already exists
b92ebe9e7cc4: Already exists
aa08e70f73ba: Already exists
577c6a896d0e: Already exists
dc3c3434bf41: Already exists
c5f438b6290f: Already exists
86840dc98271: Already exists
5a82baa7b904: Already exists
7bc36acb5127: Already exists
9ff553a54788: Already exists
3ae20e6a3dcb: Already exists
e1e775bc6b19: Already exists
2822433bcfd3: Already exists
e3269537733a: Already exists
bcce7b5c2b5b: Already exists
0317148ae6f3: Already exists
67ec0dd97efb: Already exists
a2562b82dc10: Already exists
f9ed068ccdb7: Pull complete
13e911de4647: Pull complete
4fed9cc64f16: Pull complete
1e86af3d8a7f: Pull complete
0f33fcf455ad: Pull complete
0abcc1732d7a: Pull complete
3cfb9aecc98e: Pull complete
162b7a3fedf3: Pull complete
f00dcee69ac7: Pull complete
Pulling registry (balena/open-balena-registry:v2.5.0)...
v2.5.0: Pulling from balena/open-balena-registry
22dbe790f715: Already exists
417fb282227e: Already exists
0b592aa7ebd4: Already exists
b92ebe9e7cc4: Already exists
aa08e70f73ba: Already exists
577c6a896d0e: Already exists
dc3c3434bf41: Already exists
c5f438b6290f: Already exists
86840dc98271: Already exists
5a82baa7b904: Already exists
7bc36acb5127: Already exists
9ff553a54788: Already exists
3ae20e6a3dcb: Already exists
e1e775bc6b19: Already exists
2822433bcfd3: Already exists
e3269537733a: Already exists
bcce7b5c2b5b: Already exists
0317148ae6f3: Already exists
67ec0dd97efb: Already exists
a2562b82dc10: Already exists
e22bbbabeef3: Pull complete
60300453844d: Pull complete
0f2a48a0fb1f: Pull complete
95973fc2dbe0: Pull complete
a51b8f8015da: Pull complete
Building haproxy
Step 1/6 : FROM haproxy:1.9-alpine
1.9-alpine: Pulling from library/haproxy
e7c96db7181b: Already exists
4a05bceaa581: Pull complete
230edb02294b: Pull complete
Digest: sha256:0f89766aa82f67060a11fd201ab867aff0667d9bbceff0dc2fd27f88831bdb97
Status: Downloaded newer image for haproxy:1.9-alpine
 ---> 41a07a3b8897
Step 2/6 : VOLUME [ "/certs" ]
 ---> Running in 12c60d2e186f
Removing intermediate container 12c60d2e186f
 ---> 2c104f027bb6
Step 3/6 : RUN apk add --update inotify-tools
 ---> Running in eaf5fad9430a
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/community/x86_64/APKINDEX.tar.gz
(1/1) Installing inotify-tools (3.20.1-r1)
Executing busybox-1.29.3-r10.trigger
OK: 6 MiB in 18 packages
Removing intermediate container eaf5fad9430a
 ---> d2e1552f9308
Step 4/6 : COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg
 ---> b3be2ffcca03
Step 5/6 : COPY start-haproxy.sh /start-haproxy
 ---> 4189d85c9617
Step 6/6 : CMD /start-haproxy
 ---> Running in b4c63a1c9db2
Removing intermediate container b4c63a1c9db2
 ---> 365b0146835f
Successfully built 365b0146835f
Successfully tagged openbalena_haproxy:latest
WARNING: Image for service haproxy was built because it did not already exist. To rebuild this image you must use `docker-compose build` or `docker-compose up --build`.
Creating openbalena_db_1_a98686ee9e2b ... 
Creating openbalena_redis_1_80f98d5c21a5 ... 
Creating openbalena_s3_1_74e059c36e0f    ... 
Creating openbalena_cert-provider_1_ae75aa3e02ce ... 

ERROR: for openbalena_cert-provider_1_ae75aa3e02ce  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

ERROR: for openbalena_db_1_a98686ee9e2b  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

ERROR: for openbalena_redis_1_80f98d5c21a5  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

ERROR: for openbalena_s3_1_74e059c36e0f  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

ERROR: for db  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

ERROR: for redis  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

ERROR: for s3  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)

ERROR: for cert-provider  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)
ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).
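If this recurs, the fix suggested by the error message itself would be raising the client-side timeout before retrying:

```shell
# Raise docker-compose's HTTP timeout from the default 60 seconds;
# first-time volume/container creation on a small VPS can exceed it.
export COMPOSE_HTTP_TIMEOUT=120
./scripts/compose up -d
```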

Strange; what happens if we run again?

balena@vps:~/open-balena$ ./scripts/compose up
openbalena_redis_1_951ad77fac5f is up-to-date
openbalena_db_1_e08e5d53d7fc is up-to-date
openbalena_s3_1_d1a8e870ae4c is up-to-date
openbalena_cert-provider_1_c5cb9972fe19 is up-to-date
Creating openbalena_api_1_13408d80ffe7 ... done
Creating openbalena_registry_1_5d9f5a1a7761 ... done
Creating openbalena_vpn_1_32a5e9e40137      ... done
Creating openbalena_haproxy_1_30df91cbccd2  ... error

ERROR: for openbalena_haproxy_1_30df91cbccd2  Cannot start service haproxy: driver failed programming external connectivity on endpoint openbalena_haproxy_1_60f2415d3af8 (cc5422a802aca49ffa13f84a398d0e86b4cffd6d17f6edfddbaaca3d5aeb6a18): Error starting userland proxy: listen tcp 0.0.0.0:80: bind: address already in use

ERROR: for haproxy  Cannot start service haproxy: driver failed programming external connectivity on endpoint openbalena_haproxy_1_60f2415d3af8 (cc5422a802aca49ffa13f84a398d0e86b4cffd6d17f6edfddbaaca3d5aeb6a18): Error starting userland proxy: listen tcp 0.0.0.0:80: bind: address already in use
ERROR: Encountered errors while bringing up the project.
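Something is already bound to port 80; a quick way to see what (ss ships with iproute2 on Ubuntu):

```shell
# Show the process listening on TCP port 80.
sudo ss -ltnp 'sport = :80'
```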

Oh, whoops, forgot to stop apache:

balena@vps:~/open-balena$ sudo systemctl stop apache2
[sudo] password for balena: 
balena@vps:~/open-balena$ ./scripts/compose up
openbalena_db_1_e08e5d53d7fc is up-to-date
openbalena_redis_1_951ad77fac5f is up-to-date
openbalena_cert-provider_1_c5cb9972fe19 is up-to-date
openbalena_s3_1_d1a8e870ae4c is up-to-date
openbalena_api_1_4cad9ff00e48 is up-to-date
openbalena_vpn_1_eb1ff70cce31 is up-to-date
openbalena_registry_1_ceffb5c7032d is up-to-date
Starting openbalena_haproxy_1_60f2415d3af8 ... done
Attaching to openbalena_db_1_e08e5d53d7fc, openbalena_redis_1_951ad77fac5f, openbalena_cert-provider_1_c5cb9972fe19, openbalena_s3_1_d1a8e870ae4c, openbalena_api_1_4cad9ff00e48, openbalena_vpn_1_eb1ff70cce31, openbalena_registry_1_ceffb5c7032d, openbalena_haproxy_1_60f2415d3af8
db_1_e08e5d53d7fc | The files belonging to this database system will be owned by user "postgres".
db_1_e08e5d53d7fc | This user must also own the server process.
db_1_e08e5d53d7fc | 
db_1_e08e5d53d7fc | The database cluster will be initialized with locale "en_US.utf8".
db_1_e08e5d53d7fc | The default database encoding has accordingly been set to "UTF8".
db_1_e08e5d53d7fc | The default text search configuration will be set to "english".
db_1_e08e5d53d7fc | 
db_1_e08e5d53d7fc | Data page checksums are disabled.
db_1_e08e5d53d7fc | 
db_1_e08e5d53d7fc | fixing permissions on existing directory /var/lib/postgresql/data ... ok
db_1_e08e5d53d7fc | creating subdirectories ... ok
db_1_e08e5d53d7fc | selecting default max_connections ... 100
db_1_e08e5d53d7fc | selecting default shared_buffers ... 128MB
db_1_e08e5d53d7fc | selecting dynamic shared memory implementation ... posix
db_1_e08e5d53d7fc | creating configuration files ... ok
db_1_e08e5d53d7fc | running bootstrap script ... ok
db_1_e08e5d53d7fc | performing post-bootstrap initialization ... ok
db_1_e08e5d53d7fc | 
db_1_e08e5d53d7fc | WARNING: enabling "trust" authentication for local connections
db_1_e08e5d53d7fc | You can change this by editing pg_hba.conf or using the option -A, or
db_1_e08e5d53d7fc | --auth-local and --auth-host, the next time you run initdb.
db_1_e08e5d53d7fc | syncing data to disk ... ok
db_1_e08e5d53d7fc | 
db_1_e08e5d53d7fc | Success. You can now start the database server using:
db_1_e08e5d53d7fc | 
db_1_e08e5d53d7fc |     pg_ctl -D /var/lib/postgresql/data -l logfile start
db_1_e08e5d53d7fc | 
db_1_e08e5d53d7fc | waiting for server to start....2019-05-19 02:31:36.302 UTC [41] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
db_1_e08e5d53d7fc | 2019-05-19 02:31:36.399 UTC [42] LOG:  database system was shut down at 2019-05-19 02:31:35 UTC
db_1_e08e5d53d7fc | 2019-05-19 02:31:36.633 UTC [41] LOG:  database system is ready to accept connections
db_1_e08e5d53d7fc |  done
db_1_e08e5d53d7fc | server started
db_1_e08e5d53d7fc | CREATE DATABASE
db_1_e08e5d53d7fc | 
db_1_e08e5d53d7fc | 
db_1_e08e5d53d7fc | /usr/local/bin/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/create-resin-db.sh
db_1_e08e5d53d7fc | CREATE DATABASE
db_1_e08e5d53d7fc | 
db_1_e08e5d53d7fc | 2019-05-19 02:31:38.273 UTC [41] LOG:  received fast shutdown request
db_1_e08e5d53d7fc | waiting for server to shut down....2019-05-19 02:31:38.274 UTC [41] LOG:  aborting any active transactions
db_1_e08e5d53d7fc | 2019-05-19 02:31:38.281 UTC [41] LOG:  worker process: logical replication launcher (PID 48) exited with exit code 1
db_1_e08e5d53d7fc | 2019-05-19 02:31:38.281 UTC [43] LOG:  shutting down
db_1_e08e5d53d7fc | 2019-05-19 02:31:38.299 UTC [41] LOG:  database system is shut down
db_1_e08e5d53d7fc |  done
db_1_e08e5d53d7fc | server stopped
db_1_e08e5d53d7fc | 
db_1_e08e5d53d7fc | PostgreSQL init process complete; ready for start up.
db_1_e08e5d53d7fc | 
db_1_e08e5d53d7fc | 2019-05-19 02:31:38.450 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
db_1_e08e5d53d7fc | 2019-05-19 02:31:38.450 UTC [1] LOG:  listening on IPv6 address "::", port 5432
db_1_e08e5d53d7fc | 2019-05-19 02:31:38.451 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
db_1_e08e5d53d7fc | 2019-05-19 02:31:38.535 UTC [69] LOG:  database system was shut down at 2019-05-19 02:31:38 UTC
db_1_e08e5d53d7fc | 2019-05-19 02:31:38.814 UTC [1] LOG:  database system is ready to accept connections
haproxy_1_60f2415d3af8 | Building certificate from environment variables...
redis_1_951ad77fac5f | 1:C 19 May 2019 02:31:32.124 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis_1_951ad77fac5f | 1:C 19 May 2019 02:31:32.125 # Redis version=5.0.5, bits=64, commit=00000000, modified=0, pid=1, just started
redis_1_951ad77fac5f | 1:C 19 May 2019 02:31:32.125 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
redis_1_951ad77fac5f | 1:M 19 May 2019 02:31:32.202 * Running mode=standalone, port=6379.
redis_1_951ad77fac5f | 1:M 19 May 2019 02:31:32.202 # Server initialized
redis_1_951ad77fac5f | 1:M 19 May 2019 02:31:32.203 * Ready to accept connections
cert-provider_1_c5cb9972fe19 | [Error] ACTIVE variable is not enabled. Value should be "true" or "yes" to continue.
cert-provider_1_c5cb9972fe19 | [Error] Unable to continue due to misconfiguration. See errors above. [Stopping]
s3_1_d1a8e870ae4c | Systemd init system enabled.
registry_1_ceffb5c7032d | Systemd init system enabled.
api_1_4cad9ff00e48 | Systemd init system enabled.
haproxy_1_60f2415d3af8 | Setting up watches.  Beware: since -r was given, this may take a while!
haproxy_1_60f2415d3af8 | Watches established.
vpn_1_eb1ff70cce31 | Systemd init system enabled.
haproxy_1_60f2415d3af8 | [NOTICE] 138/024231 (15) : New worker #1 (17) forked
haproxy_1_60f2415d3af8 | [WARNING] 138/024231 (17) : Server backend_api/resin_api_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_60f2415d3af8 | [ALERT] 138/024231 (17) : backend 'backend_api' has no server available!
haproxy_1_60f2415d3af8 | [WARNING] 138/024231 (17) : Server backend_registry/resin_registry_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_60f2415d3af8 | [ALERT] 138/024231 (17) : backend 'backend_registry' has no server available!
haproxy_1_60f2415d3af8 | [WARNING] 138/024232 (17) : Server backend_vpn/resin_vpn_1 is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_60f2415d3af8 | [ALERT] 138/024232 (17) : backend 'backend_vpn' has no server available!
haproxy_1_60f2415d3af8 | [WARNING] 138/024233 (17) : Server vpn-tunnel/balena_vpn is DOWN, reason: Layer4 connection problem, info: "Connection refused", check duration: 0ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
haproxy_1_60f2415d3af8 | [ALERT] 138/024233 (17) : proxy 'vpn-tunnel' has no server available!
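For what it's worth, my reading of the Layer4 "Connection refused" warnings is that haproxy's TCP health check cannot connect to the service containers at all, i.e. nothing is listening on the expected port inside api/registry/vpn. Since every backend is affected, that points at the services never starting rather than at a haproxy misconfiguration. As a side note, the affected backends can be pulled out of such logs mechanically; a minimal sketch (the `log` variable just reproduces the ALERT lines above, in practice you'd pipe the compose output in):

```shell
# Extract the names of backends/proxies that haproxy reports as unavailable.
# The here-string reproduces the ALERT lines from the log above.
log="[ALERT] 138/024231 (17) : backend 'backend_api' has no server available!
[ALERT] 138/024231 (17) : backend 'backend_registry' has no server available!
[ALERT] 138/024232 (17) : backend 'backend_vpn' has no server available!
[ALERT] 138/024233 (17) : proxy 'vpn-tunnel' has no server available!"

# Match "backend 'name'" or "proxy 'name'" and print only the quoted name.
printf '%s\n' "$log" | sed -nE "s/.*: (backend|proxy) '([^']*)'.*/\2/p"
```

This prints `backend_api`, `backend_registry`, `backend_vpn`, and `vpn-tunnel`, i.e. every single backend is down.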

Back where we started. api continues to do nothing:

balena@vps:~/open-balena$ ./scripts/compose exec api journalctl
No journal files were found.
-- No entries --
balena@vps:~/open-balena$ ./scripts/compose exec api ps -A
  PID TTY          TIME CMD
    1 ?        00:00:00 systemd
    7 ?        00:00:00 sleep
   14 pts/0    00:00:00 ps
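In other words, systemd is running as PID 1 inside the api container, but no actual service processes ever come up. A small sketch that makes this explicit by filtering the `ps -A` output above (the `ps_dump` variable is a stand-in for piping `./scripts/compose exec api ps -A` directly):

```shell
# List any process other than init/housekeeping ones; an empty result
# means no API service process is running inside the container.
ps_dump='  PID TTY          TIME CMD
    1 ?        00:00:00 systemd
    7 ?        00:00:00 sleep
   14 pts/0    00:00:00 ps'

# Skip the header, take the CMD column, drop the expected baseline processes.
printf '%s\n' "$ps_dump" | awk 'NR>1 {print $4}' \
  | grep -Ev '^(systemd|sleep|ps)$' || echo "no service processes running"
```

Here it prints `no service processes running`, which matches the empty journal: nothing has been started, so nothing has logged.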

That is very strange. Just this morning I pulled down the latest version and it worked exactly as expected. I used the Vagrant VM, for which we ship a Vagrantfile in the repo (https://github.com/balena-io/open-balena/blob/master/Vagrantfile); it is also based on Ubuntu 18.04, so that matches your setup.

You mentioned having to stop Apache; I am wondering if there is something on your install which is preventing a service from starting properly, but we are unable to see the cause. Without the openBalena stack running, could you check whether anything is listening on ports 80, 443, 3128, 5432, and 6379?
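Something along these lines would do it; this is just a sketch that filters a saved dump (the `netstat_dump` variable stands in for the quiescent output you posted), whereas in practice you would pipe `sudo netstat -tulpn` (or `ss -tulpn`) in directly:

```shell
# Check whether any of the ports openBalena needs are already taken.
# The dump reproduces the quiescent netstat output below (sshd and exim only).
netstat_dump='tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      669/sshd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      930/exim4'

for port in 80 443 3128 5432 6379; do
  # Match ":<port>" in the Local Address column, followed by whitespace.
  if printf '%s\n' "$netstat_dump" | grep -Eq "[:.]$port[[:space:]]"; then
    echo "port $port: in use"
  else
    echo "port $port: free"
  fi
done
```

Against the quiescent dump this reports all five ports as free.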

Looks like it's just ssh and exim; nothing is listening on any of those ports.

balena@vps:~$ sudo netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      669/sshd            
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      930/exim4           
tcp6       0      0 :::22                   :::*                    LISTEN      669/sshd            
tcp6       0      0 ::1:25                  :::*                    LISTEN      930/exim4           

Just to make sure, here is netstat with openBalena running:

balena@vps:~$ sudo netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      669/sshd            
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      930/exim4           
tcp6       0      0 :::80                   :::*                    LISTEN      2260/docker-proxy   
tcp6       0      0 :::22                   :::*                    LISTEN      669/sshd            
tcp6       0      0 :::3128                 :::*                    LISTEN      2205/docker-proxy   
tcp6       0      0 ::1:25                  :::*                    LISTEN      930/exim4           
tcp6       0      0 :::443                  :::*                    LISTEN      2236/docker-proxy

@sinclair just an FYI that we have updated openBalena to v1.2.0, which includes new versions of the registry and API container services. You could try updating via git pull and then starting the new versions with ./scripts/compose up -d

It would be great to hear whether this has any effect, and if anything changes, whether you can now get the journalctl logs :+1:

Just tried the new v1.2.0 on a clean Ubuntu 18.04 install, and no dice. ./scripts/compose up still shows the same haproxy errors, and ./scripts/compose exec api journalctl still shows nothing.