Docker-compose "ipc" option works fine in local mode but not when app is deployed

Hi,

I have some issues with the deployment of one of my apps.
I recently added GPS synchronisation to chrony in my application, and it requires shared memory to work (which Docker allows with the ipc: host option).

So basically I have a balena app with gpsd running inside. I checked that gpsd is receiving GPS data correctly from my GPS module. On the host system (an x86 board running balenaOS) I remounted the root partition in rw mode and changed my chronyd config.

The sync works fine when I’m in local mode (using balena push x.x.x.x), and I can see that chrony communicates with gpsd through its shared memory (“chronyc sources” reports that the GPS source has been selected).
I then decided to deploy my app to my balena cloud instance.
I built the image locally using “balena build …” (on my computer, as it’s an x86 machine) and then deployed it using “balena deploy …”.
Everything went well (as usual; I have been using balena for a few months and it works like a charm) and I disabled local mode on my device.
The device downloaded the new app release, installed it correctly and launched it without any error messages.
I can still see that gpsd is getting good information from my GPS module, but now chrony can’t sync with it anymore.

That’s why I’m guessing it’s a shared memory issue, and maybe the ‘ipc: host’ setting in my docker-compose file was not taken into account…

Would you have any insight regarding my problem?

Thanks,
I love your work !
Jim.

Hey @jlucidar

The supervisor shouldn’t care whether the target state comes from the cloud or from a balena push, so it’s interesting you’re seeing a difference between them.

My first thought would be to try building the image on our cloud, but that shouldn’t really make a difference either. Did you have the chrony changes present on both pushes?

Could you link to the release and enable support access please?

The chrony changes were made directly on the host device (I mounted the system in rw mode), so the config was the same in both cases (and it behaved normally).

I investigated this issue further: I retrieved the target state locally through the supervisor API, and here is what I got:

{
"status": "success",
"state": {
	"local": {
		"name": "white-voice",
		"config": {
			"SUPERVISOR_LOCAL_MODE": "1",
			"SUPERVISOR_POLL_INTERVAL": "600000",
			"SUPERVISOR_VPN_CONTROL": "true",
			"SUPERVISOR_CONNECTIVITY_CHECK": "true",
			"SUPERVISOR_LOG_CONTROL": "true",
			"SUPERVISOR_DELTA": "false",
			"SUPERVISOR_DELTA_REQUEST_TIMEOUT": "30000",
			"SUPERVISOR_DELTA_APPLY_TIMEOUT": "0",
			"SUPERVISOR_DELTA_RETRY_COUNT": "30",
			"SUPERVISOR_DELTA_RETRY_INTERVAL": "10000",
			"SUPERVISOR_DELTA_VERSION": "2",
			"SUPERVISOR_OVERRIDE_LOCK": "false",
			"SUPERVISOR_PERSISTENT_LOGGING": "false"
		},
		"apps": {
			"1": {
				"appId": 1,
				"name": "localapp",
				"commit": "localrelease",
				"releaseId": 1,
				"services": [{
					"imageId": 1,
					"serviceName": "my_app",
					"appId": 1,
					"releaseId": 1,
					"serviceId": 1,
					"imageName": "local_image_my_app:latest",
					"dependsOn": null,
					"config": {
						"environment": {
							"BALENA_APP_ID": "1",
							"BALENA_APP_NAME": "localapp",
							"BALENA_SERVICE_NAME": "my_app",
							"BALENA_DEVICE_UUID": "my_uuid",
							"BALENA_DEVICE_TYPE": "intel-nuc",
							"BALENA_HOST_OS_VERSION": "balenaOS 2.31.2+rev1",
							"BALENA_SUPERVISOR_VERSION": "9.11.1",
							"BALENA_APP_LOCK_PATH": "/tmp/balena/updates.lock",
							"BALENA": "1",
							"RESIN_APP_ID": "1",
							"RESIN_APP_NAME": "localapp",
							"RESIN_SERVICE_NAME": "my_app",
							"RESIN_DEVICE_UUID": "my_uuid",
							"RESIN_DEVICE_TYPE": "intel-nuc",
							"RESIN_HOST_OS_VERSION": "balenaOS 2.31.2+rev1",
							"RESIN_SUPERVISOR_VERSION": "9.11.1",
							"RESIN_APP_LOCK_PATH": "/tmp/balena/updates.lock",
							"RESIN": "1",
							"RESIN_SERVICE_KILL_ME_PATH": "/tmp/balena/handover-complete",
							"BALENA_SERVICE_HANDOVER_COMPLETE_PATH": "/tmp/balena/handover-complete",
							"USER": "root",
							"PATH": "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
							"UDEV": "1",
							"NODE_VERSION": "8.16.0",
							"YARN_VERSION": "1.15.2",
							"WORKING_DIR": "/company/app",
							"RESIN_API_KEY": "secret",
							"BALENA_API_KEY": "secret",
							"RESIN_API_URL": "https://api.my.domain",
							"BALENA_API_URL": "https://api.my.domain",
							"RESIN_SUPERVISOR_PORT": "48484",
							"BALENA_SUPERVISOR_PORT": "48484",
							"RESIN_SUPERVISOR_API_KEY": "secret",
							"BALENA_SUPERVISOR_API_KEY": "secret",
							"RESIN_SUPERVISOR_HOST": "127.0.0.1",
							"BALENA_SUPERVISOR_HOST": "127.0.0.1",
							"RESIN_SUPERVISOR_ADDRESS": "http://127.0.0.1:48484",
							"BALENA_SUPERVISOR_ADDRESS": "http://127.0.0.1:48484"
						},
						"labels": {
							"io.balena.features.kernel-modules": "1",
							"io.balena.features.firmware": "1",
							"io.balena.features.dbus": "1",
							"io.balena.features.supervisor-api": "1",
							"io.balena.features.balena-api": "1",
							"io.balena.local.image": "1",
							"io.balena.local.service": "my_app",
							"io.balena.supervised": "true",
							"io.balena.app-id": "1",
							"io.balena.service-id": "1",
							"io.balena.service-name": "my_app",
							"io.balena.architecture": "amd64"
						},
						"privileged": true,
						"tty": true,
						"ipc": "host",
						"restart": "always",
						"networkMode": "host",
						"volumes": ["1_resin-data:/data", "/tmp/balena-supervisor/services/1/my_app:/tmp/resin", "/tmp/balena-supervisor/services/1/my_app:/tmp/balena", "/run/dbus:/host/run/dbus", "/lib/modules:/lib/modules", "/lib/firmware:/lib/firmware"],
						"image": "sha256:c151e17d0fb39f8c7c5e78dd1ec31bb289496324728333075effe4c929677a44",
						"running": true,
						"hostname": "d034791",
						"command": ["npm", "start"],
						"entrypoint": ["/usr/bin/entry.sh"],
						"stopSignal": "SIGTERM",
						"workingDir": "/company/app",
						"user": "",
						"oomKillDisable": false,
						"readOnly": false,
						"sysctls": {},
						"portMaps": [{
							"ports": {
								"internalStart": 2947,
								"internalEnd": 2947,
								"externalStart": 2947,
								"externalEnd": 2947,
								"host": "",
								"protocol": "tcp"
							}
						}],
						"capAdd": [],
						"capDrop": [],
						"cgroupParent": "",
						"devices": [],
						"dnsOpt": [],
						"extraHosts": [],
						"expose": ["2947/tcp"],
						"networks": {},
						"dns": [],
						"dnsSearch": [],
						"ulimits": {},
						"groupAdd": [],
						"healthcheck": {
							"test": ["NONE"]
						},
						"pid": "",
						"pidsLimit": 0,
						"securityOpt": [],
						"stopGracePeriod": 10,
						"tmpfs": [],
						"usernsMode": "",
						"cpuShares": 0,
						"cpuQuota": 0,
						"cpus": 0,
						"cpuset": "",
						"domainname": "",
						"macAddress": "",
						"memLimit": 0,
						"memReservation": 0,
						"oomScoreAdj": 0,
						"shmSize": 67108864
					}
				}],
				"networks": {},
				"volumes": {
					"resin-data": {
						"labels": {}
					}
				}
			}
		}
	},
	"dependent": {
		"apps": [],
		"devices": []
	}
}
} 

I then disabled local mode on the same device and it downloaded my app’s latest release.
When I queried the balena API about that release, here is what I found:

curl -X GET "https://api.my.domain/resin/release?\$filter=belongs_to__application%20eq%20<ID>" -H "Content-Type: application/json" -H "Authorization: Bearer <Token>"

result :

{
"d": [{
	"created_at": "2019-07-03T08:52:21.603Z",
	"id": 57,
	"belongs_to__application": {
		"__deferred": {
			"uri": "/resin/application(4)"
		},
		"__id": 4
	},
	"commit": "0055afcb8f3bd0e53d3bd071b0006108",
	"composition": {
		"version": "2.1",
		"networks": {},
		"volumes": {
			"resin-data": {}
		},
		"services": {
			"main": {
				"image": "my_app:647be9ff83c5552cd85208ee4bef210097a83fd8_my_app",
				"privileged": true,
				"tty": true,
				"restart": "always",
				"network_mode": "host",
				"volumes": ["resin-data:/data"],
				"labels": {
					"io.resin.features.kernel-modules": "1",
					"io.resin.features.firmware": "1",
					"io.resin.features.dbus": "1",
					"io.resin.features.supervisor-api": "1",
					"io.resin.features.resin-api": "1"
				}
			}
		}
	},
	"status": "success",
	"source": "local",
	"build_log": null,
	"start_timestamp": "2019-07-03T08:52:21.496Z",
	"end_timestamp": "2019-07-03T08:52:28.048Z",
	"update_timestamp": "2019-07-03T08:52:28.151Z",
	"__metadata": {
		"uri": "/resin/release(57)",
		"type": ""
	}
}]
}

So it seems that the ipc: host setting from my docker-compose.yml did not make it into the release composition. Maybe it’s a limitation of the API when you declare the composition of an app?

I don’t know how to enable support access, as I’m running my own open-balena server.

Here is the docker-compose.yml file I use, in case you need it:
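
The comparison above can be automated. The following is a minimal sketch, not part of any balena tooling: it assumes the compose service and the release's "main" service have already been loaded as dictionaries, and the function name and the list of keys to check are hypothetical choices made for this thread.

```python
# Keys that must survive from docker-compose.yml into the release composition.
# This list is an assumption chosen for this thread; extend it as needed.
REQUIRED_KEYS = ("ipc", "network_mode", "privileged")


def missing_service_keys(expected_service, release_service, keys=REQUIRED_KEYS):
    """Return keys set in the compose service but absent or different in the release."""
    return [
        key
        for key in keys
        if key in expected_service
        and release_service.get(key) != expected_service[key]
    ]


# Service definition from docker-compose.yml, trimmed to the relevant keys.
compose_service = {
    "privileged": True,
    "tty": True,
    "ipc": "host",
    "restart": "always",
    "network_mode": "host",
}

# "main" service from the release composition returned by the API, trimmed.
release_service = {
    "privileged": True,
    "tty": True,
    "restart": "always",
    "network_mode": "host",
}

print(missing_service_keys(compose_service, release_service))  # ['ipc']
```

Running such a check right after a deploy would flag a release whose composition silently lost a setting.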

version: '2.1'
networks: {}
volumes:
  resin-data: {}
services:
  my_app:
    build: "."
    privileged: true
    tty: true
    ports:
      - 2947:2947
    ipc: host
    restart: always
    network_mode: host
    volumes:
      - resin-data:/data
    labels:
      io.resin.features.kernel-modules: 1
      io.resin.features.firmware: 1
      io.resin.features.dbus: 1
      io.resin.features.supervisor-api: 1
      io.resin.features.resin-api: 1

That API query you posted isn’t necessarily the release that is supposed to be running on device, unless there’s only one release for this application on your openBalena instance?
Edit: I see your query requests all releases of the application, so ignore this.

I’ll take a look at the code used in balena deploy, to ensure that it doesn’t differ from the code we’re using on balena push.

I tested balena deploy on balenaCloud using your docker-compose file, and ipc: host is definitely uploaded as part of the release. Given that the code is not any different based on whether we’re pushing to an openBalena instance or the cloud, the problem doesn’t seem to be because of balena deploy.

I’ll ask the openBalena maintainers if they know why this would be different.

Boom. Found a workaround \o/

Actually, when I’m deploying a new release I do it in two steps:

  • First I build the image using this command:

    balena build -n my_app:<git_hash> -t my_app:<git_hash> -a my_app --logs
    
  • Then I deploy it to my open-balena instance using this command:

    balena deploy my_app my_app:<git_hash>_my_app
    

I activated the debug flag (DEBUG=1) on both commands and realized that the build command correctly detected the docker-compose file, but the deploy command did not: it created a default composition from the image name instead.

Build :
[Debug]   Parsing input...
[Debug]   Loading project...
[Debug]   Resolving project...
[Info]    Compose file detected
[Debug]   Creating project...
[Info]    Building for amd64/intel-nuc
[Build]   Building services...
[Build]   my_app Preparing...
[Debug]   Found build tasks:
...

Deploy :
[Debug]   Parsing input...
[Debug]   Loading project...
[Info]    Creating default composition with image: my_app:647be9ff83c5552cd85208ee4bef210097a83fd8_my_app
[Debug]   Creating project...
[Info]    Everything is up to date (use --build to force a rebuild)
[Info]    Creating release...
[Debug]   Tagging images...
[Debug]   Authorizing push...
[Debug]   Requesting access to previously pushed image repo (v2/39d84001031589a949043329271138f4)
[Info]    Pushing images to registry...
...
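
The difference between the two logs above can be checked automatically, for example as a guard in a CI job. This is a minimal sketch that only looks for the two messages quoted in this thread; the function name is hypothetical, and the exact log wording may differ in other balena CLI versions.

```python
# Log messages copied from the debug output above.
COMPOSE_MARKER = "Compose file detected"
DEFAULT_MARKER = "Creating default composition"


def composition_source(log_text):
    """Classify a balena CLI debug log by the composition it reports using."""
    if DEFAULT_MARKER in log_text:
        return "default"
    if COMPOSE_MARKER in log_text:
        return "compose"
    return "unknown"


build_log = """[Debug]   Resolving project...
[Info]    Compose file detected
[Debug]   Creating project..."""

deploy_log = """[Debug]   Loading project...
[Info]    Creating default composition with image: my_app:647be9ff83c5552cd85208ee4bef210097a83fd8_my_app
[Debug]   Creating project..."""

print(composition_source(build_log))   # compose
print(composition_source(deploy_log))  # default
```

A pipeline could fail the deploy step whenever the result is "default", since that means settings from docker-compose.yml (such as ipc: host) will not be part of the release.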

The workaround is to force a rebuild on deploy, but as you can guess it’s not an ideal solution. We are using a CI/CD framework (Jenkins) and I would like to keep the two operations separate. Let me know if this is possible or if it’s a bug that needs to be fixed.

Thanks for your help !
Jim.

Ah ok, that makes a lot of sense. Basically, when you pass it an already-built image, balena deploy doesn’t consider a docker-compose.yml file in the directory and creates a default composition instead. This actually does raise a problem though, because with the current setup there’d be no way of knowing which image is for which service. I believe this needs to be looked into, as it doesn’t quite make sense from a UX perspective.

The only recommendation I could make for now would be to first run balena build to ensure it builds correctly, and then run balena deploy --build (optionally with --nocache if for some reason you wanted the build to be fresh).
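
For a CI job, the recommendation above could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, the flags are the ones mentioned in this thread (-a, --logs, --build, --nocache) and may differ in other balena CLI versions, and nothing is executed unless dry_run is set to False.

```python
import subprocess


def deploy_commands(app_name, nocache=False):
    """Return the two CLI invocations recommended above, as argument lists."""
    build = ["balena", "build", "-a", app_name, "--logs"]
    deploy = ["balena", "deploy", app_name, "--build"]
    if nocache:
        deploy.append("--nocache")
    return [build, deploy]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if the CLI exits non-zero.
            subprocess.run(command, check=True)


# Dry run: prints the two commands without calling the balena CLI.
run_all(deploy_commands("my_app"))
```

Keeping the build step first means a broken build fails early, before a release is created by the deploy step.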

I’ve created this issue to track the problem https://github.com/balena-io/balena-cli/issues/1338 but it’s not clear how it can be solved with the current architecture. We’d probably have to make the balena deploy invocation much more verbose, so I’m not sure when that’d happen.

Ok, I’ll do as you said.
It’s not the most elegant solution, but should do the job for our application.

Thanks again !