Stopped containers keep restarting

When one of my devices boots, a container uses the Supervisor API to stop another container. The container stops as expected but, after a while, it starts again without any instruction. Presumably the Supervisor is restarting it, but I want it to remain stopped:

22.06.21 12:37:54 (-0700) Killing service 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
22.06.21 12:37:55 (-0700)  controller  NoneTypestop - OK200
22.06.21 12:37:57 (-0700) Service exited 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
22.06.21 12:37:57 (-0700) Killed service 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
22.06.21 13:36:11 (-0700) Starting service 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
22.06.21 13:36:15 (-0700) Started service 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
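For context, the stop itself is issued with the Supervisor API from the controller container. This is an illustrative sketch, not my exact code: it assumes the service has the io.balena.features.supervisor-api label so the Supervisor injects the BALENA_SUPERVISOR_* environment variables, and the helper names are my own.

```python
import json
import os
import urllib.request


def build_stop_service_request(address, app_id, api_key, service_name):
    """Build the URL and JSON body for the Supervisor's v2 stop-service endpoint."""
    url = f"{address}/v2/applications/{app_id}/stop-service?apikey={api_key}"
    body = json.dumps({"serviceName": service_name}).encode()
    return url, body


def stop_service(service_name):
    # These env vars are provided by the Supervisor when the service
    # has the io.balena.features.supervisor-api label.
    url, body = build_stop_service_request(
        os.environ["BALENA_SUPERVISOR_ADDRESS"],
        os.environ["BALENA_APP_ID"],
        os.environ["BALENA_SUPERVISOR_API_KEY"],
        service_name,
    )
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__" and "BALENA_SUPERVISOR_ADDRESS" in os.environ:
    stop_service("portainer")
```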

It seems to happen at different intervals:

22.06.21 13:42:57 (-0700)  controller  containerstop - OK200
22.06.21 13:42:57 (-0700)  controller - - [22/Jun/2021 20:42:57] "GET /v1/portainer/stop HTTP/1.1" 200 -
22.06.21 13:43:00 (-0700) Killed service 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
22.06.21 13:43:00 (-0700) Service exited 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
22.06.21 13:52:00 (-0700) Starting service 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
22.06.21 13:52:04 (-0700) Started service 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
22.06.21 13:52:04 (-0700) Restarting service 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'

The container I am stopping has restart: on-failure set in the docker-compose file.
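For reference, the relevant fragment of the compose file looks something like this (the image name is a placeholder, not my exact setup):

```yaml
services:
  portainer:
    image: portainer/portainer-ce  # placeholder image
    restart: on-failure
```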

The device is connected to Balena Cloud.

I can’t work out why it keeps starting it. At least one trigger appears to be related to pushing updates (although perhaps not the cause of the logs above, as they look different). I did a new deploy, with no updates or changes to the stopped container, and the Supervisor appears to treat that as a reason to start the container it was earlier asked to stop. The container is still shown as Created: 2 hours ago, so it doesn’t seem to have rebuilt it, just started it:

22.06.21 13:54:16 (-0700) Killing service 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
22.06.21 13:54:19 (-0700) Service exited 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
22.06.21 13:54:19 (-0700) Killed service 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
22.06.21 14:46:58 (-0700) Downloading delta for image ''
22.06.21 14:46:58 (-0700) Downloading delta for image ''
22.06.21 14:47:21 (-0700) Delta still processing remotely. Will retry...
22.06.21 14:47:21 (-0700) Delta still processing remotely. Will retry...
22.06.21 14:47:22 (-0700) Starting service 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'
22.06.21 14:47:22 (-0700) Downloading delta for image ''
22.06.21 14:47:22 (-0700) Downloading delta for image ''
22.06.21 14:47:26 (-0700) Started service 'portainer sha256:14164aa7aca23f481aade952e011450ee6d5164f0d5f4cbe5b85f72ac970820c'

Perhaps the Supervisor isn’t honouring the restart: on-failure policy the way Docker does?

balenaOS 2.60.1+rev1


A naughty bump to make sure this doesn’t get lost.

I had a look at the docs and they indicate that this stop is only temporary.

Temporarily stops a user application container. A reboot or supervisor restart will cause the container to start again.

Could you explain why you want to permanently stop the container? Maybe there is a better way to reach your goal.


I am running an instance of Portainer but do not need it all the time so I stop it to free resources and to prevent user access.

Why would the container stop only be temporary? I am struggling to see why the behaviour to stop until a reboot or supervisor restart would make sense over observing the restart policy in the docker-compose file.

Are you looking at the single container stop command? Mine is a multi container app so using POST /v2/applications/:appId/stop-service which doesn’t mention anything about restarting after a reboot or supervisor restart.

Mine is a multi container app so using POST /v2/applications/:appId/stop-service which doesn’t mention anything about restarting after a reboot or supervisor restart.

You’re right, the docs should mention this.

Why would the container stop only be temporary? I am struggling to see why the behaviour to stop until a reboot or supervisor restart would make sense over observing the restart policy in the docker-compose file.

Balena doesn’t store a “stopped” state for the container, it just tells the container to stop. But this does not adjust the target supervisor state, so when the supervisor comes back online, it reapplies the target state.

I do not believe the functionality of stopping a container from within another container without adjusting the supervisor state exists inside Balena. I don’t think this is a particularly common use case, so it would probably not be something we would support unless there was no other way to mimic the functionality you want.

I would recommend creating an endpoint to put the container into an idle state and calling that from your other container. If you need more guidance on how to do this, we would be happy to help.

I assume each time you refer to ‘Balena’ you are referring to the Supervisor not the OS or Engine?

I’m not really sure what you mean here, or what ‘idle state’ is.

I’m surprised this isn’t seen as a bug or unexpected behaviour. The whole Balena Engine is built on Docker technology, and restart policies are such a fundamental component that I had expected the behaviour to mimic Docker’s.

The goal is to stop a container while also adjusting the Supervisor state. I think you understood what I meant but you suggested ‘without’ adjusting the Supervisor, so just checking.

Assuming that is correct, I think the uncommon use case is wanting to stop a container and have it start again at will, ignoring any instruction from the docker-compose flags, not the other way around. If there is going to be a /stop API, I would expect it to observe the restart policy passed to it. I can’t think of a use case where I could benefit from the /stop API if I can’t specify when the container starts again (certainly since multi-container support was introduced).

I can see your point that this is counterintuitive compared to the way Docker does things. I can ping the supervisor team and see if they have any thoughts. Generally, though, the supervisor target state comes from the Balena API and isn’t meant to be adjusted by users. Containers aren’t really meant to be stopped unless you’re stopping them for your whole fleet.

Maybe the supervisor team can provide more context though.

Hey, I work on the Supervisor, and this behaviour is known and has been discussed; see Evaluate container's restart policy and exit status before starting · Issue #1668 · balena-os/balena-supervisor · GitHub.

The Supervisor (the on-device agent that coordinates which services are running) listens to what the cloud (API) wants the device to run. Today, the Supervisor starts your services when trying to reach the target state and then never touches them again, as the engine takes over with regard to the restart policy. The Supervisor knows not to do anything if it notices a container stop, because it stores in memory which containers it has started. Restarting the Supervisor clears that memory, so the Supervisor thinks it must match the target state again. Today, there is no way to specify in the target state that a container should not be running.
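The lifecycle described above can be sketched as a toy model. This is purely illustrative Python, not actual Supervisor code:

```python
class SupervisorSketch:
    """Toy model of the start-tracking behaviour described above."""

    def __init__(self, target_services):
        self.target_services = set(target_services)
        self.started = set()  # in-memory only: lost when the Supervisor restarts
        self.running = set()

    def apply_target_state(self):
        # Only start services the Supervisor doesn't remember having started.
        for svc in self.target_services:
            if svc not in self.started:
                self.running.add(svc)
                self.started.add(svc)

    def user_stops(self, svc):
        # The container stops, but the 'started' memory remains,
        # so apply_target_state() leaves it alone.
        self.running.discard(svc)

    def restart_supervisor(self):
        # The in-memory record is lost, so the next target-state
        # application starts the stopped container again.
        self.started.clear()
```

A run through the scenario in this thread: start, stop via the API, then a Supervisor restart brings the container back.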

We are working on fixing this issue and have already discussed solutions. At this time, checking the container's restart policy isn’t the route we think we’ll take, because we’re exploring another approach that will provide more functionality. Check out the GitHub issue for more details.

I’ve also PRed the docs to clarify that the v2 endpoints are temporary: Clarify that /v2/applications/:appId/stop-service is temporary by 20k-ultra · Pull Request #1741 · balena-os/balena-supervisor · GitHub

Hey, I’m following up on this ticket because the latest version of the Supervisor, v13.1.3, has just been released and contains changes to prevent starting a stopped container. You can upgrade to this version using self-service upgrades[1] in the dashboard.

To clarify the issue: the Supervisor would start any stopped containers on start-up because the mechanism for tracking which containers had been started was stored in memory. Therefore, if the Supervisor restarted, it would see a stopped container which it thought it had not started before. We have corrected the behaviour so that once a container is started, the restart policy (which the engine manages) takes over.

If you stop the service in the dashboard, the stop does not persist across reboots unless you set an appropriate restart policy. If you stop your container from the dashboard and do not specify a policy in your docker-compose, then the default restart: always policy is used and the engine will start the container again on boot or if the engine restarts.
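As an illustration (service and image names are placeholders), an explicit policy in the compose file avoids the default:

```yaml
services:
  portainer:
    image: portainer/portainer-ce  # placeholder image
    # Without an explicit policy the default is 'restart: always',
    # so the engine brings the container back on boot. An explicit
    # policy such as on-failure (or "no") avoids that.
    restart: on-failure
```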

See the Docker docs for possible restart policies and how to add them to your release[2][3].

[1] Self-service Supervisor Upgrades - Balena Documentation
[2] Compose file version 2 reference | Docker Documentation
[3] Compose file version 3 reference | Docker Documentation