RAM/Memory Management

TL;DR How do I manage container memory in Resin/Balena?

What sort of memory management features can I tap into for Resin to ensure a container doesn’t eat up the memory and crash a device? I have an application that increases in RAM usage the longer it goes, and I’d like the ability to set a RAM usage cap for the supervisor to kill and restart the container after it gets to a certain point. Fixing the application isn’t an option, as the RAM usage isn’t a leak but rather an expected mode of operating - it loads logs into memory for viewing and clears them on a restart.

I’ve tried the following:

  • Docker Healthcheck feature in the Dockerfile so the container knows when to restart itself, but I’m not sure Balena watches that feature. See an example HEALTHCHECK in the Dockerfile at https://github.com/realeyes-media/mitmproxy-rpi3/blob/master/Dockerfile.
  • I would set the memory maximum at the Docker-compose level, but the application needs to know when to exit before it crashes from lack of memory.

So far the container runs and increases in memory until Balena can’t address memory any more, which crashes Balena and takes the device out of operation for 5 min or so while it recovers. Is there something I can use to signal Balena to restart a container at a certain level of RAM usage?

I’ve tried implementing the healthcheck flag in the service, and it’s getting rejected with a git hook:

[Error] Could not parse compose file
[Error] data/services/mitmweb should NOT have additional properties
[Error] Not deploying release.

Here’s how the compose file looks (simplified to what we’re talking about):

version: '2'
services:
  mitmweb:
    image: quay.io/realeyes/mitmproxy-rpi3
    healthcheck:
      test: ["CMD-SHELL", "if [ $(free -m | grep Mem: | awk '{print $3}') -le ${max_mem_in_kb} ]; then exit 0; else exit 1; fi"]
      start_period: 40s

Hey Marcymarcy,
I’ll take a look into it. One workaround might be to define the HEALTHCHECK in the Dockerfile rather than in the docker-compose.yml.

Okay, so we looked a bit more into this, and it appears that the start_period field is not supported in docker-compose 2.1, as discussed in this issue: https://github.com/docker/compose/issues/5177. I think if you removed that field it would work.

Thanks @shaunmulligan - I actually already have the HEALTHCHECK in the Dockerfile, it just didn’t respect it before so I was trying it in compose now. Here’s the pertinent Dockerfile (from https://github.com/realeyes-media/mitmproxy-rpi3/blob/master/Dockerfile) bits:

FROM resin/raspberrypi3-alpine:3.7
ENV max_mem_in_kb="550000"
HEALTHCHECK --interval=5s --timeout=5s --retries=3 \
    CMD if [ $(free -m | grep Mem: | awk '{print $3}') -le ${max_mem_in_kb} ]; then exit 0; else exit 1; fi

I can also try removing the start_period, but if the above should work then something else is amiss.

Hey @marcymarcy, did you get to the bottom of this? What version of resinOS were you testing against?

Hey @marcymarcy, I gave this a try on the latest resinOS, and I think the reason it wasn’t triggering is the -m option in free, which reports memory in megabytes rather than kilobytes. The used value (say, 600) was therefore always below the 550000 threshold, so the check never failed. I removed the -m and I can see my container being restarted over and over now.
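
For anyone following along, here is a minimal sketch of the kilobyte-based check as a small POSIX shell helper. The function names used_kb and mem_ok are made up for illustration and are not part of the original Dockerfile; the sketch assumes free prints a "Mem:" row whose third column is used memory in kilobytes, which is the case when free is run without -m.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from the original Dockerfile).

# used_kb: read `free` output on stdin and print the "used" column
# (third field) of the "Mem:" row. Plain `free`, without -m, reports
# kilobytes, so the value is in the same unit as max_mem_in_kb.
used_kb() {
    awk '/^Mem:/ { print $3 }'
}

# mem_ok USED_KB MAX_KB: return 0 (healthy) while USED_KB is at or
# below MAX_KB, and 1 (unhealthy) once it goes above.
mem_ok() {
    [ "$1" -le "$2" ]
}

# Intended use as the health check command:
#   mem_ok "$(free | used_kb)" "${max_mem_in_kb}"
```

Keeping the parsing and the comparison separate makes it easy to try the check against a pasted copy of free output before putting it into a HEALTHCHECK line.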

Oh, of course - I’ll give that a shot. Thanks for the catch!

The healthcheck was useful - now our container restarts instead of restarting the supervisor, but is there any way to increase the amount of memory allocated to a given container?

We have an influx container that runs out of memory on a regular basis.

Hey @jgentes,

The amount of resources a container can claim is not constrained by default. There are some options you can set in your docker-compose file, such as mem_limit, which modify this behaviour, e.g. by capping the amount of memory a container can claim. Without such an option, a container will take as much memory as it needs from the device it is running on.
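
As a rough illustration of the mem_limit option mentioned above, a compose fragment might look like the following. The service name, image, and the 512m value are placeholders chosen for the example, not values from this thread, so adjust them to your own application and device.

```yaml
version: '2.1'
services:
  influxdb:                # hypothetical service name
    image: influxdb        # placeholder image
    mem_limit: 512m        # assumed cap; the container cannot claim more than this
```

With a cap in place, a container that outgrows it should be the one that gets stopped, rather than the whole device running out of memory.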

Good to know… maybe by setting the mem_limit we will be able to get more detail about what’s causing the memory problems.