Developing on a Balena device

Hi all,

I’ve created this topic to discuss with the Balena team and community users how everyone is developing on their devices, and whether there are any issues or frustrations.

This topic focuses on HOW you are developing: not issues in your code, but how you’re pushing containers to your device, or working locally on your PC and then creating containers, etc.

I’ve created this topic because in another topic of mine, we were talking about the balena push command. My opinion is that it’s very slow compared to the docker-compose command (talking about multi-container applications here). I have no idea if it’s fast with a single-container application, because I haven’t tried it. But when pushing code to an UP Squared with docker-compose, it works instantly. When using balena push <IP_ADDRESS>, it takes a long time to start building and a long time to start the container. And the UP Squared is a relatively fast board. Without balena push, I’m missing the Balena ENV vars and Balena labels, so communicating with the supervisor isn’t possible, which is a bummer. So I was wondering, why is it so much slower than docker-compose? And is anyone else facing this issue?

For everyone’s information, @CameronDiver has explained this in the other topic:

Hey, I wrote the balena push code so I’ll go through the reasons for this.

And regarding the balena push <IP> , it takes long before the push starts and after building, it takes some time before it really shows some logs. However, when I use docker-compose , it’s all instant.

This is more noticeable with larger code repositories, but essentially it boils down to the way that balena push tars the input to stream to the docker daemon. Because of the handling of things like Dockerfile.templates etc, the pipeline has to read the input at least once, transforming it as it moves through. Add in that the node module we use for this has some inherent problems, and the performance is definitely not as good as raw docker-compose. I’ve been planning two things for this: one is to be able to skip the Dockerfile.template et al. resolution, which will improve things, and another is to start working on a tar module (or changing the current one) to be more performance aware (for example, we don’t need to add files one by one as we do now if we have the size beforehand, which we do). Unfortunately these just haven’t taken as large a priority as other more fun features (--live like @shaunmulligan mentioned :slight_smile: )
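To make the pipeline described above concrete, here is a minimal Python sketch of the general idea: every file in the build context is read once, a Dockerfile.template is resolved on the way through, and each entry is written to a forward-only tar stream with its size known up front. This is only an illustration, not the balena-cli implementation (which is a Node module). The function names and the example variable values are hypothetical.

```python
import io
import tarfile

# Example values only; the real ones depend on the target device type.
TEMPLATE_VARS = {"BALENA_MACHINE_NAME": "raspberrypi3", "BALENA_ARCH": "armv7hf"}


def resolve_template(text: str, variables: dict[str, str]) -> str:
    """Replace %%NAME%% placeholders of the kind used in Dockerfile.template."""
    for name, value in variables.items():
        text = text.replace(f"%%{name}%%", value)
    return text


def build_context_tar(files: dict[str, bytes], variables: dict[str, str]) -> bytes:
    """Pack a build context into a tar archive, resolving Dockerfile.template."""
    buffer = io.BytesIO()
    # Mode "w|" writes a forward-only stream, as an upload to a daemon would.
    with tarfile.open(fileobj=buffer, mode="w|") as archive:
        for name, data in files.items():
            if name == "Dockerfile.template":
                # The transform step: this is why the input has to be read
                # and rewritten instead of being passed through untouched.
                name = "Dockerfile"
                text = resolve_template(data.decode("utf-8"), variables)
                data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            # The size is known before any content is written, so the tar
            # header can be emitted immediately.
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```

Skipping the template step, as mentioned above, would correspond to dropping the `if` branch so that entries pass through unchanged.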

Another reason for this is that the CLI must talk to the supervisor to negotiate the push and update the target state. This is necessary, but it can also be optimized to use fewer calls. Factor in that those calls are often handled by an rpi or similarly low-powered device, and the responses can take some time.

Basically, we have this on our radar, it’s just not a huge priority right now.

Also, stuff like docker-compose down or docker-compose kill aren’t supported (afaik) with balena push <IP> .

Again, these are going to be upcoming, we are making a lot of changes to the CLI right now, and this is definitely earmarked as one of them.

Also, this feedback is fantastic :slight_smile:

I, and probably many more developers, understand that on an RPI3 it’s just slow, regardless of whether you’re using balena push or docker-compose. The RPI3 is just a “slow” device for container building and starting etc.

So, TL;DR how are you developing on Balena? And are there any issues, frustrations or features you’d like?

Thanks!

Bumping this topic.

I’m using balena push to push a multicontainer application locally. But the balena push command is quite slow and doesn’t always work as expected (it randomly starts just 1 container instead of all of them). With docker-compose, however, I can’t find a way to get the Balena labels and the Balena environment variables working. I want to communicate with DBus and the supervisor API in development and use some of the Balena environment variables, but without having to use balena push.

I’m happy to create some kind of workaround, but is it possible to do this? Because working with balena push just takes too long and isn’t stable for me.

I have a single development device on my desk and on the local wifi network. I use an NPM script in my package.json as a shortcut to push to my device IP (npm run push). The app includes an ElectronJS kiosk app that gets precompiled on my primary work machine before pushing over.

Apt-get installs in my Dockerfile were taking about 15 minutes - so I created a custom docker base image using your guide (https://www.balena.io/docs/reference/base-images/custom-docker-base-images) that included all of the packages I needed. This cut builds down from about 18 minutes to maybe 3 minutes.

I just got the new CLI and am testing Livepush. Been waiting for this feature for a long time!

@bversluijs are you restarting your live push on every change? I have noticed that on start-up initial resolution takes a bit, but afterwards subsequent pushes should be much faster. Also, what version of the CLI are you using? Since local mode and live push are very much under active development, we recommend running the latest version (at the time of writing, that is v11.7.9). Note that there are some subtle differences when in local mode, but you can define environment variables in your docker-compose.yml (you can read more about some other local mode caveats here: https://www.balena.io/docs/learn/develop/local-mode/#caveats).

If something still isn’t working as expected in live push, a minimal reproduction would be very helpful!

Hi @xginn8,

I’m not restarting it at all, because live push seemed to do the job. The first time, balena push did work. I hadn’t used it for a long time, and came back to it because I need DBus to work for the NetworkManager. Before that, I always used docker-compose, because it’s much faster. But with docker-compose, the labels and environment variables don’t seem to work (for dbus, networkmanager, env variable for the UUID etc).

I’ve upgraded the balena-cli to 11.7.9 and balenaOS to 2.39.0 on an UP Squared. I can’t really create a repository to reproduce it I think, because it’s just a multicontainer setup without any weird configurations.

I started balena push yesterday, and after a few hours, I went somewhere else, and when I came back, it didn’t work anymore. At first I got the error Maximum Stacktrace. After a reboot of the UP Squared and stopping all containers, I tried balena push again. Sometimes the database started, sometimes the node container. But not all containers.

The best solution for me is to get the labels and env variables (dbus, supervisor api etc) working with docker-compose. That’d be awesome, because with docker-compose I have full control for developing multicontainer applications.

Hey @bversluijs

There are a couple of things here that I think are worth digging into. Firstly, what exactly do you mean when you say the labels and environment variables don’t work for you? I use them quite a lot both in local mode and cloud and I’ve never had a problem.

The maximum stack trace error was indeed a bug in the supervisor, this has been fixed as of version 10.1.5 of balena-supervisor, which should be released as part of a new balenaOS soon (PM me if you would like me to update your supervisor in-place, although we try not to move between major versions without upgrading balenaOS).

You also mention that often balena push won’t start containers. What I think is going on here is that if an image has not changed and the container created from that image is still running, the supervisor will not restart the container, and will instead leave it running. This could look like balena push is not starting the service, so it’s worth checking whether the container is already running next time this occurs. If you did make a change to a service and it was not updated, that is most certainly a bug and having a reproduction would be great. That being said, I’ve never seen it, so I would check the above first.
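One way to do that check from the development machine is to ask the supervisor which services it considers running. The sketch below is a rough illustration, not an official tool. It assumes the supervisor API of a local-mode device is reachable on port 48484 without an API key, and that the /v2/state/status response contains a "containers" list whose entries carry "serviceName" and "status" fields. The function names and the service names in the usage note are hypothetical.

```python
import json
import urllib.request

# Assumption: the supervisor API of a local-mode device listens on this port.
SUPERVISOR_PORT = 48484


def fetch_status(device_ip: str) -> dict:
    """Fetch the supervisor's view of the current state from the device."""
    url = f"http://{device_ip}:{SUPERVISOR_PORT}/v2/state/status"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def services_not_running(status: dict, expected: list[str]) -> list[str]:
    """Return the expected service names that are not reported as running."""
    running = {
        container.get("serviceName")
        for container in status.get("containers", [])
        if container.get("status") == "Running"
    }
    return [name for name in expected if name not in running]
```

For example, `services_not_running(fetch_status("<IP>"), ["db", "node"])` would list whichever of those two services the supervisor does not report as running after a push.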

Please let me know if you’d like me to go into any further detail here, and thanks for starting this thread! I have a feeling it will help both users and the balena team :slight_smile:

Hi @CameronDiver,

Thanks for answering. I think it’s also worth digging into these issues.

First, let me explain the environment variables and labels. They work correctly if I use balena push. But most of the time, I use docker-compose to build & start the containers. Then it looks like the labels and environment variables have no effect in the container. For example, I can’t connect to the dbus and can’t get the BALENA_DEVICE_UUID environment variable when I use docker-compose. This works as expected with balena push. But because docker-compose is much faster (for me), I prefer to use that. And I’d really like to keep using this command, but with support for the environment variables and labels working :slight_smile:

About the supervisor, I’ll PM you for that, because I want it to work on my development device so I can continue developing without this bug. I don’t care about the production devices, and of course they don’t run into this issue.

About the containers not starting. I stopped all containers on my Balena device, because I wanted a “fresh” push. I thought the maximum stacktrace error had something to do with this. So I SSH’ed into the balena device, and stopped all containers using balena stop. Then I did balena push <ipAddress> and everything was building, but only 1 container started. I checked it on the Balena device, and with balena ps, I only saw the supervisor and 1 container.

Ah, I misunderstood your use case for the labels and environment variables - yeah this isn’t supported, and I’m not sure how we could get it to be unfortunately. If the supervisor doesn’t start the containers or manage them in any way, there’s no way that we could add the features and environment variables that these labels control. Currently, and for the foreseeable future I think, balena push would be the only way to do that.
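Since containers started directly with docker-compose never receive these values, an application that has to run both ways can at least detect the situation at start-up. The following is a minimal sketch under stated assumptions. The variable names (BALENA_DEVICE_UUID, BALENA_SUPERVISOR_ADDRESS, BALENA_SUPERVISOR_API_KEY, DBUS_SYSTEM_BUS_ADDRESS) come from balena's documentation rather than from this thread, and the helper name and labels are hypothetical.

```python
import os
from typing import Mapping, Optional

# Assumed variable names, taken from balena's documentation. The first three
# are provided by the supervisor; DBUS_SYSTEM_BUS_ADDRESS is normally set by
# the developer alongside the dbus feature label.
REQUIRED = {
    "device uuid": "BALENA_DEVICE_UUID",
    "supervisor address": "BALENA_SUPERVISOR_ADDRESS",
    "supervisor api key": "BALENA_SUPERVISOR_API_KEY",
    "dbus socket": "DBUS_SYSTEM_BUS_ADDRESS",
}


def missing_supervisor_values(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """List which supervisor-related values are absent or empty."""
    env = os.environ if env is None else env
    return [label for label, name in REQUIRED.items() if not env.get(name)]
```

Calling `missing_supervisor_values()` early in the application lets it log a clear message, or switch to stubbed behaviour, when it was started outside the supervisor instead of failing later on a missing variable.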

What you describe regarding balena push definitely sounds like a bug, do you have any tips on how I could reproduce it? I do all my supervisor development testing using balena push, so I have a feeling there’s some edge case in the input that can cause this. Also if you have any logs from when this occurs (journalctl -f -u resin-supervisor from the host OS) that would really help.

That’s too bad. I’d love to work with balena push, but it’s much slower for me and I don’t have full control. With docker-compose I can stop the containers or even kill them, which I do from time to time, just to manually “force” stop/kill the containers. But the real issue is that it’s slower, because it takes more time to develop, and I’d rather use my time for development than for waiting :slight_smile:

I’ll keep you updated about the balena push issues. I think I’ll post my findings tomorrow, because then I’ll continue working with the device.

If you have some information about why the balena push command is slower than docker-compose, I’m happy to learn from it. Because, as you now know, this is the main reason why I don’t like to work with that command (However, I must say with live pushing, it’s a really awesome command).

Mainly, in addition to the reasons noted at the top of the thread, the CLI itself can be slow to start up and begin the jobs required, but this is also something we’re working on.

Hi @CameronDiver,

I’ve manually stopped all containers, because the containers weren’t synced with balena push. Then I executed balena push again, and none of the containers were started. So the good news is that it’s reproducible: manually stop the containers on the device with balena stop, and then start them again with balena push from the PC you’re developing on.

These are the logs after executing journalctl -f -u resin-supervisor:

Aug 13 08:27:43 c8e3d19 resin-supervisor[14616]: [api]     GET /v2/version 200 - 5.265 ms
Aug 13 08:27:55 c8e3d19 resin-supervisor[14616]: [api]     GET /v2/local/device-info 200 - 10.040 ms
Aug 13 08:28:07 c8e3d19 resin-supervisor[14616]: [api]     GET /v2/local/target-state 200 - 41.084 ms
Aug 13 08:28:07 c8e3d19 resin-supervisor[14616]: [api]     POST /v2/local/target-state 200 - 20.849 ms
Aug 13 08:28:07 c8e3d19 resin-supervisor[14616]: [info]    Applying target state
Aug 13 08:28:07 c8e3d19 resin-supervisor[14616]: [api]     GET /v2/local/device-info 200 - 49.533 ms
Aug 13 08:28:07 c8e3d19 resin-supervisor[14616]: [debug]   Finished applying target state
Aug 13 08:28:07 c8e3d19 resin-supervisor[14616]: [success] Device state apply success
Aug 13 08:28:07 c8e3d19 resin-supervisor[14616]: [api]     GET /v2/state/status 200 - 275.609 ms
Aug 13 08:28:08 c8e3d19 resin-supervisor[14616]: [api]     GET /v2/state/status 200 - 191.301 ms
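When comparing several pushes, it can help to pull the supervisor API calls and their durations out of a log like this one. Below is a minimal sketch that assumes the line format shown above ("[api] METHOD path status - N ms"). The function name is hypothetical.

```python
import re

# Matches the "[api]" lines of the supervisor log format shown above.
API_LINE = re.compile(
    r"\[api\]\s+(?P<method>[A-Z]+)\s+(?P<path>\S+)\s+"
    r"(?P<status>\d{3})\s+-\s+(?P<ms>[\d.]+)\s+ms"
)


def parse_api_calls(lines):
    """Return (method, path, status, milliseconds) for every API log line."""
    calls = []
    for line in lines:
        match = API_LINE.search(line)
        if match:
            calls.append(
                (
                    match["method"],
                    match["path"],
                    int(match["status"]),
                    float(match["ms"]),
                )
            )
    return calls
```

Lines without an "[api]" tag, such as "Applying target state", are skipped, so the output of `journalctl -u resin-supervisor` can be fed in as-is.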

@bversluijs
Thanks very much for the reproduction. I’ve made an issue here to track it: https://github.com/balena-io/balena-supervisor/issues/1059

I’ve got a couple of pressing items on my plate at the moment, but I’ll look into this as soon as I can.

Coming back to this. I’ve been using balena push for a while with the updated supervisor, and it has been much faster since. I still miss a way to really force all containers to stop, but the live push function is great!

@bversluijs Thanks for the feedback :slight_smile: We’ll keep you posted on the issue progress.