Deploy new release from inside the BalenaOS device, independently from BalenaCloud

Hey everyone,

I am trying to deploy a new release from inside a balenaOS device. The idea is that the Edge component can keep running independently, even in the case of a cloud failure.

What I want is for the device to be able to add a new service to its compose file and push the new release onto itself.

My current solution is to run a dockerized balena-cli as a service and use the balena local push command, while at the same time SSHing into the host OS and loading the new service from a tar file via the balena-engine load command. Ideally I would like to use the supervisor/balena-engine API for this instead of adding the entire balena-cli suite.

Any ideas, suggestions?

Will post a tutorial and will open source the project as soon as I finish, it’s a bit awesome :stuck_out_tongue:

Cheers!

Another option that I theorise might work is the following:

  1. SSH into the host OS, or use the balena-engine socket from inside the “orchestrator service”
  2. Manually load each service's container image from its tar file and save each image ID
  3. Use the supervisor API and its POST /v2/local/target-state endpoint

It seems much more complex, but also more elegant. What do you think?
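A rough sketch of step 2 through the engine socket, using only the Python standard library. It assumes the orchestrator service has the engine socket mounted and that DOCKER_HOST points at it; the fallback path /var/run/balena.sock and the helper names are assumptions of mine, not something confirmed in this thread. It relies on the Docker-compatible POST /images/load endpoint.

```python
import http.client
import json
import os
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=60):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def socket_path_from_env(environ):
    # Assumption: DOCKER_HOST looks like unix:///path/to/engine.sock
    host = environ.get("DOCKER_HOST", "unix:///var/run/balena.sock")
    prefix = "unix://"
    return host[len(prefix):] if host.startswith(prefix) else host


def parse_load_output(body):
    """Collect the 'Loaded image...' references from the JSON-lines reply."""
    loaded = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        message = json.loads(line).get("stream", "")
        if message.startswith("Loaded image"):
            loaded.append(message.split(":", 1)[1].strip())
    return loaded


def load_image(tar_path, environ=None):
    """Send a saved image tarball to the engine and return what it loaded."""
    environ = os.environ if environ is None else environ
    conn = UnixHTTPConnection(socket_path_from_env(environ))
    try:
        with open(tar_path, "rb") as tar:
            conn.request(
                "POST",
                "/images/load",
                body=tar,
                headers={
                    "Content-Type": "application/x-tar",
                    "Content-Length": str(os.path.getsize(tar_path)),
                },
            )
            response = conn.getresponse()
            body = response.read().decode()
    finally:
        conn.close()
    if response.status != 200:
        raise RuntimeError("image load failed: {} {}".format(response.status, body))
    return parse_load_output(body)
```

Calling load_image("/data/one.tar") (a hypothetical path) would return the image names or IDs the engine reports, which can then be used as the image values in the target state.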

For the supervisor API I would have to construct something like this, though I am not sure about many of the fields.

{
    "local": {
        "apps": {
            "1": {
                "name": "localapp",
                "commit": "localcommit",
                "releaseId": "1",
                "services": {
                    "1": {
                        "environment": {},
                        "labels": {},
                        "imageId": 1,
                        "serviceName": "one",
                        "serviceId": 1,
                        "image": "local_image_one:latest",
                        "running": true
                    },
                    "2": {
                        "environment": {},
                        "labels": {},
                        "network_mode": "container:one",
                        "imageId": 2,
                        "serviceName": "two",
                        "serviceId": 2,
                        "image": "local_image_two:latest",
                        "running": true
                    }
                },
                "volumes": {},
                "networks": {}
            }
        }
    },
    "dependent": {
        "apps": [],
        "devices": []
    }
}
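
A minimal sketch of how the orchestrator could build a state like this and prepare the POST to /v2/local/target-state. It assumes the apps block sits under a top-level "local" key, and that the service has access to the supervisor API so that BALENA_SUPERVISOR_ADDRESS and BALENA_SUPERVISOR_API_KEY are set. The function names are hypothetical, and the set of fields the supervisor actually requires is not confirmed here.

```python
import json
import urllib.request


def build_target_state(services, app_id="1", app_name="localapp"):
    """Build a local-mode target state from (service_name, image) pairs."""
    service_map = {}
    for index, (name, image) in enumerate(services, start=1):
        service_map[str(index)] = {
            "environment": {},
            "labels": {},
            "imageId": index,
            "serviceName": name,
            "serviceId": index,
            "image": image,
            "running": True,
        }
    return {
        "local": {
            "apps": {
                app_id: {
                    "name": app_name,
                    "commit": "localcommit",
                    "releaseId": "1",
                    "services": service_map,
                    "volumes": {},
                    "networks": {},
                }
            }
        },
        "dependent": {"apps": [], "devices": []},
    }


def target_state_request(state, environ):
    """Prepare (but do not send) the POST that sets the target state."""
    url = "{}/v2/local/target-state?apikey={}".format(
        environ["BALENA_SUPERVISOR_ADDRESS"],
        environ["BALENA_SUPERVISOR_API_KEY"],
    )
    return urllib.request.Request(
        url,
        data=json.dumps(state).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To apply it, pass os.environ to target_state_request and hand the result to urllib.request.urlopen.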

Hi there!
So what you described first sounds like a workable solution for now, but we would need to know more of the background to answer properly: why are you trying to make the device push a release onto itself?
Generally speaking we have two “modes”: offline or online. This is because balenaCloud is unaware of changes made on the device, so if you diverge from the cloud, it should overwrite your local changes once you come back, unless those were pushed to the cloud of course, which defeats the purpose of this.
If you can explain in a bit more depth what you are doing and why, maybe we can find/develop a proper solution for this situation.

The idea is that I want a device to be able to add a service in an automated way, without the need for central control. In essence, an orchestrator service will be able to provision the device and add/remove services at will (or on specific events).

A small piece of feedback: I would suggest adding a more complete example for setting/getting the target state. I can provide my own JSON file if you want. That way the reader gets a better idea of what exists in the state data structure.

P.S. Huge congrats on the latest round, you guys rock!

Hi,

Would it be sufficient for your use case to activate/deactivate services? I could imagine you having a controller service that tells other services whether they should run or stop. This would not require a new deploy of the whole application. You could, for example, set a device service environment variable called ACTIVE to true when you want the service to run. Your service would just need to check whether there is an env variable called ACTIVE that equals true, and stop/sleep etc. if this is not the case.
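
A minimal sketch of that check inside a service. The variable name ACTIVE comes from the suggestion above; the helper names are hypothetical. It assumes the service is restarted whenever its variables change, so the flag only needs to be read at start-up.

```python
import os
import time


def is_active(environ):
    """Return True only when ACTIVE is set to 'true' (case-insensitive)."""
    return environ.get("ACTIVE", "").strip().lower() == "true"


def main(do_work, environ=None, idle_seconds=60):
    environ = os.environ if environ is None else environ
    if not is_active(environ):
        # Idle instead of exiting, so the container is not restarted in a loop.
        while True:
            time.sleep(idle_seconds)
    do_work()
```

The service entrypoint would then call main(run_my_service), where run_my_service is whatever the service normally does.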

Best regards,

Hey,

Thank you for the thorough response. Your suggestion is half the answer, as I want to provision services ad hoc and then perform the activities you describe. By provisioning I mean adding/removing them on the fly, without the device knowing a priori which services will be provisioned.

BTW, I will surely post a tutorial regarding my use-case and the service I am building, Balena is truly a robust platform.

Hi, indeed, we do not currently support adding/removing services on the fly like that. This is mostly because our model is to push applications through our builders, store them in a Docker registry, and have the OS/supervisor download the images for the corresponding services.

However, we are interested in learning more about your use case and your approach.

Thanks,
Zahari

Hey,

I understood that it was possible to add services in local mode using the set state endpoint, am I correct? You refer to production images I suppose.

An interesting approach would be for the orchestrator service to monitor the whole operation and use the balena-cli or a balena endpoint to push new updates as normal, but fall back to local mode when the cloud is unreachable, so that it keeps working as intended. On this subject, is it possible to push a new compose using some API endpoint instead of the balena-cli? If I recall correctly from some research, balena-cli can be quite a heavy image when dockerized.

The only issue is the persistence of data, but I am sure we can tackle that as well, maybe by using the hostOS persistent storage read/write folders as “bridges” to persist important data needed to continue the operation.

What do you think?

Thanks as always for your time!

Cheers,
Odysseas

Hey @odys,

This is an interesting idea, but I think we might be able to come up with a simpler suggestion if we fully understand the use case. Our model is one of updates, so a new release occurs when there are version updates to a particular service.

I think the critical point here is ‘the cloud is unreachable’. Do you mean our balenaCloud backend, or the internet connection? Our cloud service is fairly unlikely to be unreachable as long as the device has an internet connection. Similarly, we have many customers who have secondary interfaces (such as GSM modems) to ensure that, should the primary interface fail (e.g. an ethernet connection), a secondary can work in its place. One of the issues I see here is that even if you pushed a new release from the device, effectively to itself via local mode, any dependencies that need to be fetched while building that service will make the build fail anyway without an internet connection. Finally, should the internet connection fail, any services running on the device will continue to run regardless (although if those services require an internet connection then obviously they will fail).

If you have a set of pre-defined services, are we actually discussing customising the services that run on the device at any one time? We actually have a Supervisor endpoint that allows you to stop/start individual services, so you could load all of the services you need into a single application, and depending on net connection enable/disable them via the ‘orchestrator’ at runtime.
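
A rough sketch of what the ‘orchestrator’ side of this could look like, assuming the v2 start-service/stop-service Supervisor endpoints and the usual BALENA_SUPERVISOR_ADDRESS, BALENA_SUPERVISOR_API_KEY and BALENA_APP_ID variables are available to the service. The function name is hypothetical and the request is only prepared here, not sent.

```python
import json
import urllib.request


def service_request(action, service_name, environ):
    """Prepare a POST that starts, stops or restarts one service by name."""
    if action not in ("start", "stop", "restart"):
        raise ValueError("unsupported action: {}".format(action))
    url = "{}/v2/applications/{}/{}-service?apikey={}".format(
        environ["BALENA_SUPERVISOR_ADDRESS"],
        environ["BALENA_APP_ID"],
        action,
        environ["BALENA_SUPERVISOR_API_KEY"],
    )
    return urllib.request.Request(
        url,
        data=json.dumps({"serviceName": service_name}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

For example, urllib.request.urlopen(service_request("stop", "two", os.environ)) would ask the Supervisor to stop the service named two.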

Unfortunately, what you’ve suggested, switching between balenaCloud and local mode, isn’t particularly simple. The API service informs the Supervisor whether it should operate in local mode or not, and this requires our cloud backend to inform the Supervisor when this occurs. It’s currently not possible for services from balenaCloud to share persistent data with those in local mode. This causes issues with using the same data between these configurations, as your services won’t be able to access the data stored in the other mode (and we don’t currently allow the volume binding of arbitrary directories into services).

Would it be possible to give us some step-by-step examples of the type of use case you are thinking of? I think that will help us a great deal in offering you what may be a simpler solution.

Thanks and best regards, Heds

Hi @hedss,

Firstly, I would like to thank you personally and the whole balena team for your time and thorough responses on what is more of a brainstorming issue than real customer assistance. I appreciate it deeply.

Regarding the use case, yes, I have come to the conclusion that maybe balenaOS without balenaCloud would be a more appropriate place to start building my solution. Nevertheless, for now, as I want to make a POC for my Master’s thesis, monkey-patching the balena suite is more than enough. To that end, simply having local mode and using the set-state supervisor API will suffice.

The use case, I think, is that a device will be given (abstract for now) a container to run in order to offer a service of sorts. The idea is that the binary will already be built and the device will simply mount it and run it. It’s an important point (I think) that this activity can be conducted without the assistance of a centralised authority. It is also possible that the image will be procured at time A, when internet is available, but used at time B, when internet is not available.

BTW, although I have known the platform for several years (3-4) due to originating from the local ecosystem, this is the first time that I have truly got deep into it, and the more I explore, the more I LOVE it.

Cheers!

Hey @odys, thanks for your thorough feedback. Glad that it sounds like for the time being you are set! Yeah, usually it works better if there’s some non-abstract use case, so we can understand it better.

Some more questions from our side:

That’s not something that we see in our usual use cases on the platform. Most of the time the users want that central authority (the single “point of truth”, that is the API here) to set state. Otherwise, devices can be locally subverted without the user knowing “upstream” either. It basically closes the loop on the cycle represented here: https://www.balena.io/docs/learn/welcome/primer/#code-deployment (user pushing code to the cloud, and receiving in return logs from the device).

Wouldn’t this be the current situation? When there’s internet, the device will get the image(s) to run, and the internal logic inside there can modify how things work when there’s no internet (while no new download is necessary).

The mixed internet / no-internet mode is still an interesting one, and if you have any further thoughts, we are always keen to brainstorm more ideas. We try to build the platform to the real-world use we observe (hear about:), and the more we hear, the better choices we can make in developing the platform.

Glad you are enjoying building with balena, really looking forward to what you are making! If you are making any patches to any components in a way that you feel would be useful for others too, feel free to send pull requests to us on those components. We can’t guarantee what happens, but there is a real chance to improve the platform for everyone.
And best of luck for your Master’s!! :mortar_board:

I think that it’s a vertical that is indeed somewhat counterintuitive from the BalenaCloud perspective. I would say it is more about using BalenaOS and a customised version of the Supervisor, with the added benefit of being compatible with BalenaCloud. I would love to talk more about it if you are interested, even in a Skype call (much more time efficient than forums).

Thank you guys again for everything, you are truly rockstars!

Hi, the forums are actually the best place to discuss, as they allow all of us (as opposed to only a few within a Skype call) to reach back to you for more details, given we also work in different timezones! Feel free to share any more information you have on the subject to help us understand better and, as previously mentioned by my colleague, feel free to also send pull requests to those components you think others could benefit from!

Hey everyone in this thread,

I have a small question. I am trying to find the LAN landscape from inside a container. The thing is that my container only sees the other containers, network wise.

The monkey-patch I have thought of is to ssh into the host and run the command there (get the IP from the MAC address of a computer in the same network). Another solution could be using dbus, but I haven’t really understood how to use it. Could you point me towards the most efficient (and lightweight) solution?

Thanks as always for your time!

hey @odys you can run your container in host network mode https://docs.docker.com/network/host/

Hey @robertgzr,

Thank you for your reply! Network mode used to be bugged and I haven’t used it for many months.

Will try your solution tomorrow. For the time being, this is my solution for anyone interested:

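# Resolve lorank8.local on the host OS over SSH (port 22222) and keep the third field of the line containing "192"
LORANK_IP=$(ssh -oStrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null root@172.18.0.1 -p22222 "nslookup lorank8.local" | grep "192" | awk '{print $3}')
export LORANK_IP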