Replicating a device from an image already attached to a fleet

If I have a device attached to a fleet, and then I make some modifications to that particular device, can I then create an image from the device's SD card, flash that image to another SD card, and then have both devices run on the fleet simultaneously without conflicts?

The scenario/problem is trying to find a way to allow users to make slight changes to their own device through a UI and then be able to replicate those devices multiple times with their changes included, while still benefiting from the fleet updates etc. All the user modifications are stored in device volumes so no issues of persistence, just a question of whether you can join multiple devices to a fleet that way. My first thoughts would be UUID clashes? I don’t have two of the same hardware kicking around to be able to test it right now.

If anyone has other ideas on a workflow for achieving this, would be interested to hear those ideas too.

@maggie0002, off the top of my head, I think you could achieve this by having multiple devices on a fleet running separate releases. One user flashes a device, makes changes, pushes the changes to a release and keeps using that release while other devices run different releases. It’s a bit messy, but I think it fits your scenario. Let me know what you think about this, and then I can get back to you with some other options/solutions.

The goal is for users to be able to do this by themselves without access to the fleet dashboard, or relying on me to change things. I would want all the devices to receive the latest updates and run at the latest version, but just be able to replicate devices by cloning a card, without clashes in UUID or other Balena fleet related identifiers.

Hi @maggie0002, would these changes be made to the services running on the device or to the dashboard itself? For example, let’s pretend you have a fleet of devices running Inkyshot. Would you want users to be able to change the quote which their devices show? This kind of interaction could be handled through an interface which is not the Balena dashboard. However, if you would like users to be able to change Dockerfiles or otherwise change the software running on the device, then that might be a bit more complicated.

I’m not quite sure on the scenario you outlined. Simplest way I can think to put it is users would change content in volumes, and the device with that volume content would then need to be replicated in as simple a way as possible.

Hey @maggie0002, I’ll try to add some thoughts here as I think it depends on the type of change. Although you’ve stated that it would be content in volumes, I’ll try to cover multiple eventualities.

One workflow I’ve been thinking about quite a lot is one where you have a fleet and you flip one of the devices into development mode. You can use an IDE running on that device to develop the application and make changes. When you’re happy with them, you do a balena push to create another release on that fleet, which rolls out the changes you just made to all of the other devices. You then flip that device back into normal mode and it receives the update as well. The updated release can of course include files which are then copied into user volumes.
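To make the "files copied into user volumes" part concrete, here is a minimal sketch of a start-up step a service could run. It assumes the release ships its default content in a directory inside the image and that the volume is mounted somewhere in the container; the function name and both paths are hypothetical, not anything balena provides.

```python
import shutil
from pathlib import Path


def seed_volume(defaults_dir: Path, volume_dir: Path) -> list[Path]:
    """Copy files from defaults_dir into volume_dir, skipping existing ones.

    Returns the destination paths that were newly created.
    """
    copied = []
    for src in sorted(defaults_dir.rglob("*")):
        if not src.is_file():
            continue
        dest = volume_dir / src.relative_to(defaults_dir)
        if dest.exists():
            # Never overwrite content the user has already changed.
            continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied.append(dest)
    return copied
```

A service would call something like seed_volume(Path("/usr/src/app/defaults"), Path("/data")) before starting its main process (both paths are placeholders). Because existing files are skipped, a later release can add new defaults without clobbering what a user uploaded.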

Another workflow is where you could fork a fleet. Working with the example that a device is a member of an open fleet, they could fork that fleet, which essentially comes down to automatically creating a new fleet, pushing a release to that fleet, and then moving the device over. This would then allow any changes to be made in a similar way as above, but in a new fleet rather than applied to the original. In the same way as above, the release could contain the files which are then copied to the user directory.

A third, more direct approach could be using something like IPFS, or min.io, or some other distributed filesystem to synchronise files between devices in real time. That way you could continue to provision devices with the standard, ‘empty’, image, but as soon as they provision and join the fleet, they would receive the latest copy of the files you’re talking about.

I think in summary the difference is that if you want changes to be replicated across devices within a fleet as part of their day to day running, use some means of file distribution from device to device. If you want the changes to be replicated across the fleet as part of the software that’s deployed to them, and as such receive benefits such as the ability to deploy new releases, roll back, etc., then it’s better to use releases. It depends if it’s a continuous sync and deploy or an iterative thing, I guess.

Either way, pulling SD cards and cloning them is not something we ever really consider, because it requires physical access and the card is kind of the heart of the device. As you suspected, it would cause issues with UUID clashes, because once a device has been provisioned its image is no longer generic: it contains the UUID as well as a device-specific API key. It holds the identity of that device.

If you can share more information about the use case you have in mind I may be able to be more specific with my suggestions :)

Thanks for outlining some of these workflows.

What I am considering unfortunately has a number of limitations. I would not be able to access the devices, they are offline devices most of the time, and often work independently so the idea of transferring content from online or via a nearby device isn’t an option. It would also need to be done by someone who doesn’t have access to the dashboard.

Thanks for clarifying that the cloning wouldn’t work. I think the next logical question is whether there is a way to alter the UUID on a cloned device? Maybe through the config file? It would allow me to clone a card, then alter the file so they all join as new devices on each fleet, but share the same volume content.

I think it should work if you clone the card and then replace the config.json with an unprovisioned one. If you download the configuration from the add device modal instead of clicking flash or downloading the OS image like normal, you can use it to replace the device specific one from the cloned SD card. Whenever that new card is then powered on it should provision a new device into the fleet as if it was a fresh image.
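As a rough sketch of that swap, assuming the cloned card's boot partition is mounted on your machine and config.json sits at its root. The list of identity keys is my assumption about what a provisioned config.json contains, so compare it against a real one first. The function names and any paths you pass in are hypothetical.

```python
import json
import shutil
from pathlib import Path

# Assumption: keys a provisioned device is expected to have in config.json.
# Verify against a real provisioned config before relying on this list.
IDENTITY_KEYS = ("uuid", "deviceApiKey", "deviceId", "registered_at")


def identity_keys_present(config_path: Path) -> list[str]:
    """Return the identity-related keys found in the given config.json."""
    config = json.loads(config_path.read_text())
    return [key for key in IDENTITY_KEYS if key in config]


def replace_config(fresh_config: Path, boot_mount: Path) -> Path:
    """Overwrite config.json on a mounted boot partition with a fresh one.

    Refuses to proceed if the replacement itself looks provisioned, since
    that would just copy another device's identity onto the card.
    """
    found = identity_keys_present(fresh_config)
    if found:
        raise ValueError(f"{fresh_config} already looks provisioned: {found}")
    target = boot_mount / "config.json"
    shutil.copy2(fresh_config, target)
    return target
```

You would run replace_config with the file downloaded from the add device modal and the mount point of the cloned card's boot partition. The check up front is only a guard against swapping in the wrong file. It does not touch the data partition, so the volume content on the clone stays as it was.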

That’s interesting, particularly the comment on unprovisioned devices. What if I were to preload an image, connect to the device and populate some content in the volumes, but not let the device connect to the internet, and then clone the card? Would that allow both to connect to the fleet without a clash, as the original hadn’t yet connected to the cloud and fetched its own UUID?

I wonder if the data volume has been initialised at that point, but if it has, I think that would work as expected. At that point though, how would that be different to including the ‘content’ in the release so it’s added as part of the preload?

I don’t think a preload could modify volumes without some trickery? I am aiming for user friendly. The devices already work offline: a user can download the preloaded image and flash it to a device, then use a web UI to upload content which is stored in a volume. The goal is to find a way to then replicate that state.