TPM / full disk encryption (x86_64 generic)

Looking for a method to store our privacy-sensitive data, I came across the idea of using a TMP in combination with full disk encryption. Whether it is private keys or user data, as long as the device is turned off the data would be secure if we can use this method to encrypt the disk.

There has been some activity on this topic, both on the forum and at a release party in November 2021, but I could not find any progress on it since. I could find some of the PRs that were merged in relation to this topic, but I can only imagine that the feature is not finished yet, since I could not find documentation on how to approach this.

Is there any reason that the feature stalled in development or is there documentation somewhere on how to use it where I forgot to look? It would be an awesome feature to have for Balena since it will make it that much easier to use in environments that require a higher level of security.

Hi @cees.koolen,

I will let one of my colleagues who is more in the loop on full disk encryption provide a fuller answer. The short version, I believe, is that it is on the roadmap but there are a number of steps to go before it can be implemented.

I have, however, been looking at a container-based option that may be of interest, and perhaps plausible in the short(er) term. I would be interested to hear more about your use case. What sort of data are you storing: databases or static data? Is it in volumes, environment variables or in pushed containers? When you say “using TMP in combination with full disk encryption”, what do you mean exactly? Using the /tmp directory and an encrypted volume or mount?

Hi @maggie0002

Thanks for your follow-up questions. I’m sorry but I mis-typed an acronym in the title. It’s about the TPM (trusted platform module) that is used in combination with full disk encryption.

For that feature, there are mainly 2 things that we need to secure:

  • the sensitive application data (mainly pictures)
  • the private keys used by our services

So we could use an encrypted volume or mount, but that just shifts the problem to where we store that particular encryption key. That is where secure boot and the TPM come into play.

Thanks for the feedback. TPM makes more sense.

In terms of the mounted/encrypted volume, the idea would be that the key is served from one device on the network (or online if secured) to all the others (a really simplified explanation). That would mean securing only the one device, and the encrypted content in the volumes/containers specified on your other devices would only be decrypted when they boot on your network. This isn’t, of course, an alternative we are proposing to the ongoing work towards full disk encryption, just a side project, although it does avoid having to encrypt an entire filesystem, which can have a performance impact on very low-powered devices. I would be interested to hear if there would be any use case for something like that in your environment.

The idea is interesting, though somehow we need to verify the identity of the device that is requesting the key from that central secret repository. I would assume that one could use an environment variable for that, but that is effectively leaking it to the device.

Hello @cees.koolen. The secure boot implementation we are working on requires TPM 2.0 compatible hardware. We have also implemented LUKS-based disk encryption using the TPM chipset. Another thing that might be possible is YubiKey-based disk encryption (see for example Yubikey based Full Disk Encryption (FDE) on NixOS - NixOS Wiki), but we do not support it on balenaOS.

What hardware are you using?

We are evaluating an Intel-based board with a TPM 2.0 embedded on the board, so to me that sounds like a good fit. The Yubikey-based FDE also sounds interesting, but that would probably require us to mount an encrypted partition in a container ourselves. The drawback of that solution is still that it does not really protect against basic physical threats, since you only need access to the Yubikey itself.

@cees.koolen, mTLS is the approach used for communication between the key server and the device. Notably, though, it would place the security into the hands of the user who manages the network, i.e. there would be ways around it if you had no WiFi password on your network or anyone could access an ethernet port and jump on your network (steal a device, extract the mTLS certificate, connect back to the network the device was on, request the key, go away and decrypt the device). If hosting from an online service, the server could be configured to only accept requests from the IP addresses of your network.

Certainly not an alternative to the ongoing work on full disk encryption, but it may provide an avenue to place a level of content security into the hands of users. It will, however, only be as secure as the weakest link.

I will be sure to drop a message here when it comes to fruition.
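As a rough illustration of the mTLS requirement mentioned above, here is a minimal Python sketch of a key server's TLS context that refuses any client without a certificate signed by a trusted CA. It uses only the standard `ssl` module. The function names and the file paths passed in are assumptions made for illustration and are not taken from any project in this thread.

```python
import ssl


def require_client_certs(context: ssl.SSLContext) -> ssl.SSLContext:
    """Make a server-side context reject clients that present no valid certificate."""
    context.verify_mode = ssl.CERT_REQUIRED
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def build_server_context(server_cert: str, server_key: str, client_ca: str) -> ssl.SSLContext:
    """Build the key server's context (hypothetical helper; all paths are placeholders)."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # The server's own identity, presented to connecting devices.
    context.load_cert_chain(certfile=server_cert, keyfile=server_key)
    # Only client certificates signed by this CA are accepted.
    context.load_verify_locations(cafile=client_ca)
    return require_client_certs(context)
```

A stolen device still carries its client key, so a setup like this only narrows who can ask for the key; it does not remove the network-level concerns described above.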

@maggie0002 thanks for the further explanation. I’m happy to carry that train of thought a bit further as well.

So there is an mTLS connection between the “vault” or “secrets server” and the device that needs those secrets. But that pushes the problem to where we store the mTLS client key, does it not?

It does push the problem to the key storage, or more specifically it makes decryption dependent on two places: the mTLS key on the client, and the server serving the encryption key with the corresponding mTLS key. So someone would need access to both the mTLS key on the device and the server serving the key. This is where network security, or security of the online server, comes into play.

A few scenarios. (1) Bring the ‘secrets server’ onto the network while the devices boot and remove it again (a sort of boot key). (2) Host the secrets server on an online VPS, and restrict access to that server based on the IP address of the network the device is connecting from. (3) Ensure the security of the network the devices run on, for example make sure nobody has your WiFi password; then the security of the content on each device rests on the one key server to be secured.
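For scenario (2), the IP restriction could be sketched roughly as below, using Python's standard `ipaddress` module. The network `203.0.113.0/24` is a reserved documentation range standing in for your own network's public address range, and `is_allowed` is a hypothetical helper name.

```python
import ipaddress

# Placeholder allowlist: replace with the public range of the network
# the devices connect from (203.0.113.0/24 is a documentation range).
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_allowed(client_ip: str) -> bool:
    """Return True only if the requesting address falls inside an allowed network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ALLOWED_NETWORKS)
```

In practice a check like this would more likely live in a firewall or reverse proxy in front of the secrets server, and it would sit alongside the mTLS check rather than replace it.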

It’s hard to elaborate on specifics without knowing specific scenarios, and certainly none of these is without gaps.

I think I’ll let the topic simmer a bit to see if we could make it fit. It also depends a bit on how long it will take the OS team to finish the awesome work they are doing.

@cees.koolen, while I am still not sure this is going to be right for your particular use case (your hardware potentially permits other options) here is a go at some of the ideas we had discussed: GitHub - maggie0002/secure-store

All the usual caveats apply: it is an experimental project, just to see what potential there may be.

@maggie0002 thank you for pointing us in the direction of that experiment and keeping this thread alive.

What I was thinking about is that we might be able to use the Balena VPN as the secure environment for the secrets store. If we can somehow validate the identity of the requester through the Balena API / VPN we can ensure that we only deliver the keys to devices that are in a particular fleet.

For example if we use the Balena API to forward a port to the device that requested the secrets and send the secrets through that forwarded port, we can ensure that only devices that are members of the fleet will be able to receive the data.

Obviously that still means that we need to trust the storage of the VPN keys on the device but it will add a layer of trust by verifying the identity of the requester without the need of storing mTLS details on the device.

@cees.koolen, I’m eager to keep brainstorming this, it is helpful. I have read your post a few times, though, and am not sure I fully understand the idea. Would it be:

Device contains no env variables → device successfully communicates with Balena through the VPN → because of the successful communication the environment variables are now available.

Which then moves the point of security to securing the balena API keys?

@maggie0002 My idea is that when the Secure Store Client requests a secret from the Secure Store Server, the Secure Store Server verifies the ID of the client through the Balena API / VPN.

So for the Server to be able to do that, it indeed needs some API keys that need to be kept safe there… but since the server already contains all the secrets, adding these there might not be that big an issue.

@cees.koolen I think I see now, this is very interesting!

So Secure store client passes its details to secure store server → secure store server verifies the details for the client with the Cloud → if valid it passes the unlock key to the client and the client decrypts.

What could be really good about that is it would mean that if someone were to remove a client device from the Cloud (such as one that is lost or stolen) then that device would no longer be able to obtain the key and decrypt the content. It would be a way to deprovision a device.
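The flow above (client → server → Cloud check → key) could be sketched roughly as follows in Python. The Cloud lookup is deliberately left as an injected callable, `is_registered`, because this thread does not specify which API call would be used; that name, `release_key`, and the device IDs in the usage note are all hypothetical.

```python
from typing import Callable, Optional


def release_key(
    device_id: str,
    is_registered: Callable[[str], bool],
    unlock_key: bytes,
) -> Optional[bytes]:
    """Hand out the unlock key only if the Cloud still lists the device.

    `is_registered` stands in for whatever lookup the server would make
    against the Cloud; it is an assumption, not a real API call.
    """
    if is_registered(device_id):
        return unlock_key
    # Device was deleted from the Cloud (lost or stolen): refuse the key.
    return None
```

With a simple set standing in for the fleet, `release_key("device-a", lambda d: d in fleet, key)` returns the key while `"device-a"` is in `fleet` and `None` once it has been removed, which is the deprovisioning behaviour described above.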

I think it would be a blocker for offline devices though, so perhaps it would best be optional? I do like the idea of having fewer steps by not needing the mTLS keys, but the mTLS keys also provide the secure offline option (by offline I mean on a network, but without internet access) and secure the traffic in transit between the devices. The latter may be overcome by simple TLS, but then we wouldn’t want to verify a certificate against an external key server, both for the hassle of managing it and for offline mode, and overriding that is something that just kinda feels clunky.

I did look at one point at trying to put the mTLS keys in the Cloud as environment variables. It may reduce the friction of the setup a little. Technically, if the Secure Store Server has a more permissive API key for the Cloud then perhaps the server could generate the mTLS keys and add them as environment variables for the entire fleet automatically (that is assuming we could store it in a single-line environment variable and then extract it again in the right format). My concern with this is that on first add it would restart the containers on all the devices in the entire fleet, even those without the secure store client or server (adding env variables to devices in the Cloud restarts the containers on attached devices). I’m not sure how big a deal that is.

There is lots of thinking out loud here; I will keep mulling it over. I think your idea is really good. If you have other thoughts on the above, it would be great to hear them. To mTLS certificate or not to mTLS certificate is the question (or one of the big ones): user hassle of the setup vs security vs what happens if someone wants to replace the certificates.

cc @mpous @wjlove @rosswesleyporter

@maggie0002 I agree that the scenario for offline systems is really different.

Also the idea of provisioning the mTLS certificates through the API falls apart in that scenario, since changing the environment variables on offline devices will not do anything until the device connects to the Balena servers again. For systems that consist of multiple services it could be done by just setting the variable for the specific service that requires it. That would at least make the action less intrusive. For the time being, I’ve used this method of setting mTLS certificates on one of our services, since that gave us the opportunity to continue development of the application without depending too much on the final solution for the key management. We used base64 encoding so that we could set the keys as a single line in the environment variables.
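The base64 approach mentioned above can be sketched in a few lines of Python. The helper names are hypothetical and the PEM text is a placeholder, not a real certificate; the point is only that a multi-line PEM file becomes a single line that survives being stored in an environment variable and can be decoded back unchanged.

```python
import base64


def pem_to_env_value(pem_text: str) -> str:
    """Encode multi-line PEM text as a single-line base64 string."""
    return base64.b64encode(pem_text.encode("utf-8")).decode("ascii")


def env_value_to_pem(value: str) -> str:
    """Recover the original PEM text from the single-line value."""
    return base64.b64decode(value).decode("utf-8")


# Placeholder content only; a real certificate body would go here.
SAMPLE_PEM = (
    "-----BEGIN CERTIFICATE-----\n"
    "placeholder-body\n"
    "-----END CERTIFICATE-----\n"
)
```

Inside the service, the value would be read from the environment, passed through `env_value_to_pem`, and written to a file or handed to the TLS library. Note that base64 is only an encoding, so the key is no better protected than any other environment variable.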

As you wrote in your reply, I would indeed still use regular TLS to connect to the Secure Store Server; giving the clients a certificate to verify its authenticity is rather simple but really important.

I still think that for the offline scenario having the Full Disk Encryption with Secure Boot would be a life saver.

The idea behind provisioning them through the API wasn’t for the offline devices, just to make the setup easier. I assumed the reason you had said “without the need of storing mTLS details” was because it is a bit of a hassle to manage?

Setting it for the specific service seems like a nice idea; that way it would only trigger restarts on devices that have the secure store client. A restart would seem to be necessary there anyway, otherwise why are they running the client.

I still think that for the offline scenario having the Full Disk Encryption with Secure Boot would be a life saver.

Absolutely, for both offline and online, Full Disk Encryption and Secure Boot would be far better, and certainly none of this detracts from that work. This is purely an exercise for users without a TPM. :smile:

Deploying the mTLS certificates through the Balena API effectively also proves the identity of the clients through the Balena VPN.

Except I think then the mTLS certificates would remain on the device, and the device would keep them even if it was removed from the fleet at the Cloud level (at least until it connected to the Cloud and then uninstalled the containers and cleaned up the images).

By having the mTLS keys as environment variables, and then having the server verify the client details with the Cloud too, we could deprovision a device, whereas if the mTLS keys were compromised there would be no way to deprovision without rolling out new mTLS keys. Perhaps rolling out new mTLS keys isn’t such a big deal, as long as the container restart isn’t an issue. I wonder if it may be better to have the provisioning and deprovisioning at a client level rather than for the whole fleet though; then by simply deleting a device from the Cloud, it is also deprovisioned from the secure store server without any extra steps. The downsides: having the server do the verification means adding a very permissive API key to the server, which makes the server more of a risk. Plus, doing client → server → cloud verification is more work to implement than just using Cloud-stored mTLS keys, for which the functionality is basically already there.