Understanding delta sizes

Is there an easy way to estimate the true data size of a delta update that is about to be pushed (or already has been pushed historically) down to devices?
I’m using an unlimited cell plan (capped at 64kbps), which is terrific but slow, and I’m trying to better estimate how long code changes, big or small, will take to download.

I can see the image sizes in the release history in the UI, but that shows the image size, not the actual delta size that was pushed down in that release.

Could it be that changing a single line in a Node.js app inside an Alpine image only pushes the actual bytes that changed? There’s got to be some overhead, but it’s hard to know what that overhead is.

For context, 64kbps is only 8 kilobytes per second, which means pushing a 250MB update down to the device would take over 8 hours. A 1GB update would take about 35 hours, or roughly a day and a half.
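To make that arithmetic easy to repeat for other sizes, here is a minimal sketch. It assumes decimal units (1 MB = 1,000,000 bytes, 1 kbit = 1,000 bits) and ignores protocol overhead; the function names are made up for illustration.

```javascript
// Seconds needed to move `bytes` over a link of `kbps` kilobits per second.
// Assumes decimal units and ignores any protocol overhead.
function transferSeconds(bytes, kbps) {
  const bytesPerSecond = (kbps * 1000) / 8
  return bytes / bytesPerSecond
}

// Render a duration in seconds as hours with one decimal place.
function toHours(seconds) {
  return (seconds / 3600).toFixed(1) + ' h'
}

const MB = 1000 * 1000
const GB = 1000 * MB

console.log(toHours(transferSeconds(250 * MB, 64))) // 8.7 h
console.log(toHours(transferSeconds(1 * GB, 64)))   // 34.7 h
```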

Hey @barryjump, yes, this is possible via the API or the SDK. More details below.

First of all, a delta is unique for each combination of (delta version, source image ID, destination image ID), so you’ll need to determine the appropriate values before querying the API.

If you have the version and the image IDs you can query:


curl \
  -H "Authorization: Bearer ${TOKEN}" \
  'https://api.resin.io/v4/delta?$filter=((status%20eq%20%27success%27)%20and%20(version%20eq%20'${DELTA_VERSION}')%20and%20(originates_from__image%20eq%20'${SRC_ID}')%20and%20(produces__image%20eq%20'${DST_ID}'))' \
  | jq '.d[0]'

If you only have the names of the images involved you can use:

curl \
  -H "Authorization: Bearer ${TOKEN}" \
  'https://api.resin.io/v4/delta?$filter=((status%20eq%20%27success%27)%20and%20(version%20eq%20'${DELTA_VERSION}')%20and%20(originates_from__image/any(source:source/is_stored_at__image_location%20eq%20%27'${SRC_NAME}'%27))%20and%20(produces__image/any(destination:destination/is_stored_at__image_location%20eq%20%27'${DST_NAME}'%27)))' \
  | jq '.d[0]'

where SRC_NAME and DST_NAME are the full image names, including the registry, but without a tag or SHA reference.
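If you would rather script the ID-based query than hand-encode the URL, something like the following might work. It is only a sketch: the endpoint, field names and the `d` array in the response are taken from the curl commands above, while `deltaQueryUrl` and `fetchDelta` are hypothetical helper names, and it assumes a runtime with a global `fetch` (a browser, or Node.js 18+).

```javascript
// Build the delta query URL for a delta version and a pair of image IDs.
// Endpoint and field names are copied from the curl example in this thread.
function deltaQueryUrl(deltaVersion, srcId, dstId) {
  const filter =
    `((status eq 'success')` +
    ` and (version eq ${deltaVersion})` +
    ` and (originates_from__image eq ${srcId})` +
    ` and (produces__image eq ${dstId}))`
  // encodeURIComponent turns the spaces into %20 for us
  return 'https://api.resin.io/v4/delta?$filter=' + encodeURIComponent(filter)
}

// Hypothetical helper: fetch the first matching delta, or undefined if none.
// Assumes a valid API token and a global fetch.
async function fetchDelta(token, deltaVersion, srcId, dstId) {
  const res = await fetch(deltaQueryUrl(deltaVersion, srcId, dstId), {
    headers: { Authorization: `Bearer ${token}` },
  })
  if (!res.ok) throw new Error(`API request failed: ${res.status}`)
  const body = await res.json()
  return body.d[0] // the same element that jq selects with .d[0]
}
```

Calling `fetchDelta(TOKEN, DELTA_VERSION, SRC_ID, DST_ID)` should then resolve to the same object the curl pipeline prints.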

To get a list of all the services and the size of the delta update for each between two release IDs, run the following from the browser console. The release IDs can be found easily in the URL of the releases page of the app:

// Replace the placeholders with the numeric release IDs
release1 = <OLD_RELEASE_ID>
release2 = <NEW_RELEASE_ID>

sdk.pine.get({
  resource: 'release',
  options: {
    $filter: {
      id:{ $in: [ release1, release2 ]},
    },
    $orderby: 'id asc',
    $expand: {
      image__is_part_of__release: {
        $select: 'id',
        $expand: {
          image: {
            $select: [
              'id'
            ],
            $expand: {
              is_a_build_of__service: {
                $select: 'service_name'
              }
            }
          }
        }
      }
    }
  }
}).then(([r1, r2]) => {
  // Results are ordered by ID, so r1 is the release with the lower ID

  // Map each service name to its image ID for one release
  const imagesByService = (release) => {
    const services = {}
    _.each(release.image__is_part_of__release, (ipr) => {
      services[ipr.image[0].is_a_build_of__service[0].service_name] = ipr.image[0].id
    })
    return services
  }
  const r1services = imagesByService(r1)
  const r2services = imagesByService(r2)

  // Look up the delta between the two images of each service
  const deltaSizes = {}
  return Promise.all(_.map(r1services, (id, name) => {
    return sdk.pine.get({
      resource: 'delta',
      options: {
        $filter: {
          originates_from__image: id,
          produces__image: r2services[name]
        },
        $select: 'size'
      }
    }).then(([ delta ]) => {
      // Records 0 when no delta was found for this pair of images
      deltaSizes[name] = delta != null ? delta.size : 0
    })
  })).then(() => console.log(deltaSizes))
})

Let us know if this works for you!

And of course, I remember that what you are really asking for is a way to estimate this. I just created a pattern for it inside balena.

BTW @barryjump, a colleague had a workaround idea for measuring the size of the delta. If you have a parallel fleet without any devices and you push there first, you can see the sizes of the deltas before you push to the “production” fleet.

Does this workaround make sense for your use case?

And let me add to my previous answer, @barryjump.

For the current delta version (v3), the size given by the API is the uncompressed delta size, but during a pull the data will be compressed. The number given by the API can therefore be substantially larger than the amount of data that will actually be transmitted.
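In practice this means the API number is best read as an upper bound. As a rough sketch, the per-service sizes logged by the console snippet could be turned into worst-case download times like this. It assumes the `size` field is in bytes (worth double-checking) and the sample sizes are made up; `worstCaseSeconds` is a hypothetical helper, not part of the SDK.

```javascript
// Worst-case download time per service, from delta sizes reported by the API.
// ASSUMPTIONS: sizes are in bytes; linkKbps is the link rate in kilobits/s.
// The real transfer should be faster because the data is compressed in transit.
function worstCaseSeconds(deltaSizes, linkKbps) {
  const bytesPerSecond = (linkKbps * 1000) / 8
  const result = {}
  for (const [service, bytes] of Object.entries(deltaSizes)) {
    result[service] = Math.ceil(bytes / bytesPerSecond)
  }
  return result
}

// Made-up sizes, for illustration only
const example = { main: 4000000, proxy: 800000 }
console.log(worstCaseSeconds(example, 64)) // { main: 500, proxy: 100 }
```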

Excellent, thanks @mpous! I’m going to experiment with the API a bit this weekend.

If it works, I’ll experiment further with integrating it as a step in GitHub Actions, so that after the balena CI step it could print out the delta size.

Please do share what you learn :slight_smile:

On Sun, 19 June 2022, 23:00, barryjump via balenaForums <notifications@balena.discoursemail.com> wrote: