App lockfile or supervisor lockfile working too well


#1

I’m creating my application lockfile using flock:

while true; do
    echo "starting camera_controller_resin ${SERVER_URL}"
    flock /tmp/resin/resin-updates.lock -c "./camera_controller_resin \"${SERVER_URL}\""
done

What I’m seeing is that the update is downloaded but often only applied if I force a reboot through the actions panel. Sometimes I can reset the application and get the update applied. Sometimes I can reboot or reset the device, but often I get the error:

Request error: Updates are locked: EEXIST: file already exists, open '/mnt/root/tmp/balena-supervisor/services

Devices are running:

  • Resin OS 2.13.6+rev2
  • balenaOS 2.26.0+rev1

I have tried enabling the lock override and then setting it back to disabled, but that appears to affect only the app locks. The error above appears to come from the balena supervisor, and getting past it requires using the device Reboot action.

Is there a suggested workflow for addressing the root issue?


#4

Hi there @jason10,

Lockfiles are typically used to prevent the supervisor from killing an application at a critical time, such as preventing data loss. It looks to me like you are daemonizing the application while locked, such that the supervisor cannot ever update it (without the interventions you mention). It stands to reason that the supervisor would refuse to restart an application while the lockfile is held exclusively.

What you are saying is that the reboot actions only work some fraction of the time, and sometimes that EEXIST error is thrown. Is that correct? It looks like the line was cut off. Could you paste the whole line from the logs?

Thank you very much!


#6

Hi Matthew,

My program exits occasionally so that the lock can be released. Perhaps adding a sleep 1 would help.
Is there a way for my program to check that an update is ready to be applied?

Yes, the reboot works sometimes, and other times I get the error.

Here is the error in full:
Request error: Updates are locked: EEXIST: file already exists, open '/mnt/root/tmp/resin-supervisor/services/1275531/main/resin-updates.lock'

So is that because my app has created a lock or because of a different issue?


#9

@jason10
It is possible to detect whether an update is ready to apply while the lock is held, using the following endpoint on the supervisor API:

curl --header "Content-type:application/json" "$RESIN_SUPERVISOR_ADDRESS/v2/applications/state?apikey=$RESIN_SUPERVISOR_API_KEY"

which will return an object similar to:

{
  "supervisortest": {
    "appId": 1011165,
    "commit": "42a5d01723cac00538fb10df5ea0a671",
    "services": {
      "main": {
        "status": "Downloaded",
        "releaseId": 688486,
        "downloadProgress": null
      }
    }
  }
}

What you’re looking for is the Downloaded status on the services.
Let me know if this helps you!
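
For illustration, the check could be scripted roughly as below. This is only a sketch: update_pending and fetch_state are hypothetical helper names, and the match is a plain text search for the "status": "Downloaded" field in a response shaped like the one above, not a real JSON parse.

```shell
#!/bin/sh
# Returns 0 if the JSON on stdin reports any service with status "Downloaded".
# Assumption: a text match is good enough for the response shape shown above;
# it does not parse the JSON.
update_pending() {
    grep -Eq '"status"[[:space:]]*:[[:space:]]*"Downloaded"'
}

# Hypothetical helper: query the supervisor endpoint quoted above.
fetch_state() {
    curl --silent --header "Content-type:application/json" \
        "$RESIN_SUPERVISOR_ADDRESS/v2/applications/state?apikey=$RESIN_SUPERVISOR_API_KEY"
}

# Usage:
#   if fetch_state | update_pending; then
#       echo "an update has been downloaded and is waiting for the lock"
#   fi
```

An application could run such a check periodically and exit at a safe point once it succeeds, which releases the lock.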


#11

@jason10 to add to Cameron’s answer - that EEXIST error on the supervisor is expected, and is indeed just the supervisor’s way of saying it’s not applying the update because your app has taken the lock.

I agree that adding some sleep time between the locking attempts would help - the supervisor only attempts to take the lock periodically, so if the window is too small, it is likely that it will never be able to take it.
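
A minimal sketch of that idea, based on the loop from the first post: run the program under the lock, then pause before trying again. The names run_locked_then_pause and LOCK_GAP_SECONDS are made up for this example, and the 10-second default is a guess, since the thread does not say how often the supervisor retries.

```shell
#!/bin/sh
# Lock path taken from the original post; override via the environment if needed.
LOCK_FILE="${LOCK_FILE:-/tmp/resin/resin-updates.lock}"
# Hypothetical setting: how long to leave the lock free between runs.
# The right value depends on the supervisor's polling interval (not stated here).
LOCK_GAP_SECONDS="${LOCK_GAP_SECONDS:-10}"

run_locked_then_pause() {
    status=0
    # Hold the lock only while the given command runs.
    flock "$LOCK_FILE" "$@" || status=$?
    # Leave the lock released for a while before the caller retries.
    sleep "$LOCK_GAP_SECONDS"
    return "$status"
}

# Usage, mirroring the original loop:
#   while true; do
#       echo "starting camera_controller_resin ${SERVER_URL}"
#       run_locked_then_pause ./camera_controller_resin "${SERVER_URL}"
#   done
```

The pause only helps if the program actually exits from time to time, as described in #6.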