We’re currently evaluating Resin as a way to manage and deploy our code. We have about 7 test systems running in our office. We moved a couple (successfully) from one application to another, but the following day both units stopped talking to our servers. They also showed as offline in the Resin dashboard.
After a power cycle they came back to life.
This seemed to coincide (we have no conclusive proof) with some transient DNS issues that we were seeing.
I was wondering whether it would be possible to extract any logs from the host OS/Resin supervisor to see why those units dropped offline. One of the great advantages of Resin to us is its ability to catch any errors that may exist in our applications and act as a safety net. This incident has concerned me slightly, though, as some of our units will be installed in fairly inaccessible locations, so we can’t always power-cycle them to recover.