Unable to complete Getting Started on Intel NUC

Hey there, I’ve been trying to get my feet wet with BalenaOS.

I just got myself an Intel NUC, and I’ve been trying to follow the instructions at https://www.balena.io/os/docs/intel-nuc/getting-started/ using the current (2.50.1 rev1) Intel NUC dev image. While not everything happens precisely as described, things seem to work up until the push.

When I try to “balena push balena.local”, I get

Could not communicate with local mode device at address 192.168.1.113                    
                                                                                         
Additional information may be available with the `--debug` flag.                         
For help, visit our support forums: https://forums.balena.io                             
For bug reports or feature requests, see: https://github.com/balena-io/balena-cli/issues/

And with the --debug flag I get this:

Missing applicationOrDevice

Error: Missing applicationOrDevice
    at C:\Program Files\balena-cli\client\node_modules\capitano\build\signature.js:150:29
    at C:\Program Files\balena-cli\client\node_modules\capitano\node_modules\async\lib\async.js:181:20
    at Immediate.iterate (C:\Program Files\balena-cli\client\node_modules\capitano\node_modules\async\lib\async.js:262:13)
    at runCallback (timers.js:705:18)
    at tryOnImmediate (timers.js:676:5)
    at processImmediate (timers.js:658:5)
    at process.topLevelDomainCallback (domain.js:126:23)

Looking at some older posts, I logged in using ssh and ran a few commands. Firstly, I noticed that when I asked for the “balena-engine version”, there seemed to be a problem with the engine daemon:

Client:
 Version:           18.09.17-dev
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        2ab17e0536b6a4528b33c75e8f350447e9882af0
 Built:             Mon May 11 15:17:45 2020
 OS/Arch:           linux/amd64
 Experimental:      false
Cannot connect to the balenaEngine daemon at unix:///var/run/balena-engine.sock. Is the balenaEngine daemon running?

Then, running some commands from an earlier post:

root@balena:~# balena-engine ps -a
Cannot connect to the balenaEngine daemon at unix:///var/run/balena-engine.sock. Is the balenaEngine daemon running?
root@balena:~# journalctl -u resin-supervisor --no-pager
-- Logs begin at Tue 2020-07-21 23:28:47 UTC, end at Tue 2020-07-21 23:56:45 UTC. --
Jul 21 23:35:21 localhost systemd[1]: Dependency failed for Balena supervisor.
Jul 21 23:35:21 localhost systemd[1]: resin-supervisor.service: Job resin-supervisor.service/start failed with result 'dependency'.
Jul 21 23:43:41 balena systemd[1]: Dependency failed for Balena supervisor.
Jul 21 23:43:41 balena systemd[1]: resin-supervisor.service: Job resin-supervisor.service/start failed with result 'dependency'.

So, I’m hoping that someone can give me a hand, and help me to work out what’s up.

(A bit of honesty, I’ve also tried using Balena Cloud, but been unable to get past post provisioning; I’ve returned to this, as it seems more self-contained.)

Update: I have also noticed that when I restart the machine (after doing a “shutdown now” from the ssh session), it still takes a very long time doing the “expand resin-data partition” step (perhaps half an hour or so). Is that to be expected? It feels like this is indicating something hasn’t been fully set up at this point.

Take care,

Caligari

Have a read through https://www.balena.io/docs/learn/develop/local-mode/
It should have all the information you need to get local development working.

One thing I’ll point out, which you haven’t mentioned but which is in the documentation I linked, is:

Local mode must be enabled through the balenaCloud dashboard. You can enable it from either the Actions menu of the device dashboard or click to expand the arrow located on the device dashboard and select Enable Local Mode.

Hope this helps in any way :ok_hand:

Thank you!

I did find that page earlier, and that’s what got me on to trying to do the getting started process for Cloud, so that I could get a device showing up on my dashboard, so that it could be put into local mode. I didn’t succeed at that, but I guess I will try again.

This implies that the Getting Started page I referred to, above, is completely misleading. That would probably mean that it should be taken down, or a clear note to that effect put at the top.

Anyway, I’ll try the Cloud process again (and head over to that part of the forums if I run into trouble).

Take care,

Caligari

hey @Caligari,

Welcome to the forums! Thanks for the detailed feedback.

While not everything happens precisely as described, things seem to work up until the push.

Would be grateful if you can point out the discrepancies and we will see if we need to update our docs.

Which application are you trying to push to the device? Are you using the multicontainer demo?

As for the balena engine not starting, that doesn’t seem right. I’m asking my colleagues about this and will get back to you.

Also, since you are using standalone balenaOS (not with balenaCloud), you won’t be able to control it from the dashboard.

I didn’t succeed at that

Can you please describe where you struggled to get your device online on balenaCloud?

Could you also perhaps collect some logs for balena-engine?
journalctl -u balena-engine --no-pager ?

I’ll start with the minor discrepancies, while I get things back to the way they were.

Once the download is finished, make sure to unzip the file and keep the resulting balena.img somewhere safe

The zip and image both have a longer name, with the platform, version, etc in them.

Open balenaEtcher and use the blue “Select Image” button

The button is now labeled “Flash from file” in the Windows version I downloaded, and it isn’t blue until I mouse-over.

insert your SD card or connect your device (in the case of a balenaFin)

The section is called Flash USB drive, but flashing to a USB device isn’t mentioned in the text, only SD and attaching a balenaFin.

Now power on and boot up your device, after about 10 seconds or so your device should be up and connected to your local network

Definitely a much abbreviated description -

  1. Boot with flashed media, after a bit more than 10 seconds (maybe twice) and a full-screen balena logo, the device turns off
  2. Remove the flashed media, and turn the device on again
  3. There is a lot of boot chatter, and then things wait for a long time (maybe 10 minutes?) at the “expand resin-data partition” line.
  4. Very suddenly the boot continues, the screen is cleared and the ASCII balenaOS logo is shown.

Note: at this point the screen says “Booted - Check your balenaCloud dashboard” although I might well have no idea about Cloud, and wasn’t using it to get here.

Note: it would be worth reminding those using ethernet that the mydevice name used throughout should be balena instead; this is only mentioned in passing earlier.

we can just ssh in

This is an over-simplification for newbies, and it would be worth explaining the need for keys, or at least referencing that here. I cannot recall, but I might have had to login to Cloud (and thus create an account etc to a different system entirely) to get my keys easily? It might also be worth noting here that you cannot use another ssh client to make the connection (or at least not trivially).

root@mydevice:~# uname -a

It might be worth putting a more relevant (to the hardware I’m reading about) response here.

… and that is where I get to. Now to look for the info we want (next response).

I should point out that these are minor details. But when you are new, they do lead to some concern. The installation of the CLI and fiddling around with Bonjour and the other details were a lot more than I was expecting when I started, and when things were not working, every deviation seemed like I might be seeing something wrong, until I checked.

Overall, though, I love the clean and clear way the page is expressed.

Take care,

Caligari

From my rambling list of details, just above, you can see I don’t really get to the point of putting any application on to the device, but I am following the Getting Started, so I have downloaded the multicontainer example, as suggested.

As to balenaCloud, I believe you found my post over there about the problems I am having with that attempt, so I’ll leave that aside for the moment.

So, I have run a “balena-engine version”, and been told it can’t find the daemon.

Now, journalctl -u balena-engine --no-pager gets me:
-- Logs begin at Wed 2020-07-22 10:51:30 UTC, end at Wed 2020-07-22 11:18:50 UTC. --
-- No entries --

Which is less than I was hoping for! :slight_smile:

Is there more that would help?

Take care,

Caligari

Hello, I think it’s journalctl -u balena --no-pager, not balena-engine.
You may also want to check all logs with: journalctl -a --no-pager.
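
As a rough sketch of how to narrow down the full journal, the helper below pulls out the names of the units that systemd reported with "Dependency failed for ...". The function name failed_dependencies is made up for illustration; only journalctl -a --no-pager comes from this thread.

```shell
# Hypothetical helper: read journal text on stdin and print each unit
# description that systemd reported as "Dependency failed for <name>.".
failed_dependencies() {
    sed -n 's/.*Dependency failed for \(.*\)\.[[:space:]]*$/\1/p' | sort -u
}

# Example use on the device (run manually):
#   journalctl -a --no-pager | failed_dependencies
```

The units it lists are the ones that did not start because something they depend on failed, so the log lines just before the first such message are a reasonable place to look next.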

OK, that makes sense. The -u balena gets us:

Jul 22 10:58:11 localhost systemd[1]: Dependency failed for Balena Application Container Engine.               
Jul 22 10:58:11 localhost systemd[1]: balena.service: Job balena.service/start failed with result 'dependency'.
Jul 22 11:06:25 balena systemd[1]: Dependency failed for Balena Application Container Engine.                  
Jul 22 11:06:25 balena systemd[1]: balena.service: Job balena.service/start failed with result 'dependency'. 

Repeated a bit (all with the same timestamp, though).

With -a we get more. I’m going to have to work out how to collect it all. But in the meantime, I notice this:

Jul 22 10:58:10 localhost sh[732]: Resizing the filesystem on /dev/disk/by-state/resin-data to 1952801880 (1k) blocks.
Jul 22 10:58:10 localhost sh[732]: The filesystem on /dev/disk/by-state/resin-data is now 1952801880 (1k) blocks long.
Jul 22 10:58:11 localhost kernel: EXT4-fs (sda6): ext4_check_descriptors: Block bitmap for group 131056 not in group (block 35184616189952)!
Jul 22 10:58:11 localhost kernel: EXT4-fs (sda6): group descriptors corrupted!
Jul 22 10:58:11 localhost resin-partition-mounter[747]: mount: /run/tmp.0engmepL9V: mount(2) system call failed: Structure needs cleaning.
Jul 22 10:58:11 localhost resin-partition-mounter[747]: umount: /run/tmp.0engmepL9V: not mounted.
Jul 22 10:58:11 localhost resin-partition-mounter[747]: resin-data: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
Jul 22 10:58:11 localhost resin-partition-mounter[747]:         (i.e., without -a or -p options)
Jul 22 10:58:11 localhost resin-partition-mounter[747]: WARN: UUIDs not regenerated - retry on next boot
Jul 22 10:58:11 localhost resin-partition-mounter[747]: INFO: Mounting /dev/disk/by-state/resin-data (resin-data) in /mnt/data.

Which keeps coming up. That seems like a red flag…
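
For scale, the block number in the ext4 error can be compared with the filesystem size from the resize message. This is only a sketch using the two numbers copied from the log above; the function name block_in_range is made up for illustration, and the comparison says nothing about why the descriptor is wrong.

```shell
# Numbers copied from the log lines above.
fs_blocks=1952801880          # filesystem size after the resize, in 1k blocks
bitmap_block=35184616189952   # block named in the ext4_check_descriptors error

# Hypothetical helper: $1 = block number, $2 = total number of blocks.
block_in_range() {
    if [ "$1" -lt "$2" ]; then
        echo "in range"
    else
        echo "out of range"
    fi
}

block_in_range "$bitmap_block" "$fs_blocks"
```

The reported block lies far beyond the end of the filesystem, which fits the kernel's "group descriptors corrupted" message.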

Take care,

Caligari

This looks bad.
Maybe the flashing went bad, maybe the storage device has issues.
Try reflashing the device.

Hmmm. Just pointing out that this is the fourth time I’ve tried this installation, so far. I doubt that the flash has gone wrong the same way all four times (across two different USBs - I changed media in case that was the problem)…

This is a brand new set of hardware: the NUC, the RAM, and the HDD. If there is an indication the HDD is bad, is there a quick way I can check that, and see if I need a replacement?

I wonder if I should try an installation of some other OS entirely, to see if any problems are detected? Any other suggestions?

Take care,

Caligari

On the device, you can try running dmesg and check if you see anything bad related to the hard drive.
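
A minimal sketch for filtering the dmesg output down to storage-related lines is below. The patterns are my own guess at what is worth looking for (ext4 messages, I/O errors, and lines tagged with a disk name such as [sda]), and the function name storage_messages is made up for illustration.

```shell
# Hypothetical helper: keep only kernel log lines that mention ext4,
# an I/O error, or a SCSI disk tag such as [sda]. Reads from stdin.
storage_messages() {
    grep -iE 'ext4-fs|i/o error|\[sd[a-z]+\]'
}

# Example use on the device (run manually):
#   dmesg | storage_messages
```

An empty result does not prove the drive is healthy; it only means none of these patterns appeared in the kernel log.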
If you enable support access for the device and provide its uuid, I can have a look.

The only thing I can see in the dmesg that is concerning is the same error. Before that it seems to be reasonable.

Forgive my ignorance, but in order to enable support access I’d have to get the device running through the Cloud, wouldn’t I? And perhaps be a paying customer? I haven’t had success getting the device past post-provisioning through the Cloud (probably for these reasons)…

Take care,

Caligari

Forgive my ignorance, but in order to enable support access I’d have to get the device running through the Cloud.

Yes, sorry, I didn’t see that it didn’t get past post-provisioning.

Normally,

  • you flash the downloaded image on a usb drive;
  • boot the nuc from that drive;
  • wait until the nuc shuts down, remove the usb drive and power the nuc;
  • the nuc will boot and expand the partitions to fill the whole internal drive (that takes some time) and reboot;
  • at this point it should appear on the dashboard

How are you accessing the device?

I’m using balena ssh to get to it. This is all following the process in the Getting Started for the Intel NUC, which seemed as though it was working, up until the push of the multicontainer demo, which wasn’t working… but it did get to the point where I could ping and ssh to it.

Take care,

Caligari

Before pushing apps, we need the balena daemon running.
If balena ps doesn’t work, pushing apps won’t work either.
Can you please provide the full logs from journalctl -a ?
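
A quick way to check whether the engine is reachable before trying a push is to test for its socket. This is a sketch only: the socket path is the one printed in the error message earlier in this thread, and the function name engine_ready is made up for illustration.

```shell
# Hypothetical helper: succeed if the engine socket exists.
# The default path is taken from the "Cannot connect to the balenaEngine
# daemon" message quoted earlier in the thread.
engine_ready() {
    sock="${1:-/var/run/balena-engine.sock}"
    [ -S "$sock" ]
}

if engine_ready; then
    echo "engine socket present"
else
    echo "engine socket missing; see: journalctl -u balena --no-pager"
fi
```

The socket being present does not guarantee the daemon is healthy, but a missing socket means balena-engine ps and balena push cannot work.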

balena_logs.log (124.9 KB)

Ah sorry about that! I had the logs here, but didn’t upload.

Take care,

Caligari

Can you try running fsck /dev/sda6 ?
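
Since fsck should not be run on a mounted filesystem, a small guard like the one below can help. It is a sketch under assumptions: the function names are made up for illustration, and the check compares device names literally, so a partition mounted under another alias (for example /dev/disk/by-state/resin-data) would not be caught.

```shell
# Hypothetical helper: succeed if DEVICE appears in the first column of
# the mounts table (defaults to /proc/mounts).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Hypothetical wrapper: refuse to check a partition that looks mounted,
# otherwise run fsck interactively (without -a or -p, as the log asks).
check_partition() {
    dev="$1"
    if is_mounted "$dev"; then
        echo "$dev is mounted; unmount it before running fsck" >&2
        return 1
    fi
    fsck "$dev"
}

# Example use on the device (run manually):
#   check_partition /dev/sda6
```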

I’m happy to report that I decided that would be a good plan, and have just finished doing exactly that.

It found problems, which I agreed that it should repair.

I requested a reboot from the ssh session, which it did.

That has just completed, and it has arrived at the same point. By which I mean that it has the same error message, with the same group and block listed. I had to check that the timestamps were different and I wasn’t just seeing the same log…

This is strange. Do you have another hard drive you could try this with?