We created our own image and use Balena Etcher for flashing. We use CM3 modules and the CM3 IO board for this. RPI Boot is installed properly.
It works on fresh/clean CM3. It does NOT always work on CM3 modules with an (other) image on it.
It mostly works on the second or third attempt. When it fails, the image does not boot. Does Balena change settings in the partition tables?
Used versions 1.5.50 and 1.5.59
There are other users here reporting failed images on other devices. Perhaps this is the same effect.
We know that there are side effects with the FLASH and partitions with FAT/FAT32 (the Raspberry page about RPIBOOT mentions images not booting properly, at the bottom). We fixed this to be always FAT32 (the table and a dump of one partition now show FAT32), but this is not the reason for the failure. Rufus always works with our image.
We presently use Rufus instead for this reason.
There also is a difference in speed:
Use rpiboot to detect CM3 -> speed in Rufus only 5MB/s+
Use Balena Etcher to detect CM3 -> speed in Rufus >15MB/s+.
I would like to stick to Balena, because Balena shows 21MB/s, but the image does not always work.
Also Eject after Successful does not work here (WIN7 / 10 64bit), because the image is not mounted at all after flashing (FAT32 partition). Rufus instead opens this partition after flashing as a readable drive.
Etcher does not modify the image you provide, it just copies it to the destination device.
It does NOT always work on CM3 modules with an (other) image on it.
How does it “NOT always work” ? Is there an error ? Does the flash complete ?
There also is a difference in speed:
Rpiboot and Etcher use the same method: they send a firmware to the device that turns it into a USB mass storage device (like a USB stick).
The only difference is the firmware they send. The one sent by Etcher allows faster write speeds.
Yes, it is bootable. It is created as a bootable image.
Tested with dd (Linux) and other windows tools (Win32DiskImager).
It nearly always works with Etcher on new CM3 out of the box. But it fails with CM3 where data is present or other partitions are present. Rufus seems to delete partitions first as one can see in the status line. Perhaps this makes the difference? That would be like you call it making it bootable by default.
Thanks, sharing as much detail as possible for the partition here would be useful. Unfortunately without an image artefact, it becomes very difficult for us to determine why your custom image doesn’t work.
Hello,
We would need a disk image that reproduces this behaviour.
What do you mean by “But it fails with CM3 where data is present or other partitions are present.” ?
Do the write and verification succeed?
If yes, does it make the compute module boot?
If yes, can you describe what you see on the screen?
If no, is the first partition on the disk bootable? (check with fdisk).
What are the differences between a bootable CM3 and a non-bootable CM3 on the disk?
We are able to reproduce the problem and it is always in the beginning of the CM3 flash (by comparison).
It's always the first 1k area, which is filled with 0x00 in a “damaged” CM3.
fdisk shows nothing in this case, because it is unable to find the 4 partitions.
I will upload the first part of the good image below. I am not familiar with the structure of images. But after diff, it shows that these entries (bytes starting at offset 0x0000 and 0x001B…- not 0x00) are always “missing” in the flash. It all reads 0x00 instead of this:
Is this the partition table?
There are a lot of zeroes after that (which should be zero) but it seems that the first block is NOT WRITTEN correctly. I do not know if Etcher simply starts with offset zero or finalizes something at the end of the process. We can reproduce this with empty CM3 and with other CM3 with former working images written by other tools. Sometimes Etcher sets this area to 0x00 and the image does not boot. Perhaps always the first 512 or 1024 bytes. The rest is O.K. And perhaps this could also be a problem of other guys with non-working images on USB sticks or…
We do not use the verify option presently, to speed up the process. But we tested this with activated verify and also found at least one CM3 with damaged start area WITHOUT verify error!
I had no time to do further testing, but I expected a failure message with activated verify already at 1% in this case. Does Etcher report verify error only after 100% read? This needs too much time.
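A quick way to check only the start area, without waiting for a full verify pass, could look like the sketch below. It compares the first 1024 bytes of the image file with the same range read back from the flashed device. The function name and both paths are hypothetical; on Windows the device would be something like a raw physical drive path, and reading it usually needs administrator rights.

```python
def first_mismatch(image_path, device_path, length=1024, chunk=512):
    """Return the offset of the first differing chunk within the first
    `length` bytes, or None if that range is identical."""
    with open(image_path, "rb") as img, open(device_path, "rb") as dev:
        for offset in range(0, length, chunk):
            expected = img.read(chunk)   # what the image says should be there
            actual = dev.read(chunk)     # what is actually on the device
            if expected != actual:
                return offset
            if not expected:             # image is shorter than `length`
                break
    return None
```

A result of 0 would match the symptom described here (first block wrong, rest fine); None means the start area is identical to the image.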
Does “Eject after success” have any influence on the last bytes written?
Perhaps the first block should be written with a delay or something else or deleted and rewritten.
Tomorrow I will send the fdisk output (4 partitions).
ALL “damaged” images show 0x00 in the first bytes (at least 512) in CM3 flash. Fdisk is not able to resolve the 4 partition entries in this case and it is not bootable. The rest of the data is o.k. It did not matter if CM3 was “clean” or already flashed before, but already flashed CM3 show this failure more often than “fresh” CM3. In our image there is some data starting in offset 0x0000 and 0x01b0.
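For anyone who wants to check this without fdisk: in a classic MBR layout the four partition entries start at offset 0x1BE (16 bytes each) and the sector ends with the signature bytes 0x55 0xAA at offset 0x1FE. The sketch below is only an illustration under the assumption that the image uses an MBR partition table; the function name is made up for this example.

```python
import struct

SECTOR = 512
TABLE_OFFSET = 0x1BE       # start of the four 16-byte partition entries
SIGNATURE_OFFSET = 0x1FE   # boot signature 0x55 0xAA


def inspect_mbr(sector: bytes) -> dict:
    """Summarise the first 512-byte sector of a disk or image dump."""
    if len(sector) < SECTOR:
        raise ValueError("need at least 512 bytes")
    sector = sector[:SECTOR]
    partitions = []
    for i in range(4):
        start = TABLE_OFFSET + 16 * i
        entry = sector[start:start + 16]
        # Layout: status(1) CHS first(3) type(1) CHS last(3) LBA start(4) sectors(4)
        status, ptype, lba_start, count = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:  # type 0 means the slot is unused
            partitions.append({
                "index": i + 1,
                "bootable": status == 0x80,
                "type": ptype,
                "lba_start": lba_start,
                "sectors": count,
            })
    return {
        "all_zero": sector == bytes(SECTOR),
        "signature_ok": sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] == b"\x55\xaa",
        "partitions": partitions,
    }
```

To use it, read the first 512 bytes from the device or from a dump file and pass them in. A "damaged" CM3 as described above would report all_zero as true and an empty partition list.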
I will send the fdisk output of a working image.
We tested on different machines, Win7/10 32/64bit with 1.5.50 and 59.
Presently we test again with verify activated, expecting failure at 1%. We found one CM3 with this failure but without verify error. Is there a write condition after finishing, killing the first block?
Sorry for the delay. We are digging deep into this issue, as we believe it might be hitting other users as well. Our current hypothesis is the following:
Etcher flashes images by zeroing out the partition tables, omitting the first block, writing the rest, and then coming back to the first block at the end. This is a workaround to the fact that we can’t easily get exclusive access over drives on Windows (see https://www.balena.io/blog/the-perils-of-writing-disk-images-on-windows/ for more details).
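To make the write order in that hypothesis concrete, here is a minimal sketch. It is not Etcher's actual code; the function name and the fixed 512-byte block size are assumptions made for illustration, and the target is any seekable binary file object.

```python
import io

BLOCK = 512  # assumed block size for this illustration


def flash_first_block_last(image: bytes, target, block: int = BLOCK) -> None:
    """Write `image` to a seekable binary `target`, deferring the first block."""
    head, rest = image[:block], image[block:]
    target.seek(0)
    target.write(bytes(len(head)))  # 1. zero the first block (clears the old partition table)
    target.write(rest)              # 2. write everything after the first block
    target.flush()
    target.seek(0)
    target.write(head)              # 3. come back and write the real first block
    target.flush()


# Example with an in-memory "device" that previously held other data.
device = io.BytesIO(b"\xff" * 2048)
image = bytes(range(256)) * 6       # 1536 bytes of sample data
flash_first_block_last(image, device)
```

If step 3 fails or never reaches the device, the result is exactly the reported symptom: the first block reads 0x00 while the rest of the image is correct.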
What we believe is happening is that for some reason, for your image or your CM3 module, we are failing to write the first block. We are going through potential reasons for this, but in the meantime, can you share the exact byte size of the image you are trying to flash? We believe that some of our block size arithmetic might be going off for that image in particular.
I am not 100% sure, but I think using the verify option keeps this block in good shape. We did not use verify and had a lot of damaged images. We used verify (and stopped it after 1%) and had no fails.
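Getting the exact byte size, and seeing whether it is a whole number of blocks, can be done with a few lines like these. The 512-byte block size is an assumption for the example, and the function names are made up.

```python
import os


def block_layout(image_size: int, block: int = 512) -> dict:
    """Split a byte size into full blocks plus a leftover tail."""
    full, tail = divmod(image_size, block)
    return {"full_blocks": full, "tail_bytes": tail, "aligned": tail == 0}


def image_block_layout(path: str, block: int = 512) -> dict:
    """Same as block_layout, using the exact size of the file at `path`."""
    return block_layout(os.path.getsize(path), block)
```

An image whose size is not a multiple of the block size leaves a partial last block, which is the kind of case where block arithmetic is easy to get wrong.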
Perhaps the one with verify we found had another issue.
I do not know if eject (without verify) also has an effect on the last block written. Perhaps a caching problem (last data not written before …).
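To illustrate the caching concern: a write is only guaranteed to have left the operating system's buffers once it has been flushed and synced. The sketch below shows that general pattern in Python. It says nothing about what Etcher does internally, and the function name is hypothetical.

```python
import os


def write_and_sync(path: str, data: bytes, offset: int = 0) -> None:
    """Write `data` at `offset` and push it out of the OS cache
    before returning, so a later eject cannot drop it."""
    with open(path, "r+b") as f:   # existing file or device, opened read/write
        f.seek(offset)
        f.write(data)
        f.flush()                  # empty Python's own buffer
        os.fsync(f.fileno())       # ask the OS to commit the data to the device
```

If the last write (the first block, under the hypothesis above) is still sitting in a cache when the device is ejected, it could be lost even though the tool reported success.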
Thanks for the detailed info. You have a good point about ejecting without verification and caching. We’ll try to reproduce that case ourselves and see what we find.
Is it possible to get the alternative Raspi startup code - which allows faster speeds?
The RPIBoot default is slow (in Windows environment) and yours are a lot faster. Is it possible to use this via command line? Is it provided in the installation somewhere?
Our image is finished in 3:42min by Etcher (without verify). With RPIBoot before starting Etcher it takes 12min+!
Hey there! The custom firmware we send in Etcher is open source, and you can find it here: https://github.com/balena-io-modules/node-raspberrypi-usbboot/tree/master/blobs. I believe you can point the original rpiboot tool to a custom firmware directory (check the options it supports), but we can’t guarantee this will work out of the box!