I want to remotely convert my BeagleBone Black device with a faulty eMMC running Debian to BalenaOS. I created a directory named ‘migrate’ containing a raw image, config.json, nep.sh (for killing unnecessary processes), and takeover.
nep.sh content:
`#!/bin/bash
# Get the process IDs of essential services
ESSENTIAL_PIDS=$(pgrep -f 'sshd|bash|sh')
# Kill all processes except the essential ones (NR > 1 skips the ps header line)
ps aux | grep -v -E 'sshd|bash|sh' | awk 'NR > 1 {print $2}' | while read -r pid; do
    if [[ ! "$ESSENTIAL_PIDS" =~ "$pid" ]]; then
        echo "Killing process ID $pid"
        kill -9 "$pid"
    fi
done
`
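A note on the selection logic in nep.sh: the pattern 'sshd|bash|sh' matches any command line that merely contains "sh", and the `=~ "$pid"` test is a substring match, so it is hard to predict which processes survive. Below is a minimal sketch of a stricter selection that compares the command name exactly and only prints candidate PIDs. The function name `select_killable` and the sample process list are made up for illustration; this is not part of takeover.

```shell
#!/bin/sh
# Hypothetical helper: read "PID COMMAND" lines on stdin and print the PIDs
# whose command name is NOT exactly one of the essential names.
select_killable() {
    awk '
        $2 == "sshd" || $2 == "bash" || $2 == "sh" { next }  # exact name match only
        $1 == 1 { next }                                      # never touch PID 1 (init)
        { print $1 }
    '
}

# Dry run on made-up sample data; nothing is killed here.
printf '%s\n' '1 systemd' '210 sshd' '305 apache2' '411 bash' '512 sshguard' |
    select_killable
```

On a device, the input could come from `ps -eo pid=,comm=`, and the printed list can be reviewed before anything is killed. Stopping individual services with systemctl, as done later in this thread, is the gentler alternative.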
First, I ran nep.sh. Then I selected the SD card as the device the image would be written to and booted from (sudo nano /boot/uEnv.txt). After that, I ran takeover. The system rebooted, but the BeagleBone did not boot from the SD card. Moreover, the system on the SD card was corrupted (it did not boot even after a restart). A BeagleBone Black with a working eMMC migrated to balenaOS successfully, but the attempt with the SD card failed. When I checked the SD card’s contents, I found that only 1 KB balena files had been created. One question came to mind: is power to the SD card momentarily cut during reboot, interrupting the write? What is the solution?
Hi, good to hear the migration worked for the eMMC case. I have not tried a migration based on the SD card. It would be useful if you could share the logs from the failed run. Have you compared them to the successful run to see where the SD card case went off the rails?
Also, see the other recent thread on a BBB conversion if you haven’t run across it already.
Hi kb2ma, and thanks for your time and interest.
Stage 1 completed successfully. Here are the logs from the failed run in stage 2:
2024-09-03 12:28:21 INFO [takeover::stage2] Setup device '/dev/mmcblk0' with offset 4194304, sizelimit 41943040 on '/dev/loop0'
2024-09-03 12:28:21 ERROR [takeover::stage2] Failed to transfer files to balena OS, error: Error { kind: Upstream, cause: Some(EINVAL), context: Some("Failed to mount /dev/loop0 on /mnt/balena-part") }
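To make the numbers in that log easier to check against a partition table, here is a small sketch that converts the offset and sizelimit to MiB and to sectors. The 512-byte sector size is an assumption, and the helper names `to_mib` and `to_sectors` are made up for illustration.

```shell
#!/bin/sh
# Values copied from the stage2 log line above.
offset=4194304
sizelimit=41943040

# Convert bytes to MiB and to sectors (512-byte sectors are an assumption).
to_mib()     { echo $(( $1 / 1048576 )); }
to_sectors() { echo $(( $1 / 512 )); }

echo "partition starts at $(to_mib "$offset") MiB (sector $(to_sectors "$offset"))"
echo "partition spans $(to_mib "$sizelimit") MiB ($(to_sectors "$sizelimit") sectors)"
```

So the loop device covers a 40 MiB region starting 4 MiB into /dev/mmcblk0. mount(2) lists an invalid superblock among the causes of EINVAL, so one possible reading is that no valid filesystem was present at that location when stage 2 tried to mount it. That is a guess from the log alone, not a confirmed diagnosis.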
Hi, thanks for those logs. I got takeover to work, migrating from Debian 9.5, as explained below. Apologies for the delay, and let us know how it works for you.
1. Boot Debian from SD card
I wanted to emulate your setup that does not have a functional eMMC. So I set up Debian to boot from SD in two steps.
Modify /boot/uEnv.txt in the Debian image (before flashing to the device) to not automatically flash to eMMC. I had to comment out the last line in the file, like this:
After booting the device, mark the eMMC as not bootable. I used sudo fdisk /dev/mmcblk1, and then the ‘a’ command.
Those two steps worked fine. I could boot the device from the SD card without having to press the boot switch at power-up.
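For anyone repeating the first step, here is a minimal sketch of commenting out the last line of a file without editing it by hand. The helper name `comment_last_line` and the sample file contents are placeholders; check what the last line of your own /boot/uEnv.txt actually is before relying on this.

```shell
#!/bin/sh
# Hypothetical helper: print FILE with its last line commented out.
# It writes to stdout, so the original file is left untouched.
comment_last_line() {
    sed '$ s/^/#/' "$1"
}

# Demo on a throwaway file; the contents are placeholders, not a real uEnv.txt.
demo=$(mktemp)
printf '%s\n' 'first_setting=example' 'last_setting=example' > "$demo"
comment_last_line "$demo"
rm -f "$demo"
```

Writing to stdout makes it possible to review the result first and only then copy it over the file in the mounted image.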
2. Run takeover with the flash-to option
I used the command below to run takeover. The -f is the short form of the flash-to option. I did not use a script like your nep.sh, I just manually stopped Apache2 with sudo systemctl stop apache2.
The end result is that the device provisioned as expected. It took around 6 minutes after starting the command for the device to show up in the dashboard.
Thank you for your response. In the meantime, I’ve conducted various experiments and encountered the following: I believe that the takeover process actually requires two different storage devices. Here’s why I think so:
When I plug both a flash drive (sda1) and an SD card into a BeagleBone Black with a faulty eMMC (so there are two different storage devices, with the flash drive used for the logs) and then run the command ./takeover -c config.json -i beaglebone-black-5.3.4+rev3-v16.1.10.img.gz --no-nwmgr-check --no-wifis --log-file /root/migrate-open/stage1.log --log-level trace -l /dev/sda1 --s2-log-level trace, it works correctly and the migration is successful.
However, when I don’t plug in a flash drive (so the SD card is the only storage) and remove the logging options, using the command ./takeover -c config.json -i beaglebone-black-5.3.4+rev3-v16.1.10.img.gz --no-nwmgr-check --no-wifis, the migration fails.
Similarly, with a BeagleBone Black that has a functioning eMMC, the migration fails without an SD card or flash drive for logging (that is, with only mmcblk1 for storage). However, when I plug in external storage for logs, the migration succeeds.
I can test this with the BeagleBones I have on hand, but for the ones that are remote and to which I have no physical access, I need to perform the takeover remotely.
Could you try a takeover process without logging? For example: ./takeover -c config.json -i beaglebone-black-5.3.4+rev3-v16.1.10.img.gz --no-nwmgr-check --no-wifis (By the way, I believe logging is optional, but if I’m wrong, please let me know).
At the moment, I’m looking into the takeover source code and trying to understand how logging works. If I find an issue, I will try to build a modified version using a cross-compiler and cargo.
Good to see you’re continuing to experiment. I have not previously heard of a problem with running takeover without logging. AIUI logging is optional.
I started with the same setup as step 1 in my last report, where the device runs Debian from the SD card. Then I ran takeover with the command below, and it succeeded. The device appeared on the dashboard after 4 1/2 minutes.
I set log-level to ‘debug’ to see more information from the stage 1 log on the terminal. As you can see I also still use the flash-to option. I recommend always using that option on a Beaglebone if the goal is to run from the SD card.
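A minimal pre-flight sketch for runs that use the flash-to option: it checks that the target is a block device and that the config and image are readable, then prints a takeover command for review. This is not the exact command used above. The flags are the ones quoted elsewhere in this thread, the function name `preflight` is made up, and the file names are placeholders.

```shell
#!/bin/sh
# Hypothetical pre-flight helper: print the takeover command only if the
# inputs look sane. It never runs takeover itself.
preflight() {
    target=$1 config=$2 image=$3
    [ -b "$target" ] || { echo "not a block device: $target" >&2; return 1; }
    [ -r "$config" ] || { echo "cannot read config: $config" >&2; return 1; }
    [ -r "$image" ]  || { echo "cannot read image: $image" >&2; return 1; }
    echo ./takeover -f "$target" -c "$config" -i "$image" \
        --no-nwmgr-check --no-wifis --log-level debug
}

# Example call; on a machine without this device it reports the problem instead.
preflight /dev/mmcblk0 config.json image.img.gz || echo "pre-flight failed"
```

Printing rather than executing keeps the check safe to run over SSH on a remote device before committing to the migration.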
Hi,
I followed your advice, but the issue persists. I modified /boot/uEnv.txt first. Then I ran fdisk, which did not work:
~ $ sudo fdisk /dev/mmcblk1
Welcome to fdisk (util-linux 2.29.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
fdisk: cannot open /dev/mmcblk1: No such file or directory
As lsblk shows, mmcblk1 is not present:
~ $ lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
mmcblk0     179:0    0 29.7G  0 disk
`-mmcblk0p1 179:1    0  7.5G  0 part /
Then I tried the --flash-to option (./takeover -f /dev/mmcblk0 -c first-fleet.config.json -i balena-cloud-first-fleet-beaglebone-black-5.3.4+rev3-v16.1.10.img.gz --no-nwmgr-check --no-wifis --log-level debug). There was no issue with stage1, as the attached log shows. stage1.log (9.9 KB)
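Since device names differ between boards with and without a working eMMC, it can help to derive the disk that holds the root filesystem instead of typing it. Here is a minimal sketch that parses `lsblk -rno NAME,TYPE,MOUNTPOINT` output. The function name `root_disk` is made up, and it assumes lsblk lists each partition directly after its parent disk.

```shell
#!/bin/sh
# Hypothetical helper: given `lsblk -rno NAME,TYPE,MOUNTPOINT` output on stdin,
# print the disk that holds the root filesystem.
root_disk() {
    awk '
        $2 == "disk" { disk = $1 }                              # remember the most recent disk
        $2 == "part" && $3 == "/" { print "/dev/" disk; exit }  # root partition found
    '
}

# Sample input mirroring the lsblk output above.
printf '%s\n' 'mmcblk0 disk' 'mmcblk0p1 part /' | root_disk
```

For the sample input this prints /dev/mmcblk0, the same device passed to -f in the command above. On a device, the input would come from `lsblk -rno NAME,TYPE,MOUNTPOINT | root_disk`.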
I then checked the output on the serial port and ran into another issue: the BeagleBone did not boot. Here is the output:
[ 0.113346] l3-aon-clkctrl:0000:0: failed to disable
[ 2.523247] debugfs: Directory '49000000.dma' with parent 'dmaengine' already present!
[ 2.789837] omap_voltage_late_init: Voltage driver support not added
Starting version 250.5+
[ 13.967826] systemd[710]: /lib/systemd/system-generators/systemd-gpt-auto-generator failed with exit status 1.
[ 28.667318] davinci-mcasp 48038000.mcasp: IRQ common not found
I assume that your last message is based on a device with a non-functional eMMC. If so, it makes sense that fdisk can’t access it to deactivate the bootable flag.
On my Beaglebone migrated to balenaOS, here is what I see via serial when booting:
Trying to boot from MMC1
Loading Environment from EXT4...
** Unable to use mmc 0:1 for loading the env **
--- many blank lines ---
[ 0.113412] l3-aon-clkctrl:0000:0: failed to disable
[ 2.517021] debugfs: Directory '49000000.dma' with parent 'dmaengine' already present!
[ 2.774302] omap_voltage_late_init: Voltage driver support not added
Starting version 250.5+
[ 20.222574] systemd[831]: /lib/systemd/system-generators/systemd-gpt-auto-generator failed with exit status 1.
[ 35.226105] davinci-mcasp 48038000.mcasp: IRQ common not found
balenaOS 5.3.4+rev3 localhost ttyS0
localhost login: root
Last login: Fri Sep 13 21:35:23 +0000 2024 on pts/1 from 192.168.1.100.
root@e898fb4:~#
So my output looks similar to yours, except you don’t get the boot prompt. Is your login attempt on the device after the migration?
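To compare two serial captures like these mechanically, the kernel timestamps and systemd PIDs need to be stripped first, since they differ on every boot. Here is a minimal sketch; the function name `normalize` is made up, and the sample lines are copied from the logs in this thread.

```shell
#!/bin/sh
# Hypothetical helper: strip kernel timestamps and systemd PIDs so two serial
# captures can be compared line by line.
normalize() {
    sed -e 's/^\[ *[0-9]*\.[0-9]*] //' -e 's/systemd\[[0-9]*]/systemd[N]/'
}

a=$(mktemp); b=$(mktemp)
# Failed boot (first capture) and successful boot (second capture).
printf '%s\n' '[ 0.113346] l3-aon-clkctrl:0000:0: failed to disable' \
              '[ 28.667318] davinci-mcasp 48038000.mcasp: IRQ common not found' |
    normalize > "$a"
printf '%s\n' '[ 0.113412] l3-aon-clkctrl:0000:0: failed to disable' \
              '[ 35.226105] davinci-mcasp 48038000.mcasp: IRQ common not found' \
              'balenaOS 5.3.4+rev3 localhost ttyS0' |
    normalize > "$b"
diff "$a" "$b" || true   # diff exits non-zero when the files differ
rm -f "$a" "$b"
```

With these samples, the only difference diff reports is the balenaOS login banner, which matches the observation that the failed boot never reaches the prompt.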
I agree your stage1 log looks OK. I can’t provide any more guidance. My only guess is that somehow the dysfunctional eMMC is disrupting the takeover or boot.
One other idea to try: if you have success only when specifying a stage2 log target, try using -l /dev/null. Hopefully the target won’t matter.