Host Bluetooth Connection Problems

I'm not sure how to verify the fix, but my bare-metal Pi is running this kernel version and does not have the problem.
Linux version 4.19.75-v7+ (dom@buildbot) (gcc version 4.9.3 (crosstool-NG crosstool-ng-1.22.0-88-g8460611)) #1270 SMP Tue Sep 24 18:45:11 BST 2019

Hi Jeff,

We checked with our devices support team to confirm whether the change mentioned by Rich is included in balenaOS, and it is.

The thread is already a bit long, and sorry if I missed something, but Bluetooth communication usually requires the container to use host network mode.
The balenaSound app has an example; check its bluetooth-audio container setup.

If you have it set already and still see the problem, I would also suggest checking the kernel logs (with dmesg) to see if they give any extra insights.
In addition to Bluetooth-related messages, they would contain any under-voltage warnings (if the device is affected by under-voltage, its connectivity will be affected).
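
As a rough illustration of that check, here is a minimal Python sketch that pulls the relevant lines out of saved dmesg output. The function name and the two search terms are assumptions made for this example ("Bluetooth" as the usual prefix on kernel Bluetooth messages, "under-voltage" for the Raspberry Pi warning); adjust them to whatever your log actually contains.

```python
import re

# Assumed search terms: kernel Bluetooth messages normally start with
# "Bluetooth:", and the Raspberry Pi warning contains "Under-voltage".
PATTERN = re.compile(r"bluetooth|under-voltage", re.IGNORECASE)


def interesting_lines(log_text):
    """Return the log lines that mention Bluetooth or under-voltage."""
    return [line for line in log_text.splitlines() if PATTERN.search(line)]
```

Save the output of dmesg to a file on the host, read it into a string, and pass that string to interesting_lines.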

Hi,

Yes, I have network_mode: host set in my docker-compose file. I am able to access BT devices, make connections, and read characteristics from inside the container; that’s not the problem. The problem is that over time (<6 hours) the container stops being able to connect to the device(s). I think the underlying problem is that one of the BT devices I am communicating with has a habit of terminating the connection during a read (the connection typically fails while reading the device characteristics). When that happens I see a file named core.### (where ### is a number) on the container file system; these are binary files. When the container gets into the repeating failure loop trying to make the connection, I see the error below in the host log. It appears the problem is with the host making the connection: it is holding onto dead resources for some reason. Has anyone seen this error on the host before? I’m attaching the dmesg dump to this reply; you will see this error all through the log file. wiab-error-balena.log (125.9 KB)

[67642.546300] kobject_add_internal failed for hci0:64 with -EEXIST, don’t try to register things with the same name in the same directory.
[67642.558832] Bluetooth: hci0: failed to register connection device
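
To get a feel for how often this error occurs and over what time span, the log can be summarized with a short script. This is only a sketch: the function name is made up for this example, and the regular expression assumes the message format shown in the two lines above.

```python
import re
from collections import Counter

# Matches lines such as:
# [67642.546300] kobject_add_internal failed for hci0:64 with -EEXIST, ...
EEXIST_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+kobject_add_internal failed for "
    r"(?P<name>\S+) with -EEXIST"
)


def summarize_eexist(log_text):
    """Count -EEXIST registration failures per kobject name.

    Returns (counts, first_ts, last_ts); both timestamps are None
    when no matching line is found.
    """
    counts = Counter()
    stamps = []
    for line in log_text.splitlines():
        match = EEXIST_RE.match(line.strip())
        if match:
            counts[match.group("name")] += 1
            stamps.append(float(match.group("ts")))
    if not stamps:
        return counts, None, None
    return counts, min(stamps), max(stamps)
```

A count that keeps growing for the same name (hci0:64 here) would be consistent with the host repeatedly trying to register a connection it never released.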

This is interesting: looking deeper into this error message I stumbled on this thread (https://github.com/raspberrypi/linux/issues/2264). Running cat /proc/cpuinfo on my host confirms that I am running Model : Raspberry Pi 3 Model B Rev 1.2. I’ll try reducing the baud rate in /usr/bin/btuart to determine whether that reduces the error. But that file is read-only on the FS; is there a way to edit it?

Any idea if there is another way to address this?

Hi,

It’s possible to edit the /usr/bin/btuart file by loop mounting your balena.img file. Once mounted, enter the resin-rootA partition, and in this case, find your file in the directory balena/overlay2/<id_number>/diff/usr/bin. Once that file is edited and saved, umount the image and re-flash your SD card with the updated image.

John

I’ve created a very simple test harness that uses the BluePy library to connect to a Silicon Labs Thunderboard, read characteristics 30 times, disconnect, and repeat. I have 2 boards, and the Python module I wrote reads one 30 times, then the next, and repeats. If the number of characteristic reads is reduced to 1, the script fails after <200 iterations.
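
For readers without the attachment, a loop of this kind might look roughly like the sketch below. It is not the actual harness: the function names, the choice to read every readable characteristic on each pass, and the default "public" address type are all assumptions for illustration. Only Peripheral, getCharacteristics, supportsRead, read, and disconnect come from BluePy itself.

```python
import time


def bluepy_connect(address):
    """Open a BLE connection with BluePy.

    The import is deferred so the rest of the sketch can be exercised
    without the library or any hardware.
    """
    from bluepy.btle import Peripheral
    return Peripheral(address)  # assumes the default "public" address type


def read_cycle(connect, address, reads=30):
    """Connect, read every readable characteristic `reads` times, disconnect.

    Returns the number of successful characteristic reads.
    """
    peripheral = connect(address)
    done = 0
    try:
        characteristics = peripheral.getCharacteristics()
        for _ in range(reads):
            for ch in characteristics:
                if ch.supportsRead():
                    ch.read()
                    done += 1
    finally:
        # Always release the link, even if a read raised.
        peripheral.disconnect()
    return done


def run(addresses, iterations, connect=bluepy_connect, reads=30, pause=0.0):
    """Alternate between boards and tally successful and failed cycles."""
    ok = failed = 0
    for _ in range(iterations):
        for address in addresses:
            try:
                read_cycle(connect, address, reads)
                ok += 1
            except Exception:  # BluePy errors derive from BTLEException
                failed += 1
            time.sleep(pause)
    return ok, failed
```

The try/finally is the relevant design choice: the client asks for a disconnect even when a read fails partway through, so any connection that still lingers afterwards is being held below the application.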

I have another system reading the same boards with the same library, the reader module is running directly on the raspberry “bare metal”, no Docker or Balena. That system has been running for several weeks with no problems. With the test harness in a failed state I observed the “bare metal” raspberry also started throwing exceptions on connection attempts. I rebooted the Balena host and observed the “bare metal” system was able to start connecting to the Thunderboards without a problem.

Based on this simplified test, it appears the Balena host is not properly terminating the Bluetooth connections. The connections seem to remain open, eventually exhausting all available connections on the Thunderboard. Rebooting the host releases all the stale connections, after which the test harness and the “bare metal” system work as expected.

I’m at a loss on what could be the problem. What is the best way to properly triage at this point?


Hi Jeff, this is something that we would like to reproduce locally. Would it be possible for you to share the test application you are using?

BluetoothTest.pdf (2.4 KB)

The attached file is actually a zip file with everything you need to create the app.

Thanks, Jeff. We will update this ticket once we have some insight.

Any ideas? Were you able to reproduce the problem with the test harness?

Hi,

We’re having our engineers look into this, but nothing yet. We hope to get back to you soon.

John

Hi, just a note that we are still investigating. Our test setup uses the application you provided, but with only one Silicon Labs board. We have verified that the application runs without problems for over 12 hours on a Raspberry Pi 4, and we are currently validating with the Raspberry Pi 3. I will update this post once we have some conclusions.

Hi again, would you mind testing with balenaOS v2.38.0+rev1? We have done some testing, and apparently the UART flow control of the Bluetooth modem works much better there with the default baud rate. We are investigating what the cause could be, but it would help if you could confirm whether there is any difference in your case.

I flashed the Thunderboards with the latest firmware the other day and have been running more tests. I continue to see the same behavior: over time the app’s ability to connect degrades until it fails completely on those devices. I have been running a similar test with a third Thunderboard that is physically sitting next to the Pi. In that test I see far fewer connection failures (presumably because of proximity), and as a result the test runs for much longer. It seems that the host OS is holding onto the dead connections and eventually exhausts resources and can’t connect again. It is not clear whether the Thunderboard is also holding the connection; the scenario did not improve with the latest firmware.

I would love to try the new OS version. I don’t see it available in my dashboard; how do I access it?

testrun.log (81.7 KB)

Hi, thanks for the updates and additional logs. Regarding the OS version, it is actually an older version that we would like to see if it also exhibits the issue. You should be able to download this from your dashboard (you will need to select show outdated versions to find v2.38.0).

Alternatively, you can download this version using the CLI via balena os download raspberrypi3 --version v2.38.0+rev1.prod -o ${IMG_FILENAME} (there are more details about that command here: https://www.balena.io/docs/reference/balena-cli/#os-download-type). If you go that route you’ll also need to configure the image with balena os configure (https://www.balena.io/docs/reference/balena-cli/#os-configure-image) and then either use Etcher or balena os initialize (https://www.balena.io/docs/reference/balena-cli/#os-initialize-image). So downloading from the dashboard is likely the easiest path.

The test is running now: 550 iterations so far, connection exceptions are within tolerance, and there has been no full failure like in the previous test.

4600 iterations this morning and still going strong. It appears OS 2.38 solved my problem!

12,600 and still running. I am terminating the test now, satisfied this is a solution.

What does this mean going forward?

Hi, I have also run a long test on a Raspberry Pi 3 Model B Plus Rev 1.3 with balenaOS 2.47.0+rev1. I stopped it after more than 48 hours and 3450 iterations without any error. So the problem appears to lie with Raspberry Pi 3 devices before Rev 1.3, possibly because of the lack of flow control lines on the Bluetooth UARTs on these models. For the time being I would advise either sticking with balenaOS v2.38 or using hardware revisions 1.3 and above. We are also tracking the problem at https://github.com/balena-os/balena-raspberrypi/issues/476 and will post any progress in this thread.
