Cellular Modem connectivity issues on boot

@imrehg We’ve got some different modems we ordered that should get activated in the next couple of days to test out. We are not wedded to this modem, but it seems robust and has worked for us with NetworkManager in the past.

Unfortunately, I didn’t have any luck using INITSYSTEM=off. The main difference this time is that pppd exits much more quickly (0.1 minutes instead of 1.5?), and the following log messages appear at the very end of the log. A TERM signal is still being sent to MM/NM. Hmmm…

ModemManager[616]: (ttyUSB0): port attributes not fully set
ModemManager[616]: Couldn’t create modem for device at ‘/sys/devices/platform/soc/3f980000.usb/usb1/1-1/1-1.2/1-1.2.4’: Failed to find primary AT port

Log with INITSYSTEM=off: https://gist.githubusercontent.com/spicybits/4e58292d998cc3c6fc73f5126b18dc16/raw/ae285771d8d053bf314c8f1c12bf1d1ac924dc03/log

The next thing I’m going to try is to see if I can confirm what’s sending a TERM signal through strace and auditd.
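
For the strace side of that, here is a minimal sketch. It assumes a reasonably recent strace that prints the siginfo of each delivered signal, including si_pid (the sender's PID). The helper name sender_pid_from_strace is hypothetical, and PID 616 in the usage comment is just the ModemManager PID from the log lines above.

```shell
# Hypothetical helper: pull the sender PID out of strace's signal-delivery lines.
# Assumes strace prints lines of the form:
#   --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=1234, si_uid=0} ---
sender_pid_from_strace() {
    sed -n 's/.*--- SIGTERM {.*si_pid=\([0-9][0-9]*\).*/\1/p'
}

# Usage (as root; strace writes its trace to stderr, hence 2>&1):
#   strace -f -e trace=signal -p 616 2>&1 | sender_pid_from_strace
```

A result of 1 would point at the init system as the sender; any other PID can be looked up in the process list while the sender is still running.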

Any luck w/ different CDMA modems @mccollam? Any issues/gotchas you ran into along the way? Thanks.

@shaunmulligan If I have the device in local mode when it’s starting up and wait a minute for everything to start up, the modem shows as connected. And then when I take the device out of local mode so that the application container starts, I get logs like these: https://gist.githubusercontent.com/spicybits/2f2620604b998d2fe0c33c62a8c4610b/raw/gistfile1.txt

There is a systemd-udevd event being triggered, which ModemManager picks up on, scans all ports, and drops the modem. Is that full rescan in ModemManager what you see also, but your modem doesn’t get released?

I got debug logs out of ModemManager like so:
mount -o remount,rw /
vi /etc/systemd/system/dbus-org.freedesktop.ModemManager1.service
# change ExecStart line to: ExecStart=/usr/sbin/ModemManager --log-level=debug
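
The same ExecStart change can probably be scripted instead of done by hand in vi. This is only a sketch: the function name enable_mm_debug is hypothetical, and it assumes the unit file has a single ExecStart line that mentions ModemManager.

```shell
# Hypothetical helper: point a unit file's ExecStart at ModemManager with debug logging.
# Takes the unit file path as an argument so it can be tried on a copy first.
enable_mm_debug() {
    unit_file="$1"
    tmp="$(mktemp)" || return 1
    # Rewrite into a temp file, then write back with cat so that a symlinked
    # unit file keeps pointing at its target instead of being replaced.
    sed 's|^ExecStart=.*ModemManager.*|ExecStart=/usr/sbin/ModemManager --log-level=debug|' \
        "$unit_file" > "$tmp" && cat "$tmp" > "$unit_file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage on the device (after remounting / read-write):
#   enable_mm_debug /etc/systemd/system/dbus-org.freedesktop.ModemManager1.service
#   systemctl daemon-reload && systemctl restart ModemManager
```

The daemon-reload and restart in the usage comment are an assumption about how to get systemd to pick up the edited unit; a reboot should have the same effect.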

@liammonahan, yeah, my modem never seems to get released. This is really strange. Could you try a really simple base image, not one of the resin.io ones (we add some additional udevd stuff in the container)? Perhaps you could push a simple Dockerfile like:

FROM armhf/busybox

CMD ["echo", "hello"]

That container shouldn’t do anything with udev, or anything at all really. If it still disconnects with this, then we will know it’s an issue in the hostOS, but if this container works fine, then we will know it’s some weird interaction within the entry point of the resin.io base images.

@shaunmulligan If I use a simple docker base image like that, then everything works. Modem comes up and stays up.

@liammonahan that’s interesting, now we are getting somewhere. It seems like something in our base image entrypoint is causing the modem to be killed. This entry.sh is the entrypoint script for most of our base images. Perhaps @petrosagg or @nghiant2710 know what processes would be sending a signal to kill the modem?

I’m able to reproduce this and think I have it at least somewhat narrowed down.

If I use resin/raspberrypi3-node (which is an older image) as my base image I do not see this issue. The CDMA connection is disabled after a push but re-enables properly when the new image starts up. If I switch to resin/raspberry-pi3-node and do a push, the CDMA connection goes down but I get the same error you were seeing when it tries to come back up, and the modem has to be unplugged and plugged back in before the connection will resume. Both of these are with ENV INITSYSTEM on.

Interestingly, if I turn off systemd (ENV INITSYSTEM off) then I see different behavior: without systemd, the CDMA connection is never deactivated for me, and this holds for both images.

@liammonahan, can you try this using resin/raspberrypi3-node (or resin/raspberrypi3-python) as a base image and see if the connection resumes after a push or container restart?

Thanks for all the assistance tracking this. Using the alternate base image (resin/raspberrypi3-python) I have no issues when the container restarts. I am having trouble with it surviving a reboot, though.

Good info, thanks! We’re digging in here to see what in the raspberry-pi3-* images is different and might be causing this.

Hello!
I’m seeing this same issue on an Intel Nuc version when a device is started up. Running “mmcli --scan-modems” will fix the issue.

edit*: I’m using the resin/intel-nuc-debian base image
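
For anyone who wants to script that rescan as a stopgap, a rough sketch follows. The function name rescan_if_no_modem is hypothetical, and the check assumes that `mmcli -L` lists each detected modem as a D-Bus path containing "/Modem/".

```shell
# Sketch: ask ModemManager to rescan only when it currently lists no modem.
# Assumes mmcli is on PATH and that listed modems show up in `mmcli -L`
# as paths like /org/freedesktop/ModemManager1/Modem/0.
rescan_if_no_modem() {
    if mmcli -L 2>/dev/null | grep -q '/Modem/'; then
        echo "modem present"
    else
        echo "no modem listed, requesting rescan"
        mmcli --scan-modems
    fi
}

# Usage, e.g. from the container's start script:
#   rescan_if_no_modem
```

This only automates the manual command; it does nothing about whatever causes the modem to drop in the first place.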

Can you give us some info on what modem you are using?

A Sierra Wireless MC7354. I have two devices: one with the modem in QMI mode and the other in MBIM mode.

  • The modem is disconnected when I start the device, update the app, or restart the app.

@mccollam can you please try the above solution to see if it works for us?

@mccollam Have you heard anything from the folks that might have the best working knowledge of changes to the entrypoint’s handling of udev in the most recent base images?

@liammonahan I was just looking back into this last night. The mmcli command that @annymsMthd suggested didn’t seem to work reliably for me. But @nghiant2710 has been trying out various changes to the base images to narrow down what is causing this. It appears to be an interaction between systemd and either dbus or possibly udev on these images.

They’re actively testing some things right now so I will keep you updated as there’s more information. This is a tricky one as it seems to manifest only under specific conditions (specific hardware and images) which you’ve managed to hit dead on. Not as lucky as winning the lottery though :confused:

@mccollam haha, well, I’ll keep buying my lottery tickets too since I seem to be lucky. Thanks for the update.

We have narrowed down what appears to be causing this issue. Apparently systemd is generating a config for a legacy sysvinit networking service. When the container shuts down, so does this service – and it doesn’t seem to leave all cellular modems in a good state when it does.

@petrosagg is working on a full fix for this, but in the meantime it seems that simply getting rid of that legacy service will allow the modem to come back online after a container update. To do this, just add the following somewhere before the final CMD line in your Dockerfile:

RUN rm /etc/rcS.d/S11networking

This should at least keep things humming along until we have a comprehensive fix.
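
One caveat with a bare rm: if a base image does not ship that link, rm exits non-zero and the build step fails. Below is a minimal sketch of a more tolerant variant, assuming the path from the RUN line above. The function name remove_legacy_networking is hypothetical; in a Dockerfile, the one-line equivalent would be RUN rm -f /etc/rcS.d/S11networking.

```shell
# Hypothetical helper: remove the legacy sysvinit networking link if it exists,
# and succeed quietly if it does not.
# The directory is a parameter (default /etc/rcS.d) so it can be tried on a copy.
remove_legacy_networking() {
    rc_dir="${1:-/etc/rcS.d}"
    link="$rc_dir/S11networking"
    if [ -e "$link" ] || [ -L "$link" ]; then
        rm -f "$link" && echo "removed $link"
    else
        echo "nothing to remove at $link"
    fi
}

# Usage inside the image build or an entry script:
#   remove_legacy_networking
```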

Looks like this fix isn’t working with the Intel NUC and the resin/intel-nuc-debian image.