Cellular Modem connectivity issues on boot

@shaunmulligan If I have the device in local mode when it’s starting up and wait a minute for everything to start up, the modem shows as connected. And then when I take the device out of local mode so that the application container starts, I get logs like these: https://gist.githubusercontent.com/spicybits/2f2620604b998d2fe0c33c62a8c4610b/raw/gistfile1.txt

There is a systemd-udevd event being triggered, which ModemManager picks up on, scans all ports, and drops the modem. Is that full rescan in ModemManager what you see also, but your modem doesn’t get released?

I got debug logs out of ModemManager like so:
mount -o remount,rw /
vi /etc/systemd/system/dbus-org.freedesktop.ModemManager1.service
# change ExecStart line to: ExecStart=/usr/sbin/ModemManager --log-level=debug
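
For anyone who would rather script that edit than open vi, here is a minimal sketch. The helper name enable_mm_debug is made up for illustration, and the daemon-reload and restart steps in the usage comment are assumptions that were not part of the original steps.

```shell
# Hypothetical helper: rewrite the ExecStart line of a ModemManager unit file
# so that the daemon starts with debug logging enabled.
enable_mm_debug() {
    unit="$1"
    tmp="$(mktemp)"
    # Replace whatever ExecStart line is present with the debug variant.
    sed 's|^ExecStart=.*|ExecStart=/usr/sbin/ModemManager --log-level=debug|' "$unit" > "$tmp" \
        && cat "$tmp" > "$unit"   # write through the path so a symlinked unit is preserved
    rm -f "$tmp"
}

# Usage on the host OS (assumed follow-up steps, adjust as needed):
#   mount -o remount,rw /
#   enable_mm_debug /etc/systemd/system/dbus-org.freedesktop.ModemManager1.service
#   systemctl daemon-reload && systemctl restart ModemManager
```

Only the ExecStart line is touched; every other line of the unit file is copied through unchanged.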


@liammonahan, yeah, my modem never seems to get released. This is really strange. Could you try a really simple base image, not one of the resin.io ones (we add some additional udevd stuff in the container)? Perhaps you could push a simple Dockerfile like:

FROM armhf/busybox

CMD ["echo", "hello"]

That container shouldn’t do anything with udev, or anything at all really. If it still disconnects with this, then we will know it’s an issue in the hostOS, but if this container works fine, then we will know it’s some weird interaction within the entrypoint of the resin.io base images.


@shaunmulligan If I use a simple docker base image like that, then everything works. Modem comes up and stays up.

@liammonahan that’s interesting, now we are getting somewhere. It seems that something in our base image entrypoint is causing the modem to be killed. This entry.sh is the entrypoint script for most of our base images. Perhaps @petrosagg or @nghiant2710 know what processes would be sending a signal to kill the modem?

I’m able to reproduce this and think I have it at least somewhat narrowed down.

If I use resin/raspberrypi3-node (which is an older image) as my base image I do not see this issue. The CDMA connection is disabled after a push but re-enables properly when the new image starts up. If I switch to resin/raspberry-pi3-node and do a push, the CDMA connection goes down but I get the same error you were seeing when it tries to come back up, and the modem has to be unplugged and plugged back in before the connection will resume. Both of these are with ENV INITSYSTEM on.

Interestingly, if I turn off systemd (ENV INITSYSTEM off) then I see different behavior: resin/raspberry-pi3-node without systemd never deactivates the CDMA connection for me and works for both images.

@liammonahan, can you try this using resin/raspberrypi3-node (or resin/raspberrypi3-python) as a base image and see if the connection resumes after a push or container restart?

Thanks for all the assistance tracking this. Using the alternate base image (resin/raspberrypi3-python) I have no issues when the container restarts. I am having trouble with it surviving a reboot, though.

Good info, thanks! We’re digging in here to see what in the raspberry-pi3-* images is different and might be causing this.

Hello!
I’m seeing this same issue on an Intel Nuc version when a device is started up. Running “mmcli --scan-modems” will fix the issue.

edit*: I’m using the resin/intel-nuc-debian base image

Can you give us some info on what modem you are using?

A Sierra Wireless MC7354. I have two devices. One with the modem in qmi mode and the other in mbim.

  • The modem is disconnected when I start the device, update the app, or restart the app.

@mccollam can you please try the above solution to see if it works for us?

@mccollam Have you heard anything from the folks that might have the best working knowledge of changes to the entrypoint’s handling of udev in the most recent base images?

@liammonahan I was just looking back into this last night. The mmcli command that @annymsMthd suggested didn’t seem to work reliably for me. But @nghiant2710 has been trying out various changes to the base images to narrow down what is causing this. It appears to be an interaction with systemd and either dbus or possibly udev on these images.

They’re actively testing some things right now so I will keep you updated as there’s more information. This is a tricky one as it seems to manifest only under specific conditions (specific hardware and images) which you’ve managed to hit dead on. Not as lucky as winning the lottery though :confused:

@mccollam haha, well, I’ll keep buying my lottery tickets too since I seem to be lucky. Thanks for the update.

We have narrowed down what appears to be causing this issue. Apparently systemd is generating a config for a legacy sysvinit networking service. When the container shuts down, so does this service – and it doesn’t seem to leave all cellular modems in a good state when it does.

@petrosagg is working on a full fix for this, but in the meantime it seems that simply getting rid of that legacy service will allow the modem to come back online after a container update. To do this, just add the following somewhere before the final CMD line in your Dockerfile:

RUN rm /etc/rcS.d/S11networking

This should at least keep things humming along until we have a comprehensive fix.


Looks like this fix isn’t working with the intel nuc and the resin/intel-nuc-debian image

@liammonahan and @annymsMthd, just to update the two of you on this issue: we have traced it down to this line (https://github.com/resin-io-library/base-images/blob/master/debian/armv7hf/jessie/entry.sh#L8) in our base images. The current workaround is to either completely override the entrypoint or, better yet, copy this entry.sh into your project with the above-mentioned line removed and add COPY entry.sh /usr/bin/entry.sh to your Dockerfile.

It seems the udevadm trigger command is disconnecting the PPP interfaces when it replays the events at container startup, and some modems don’t handle the reconnect very well at all. We are looking for a nice way around this, but for now this is the recommended way forward.
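
As a rough sketch of the "copy entry.sh with that line removed" step, something like the following could be run on a development machine before the push. It assumes the offending line can be identified by the text "udevadm trigger"; the helper name strip_udev_trigger and the file names in the usage comment are hypothetical.

```shell
# Hypothetical helper: copy an entrypoint script while dropping every line
# that contains "udevadm trigger", and make the copy executable.
strip_udev_trigger() {
    src="$1"
    dst="$2"
    # sed deletes the matching lines and passes everything else through.
    sed '/udevadm trigger/d' "$src" > "$dst"
    chmod +x "$dst"
}

# Usage (file names are placeholders):
#   strip_udev_trigger entry.sh.orig entry.sh
# and then, in the Dockerfile, as described above:
#   COPY entry.sh /usr/bin/entry.sh
```

It is worth diffing the result against the original to confirm that only the intended line was dropped.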


That, however, does not seem to fix the problem for resin/intel-nuc-debian. I have copied the entry.sh file and removed the mentioned line, and still, after an update or reboot, the modem is not detected or connected.

When I log in to the host OS and run mmcli --scan-modems, as mentioned by @annymsMthd, the modem is detected and works fine.

journalctl -u NetworkManager
after reboot + running mmcli --scan-modems

Is there a way to keep the modem discovery going, and ask it to keep reconnecting? It is a worrying problem to have a device in production stop responding.

I am using the resin/intel-nuc-debian:latest image.
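
One stopgap along these lines would be to wrap the manual rescan in a small check that can be run periodically on the host. This is only a sketch: the function name rescan_if_no_modem is made up, it assumes that mmcli -L prints a D-Bus path containing "/Modem/" for each detected modem, and the rescan was reported earlier in this thread as not reliable on every setup.

```shell
# Hypothetical helper: ask ModemManager to rescan only when no modem is listed.
rescan_if_no_modem() {
    # Assumption: each detected modem shows up in `mmcli -L` with a path
    # such as /org/freedesktop/ModemManager1/Modem/0.
    if mmcli -L 2>/dev/null | grep -q '/Modem/'; then
        return 0
    fi
    mmcli --scan-modems
}

# Usage: poll once a minute (the interval is an arbitrary choice):
#   while true; do rescan_if_no_modem; sleep 60; done
```

This works around the symptom only; it does not address whatever drops the modem in the first place.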

I have the same problem as llap. Does anyone have a working solution?

Hi guys,
could you add

autoconnect-retries=0

to the

[connection]

section of your NetworkManager modem config file?
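
For reference, NetworkManager treats autoconnect-retries=0 as "keep retrying forever" rather than giving up after the default number of attempts. A minimal sketch of applying the change with a script is below; the helper name set_autoconnect_retries is hypothetical, and the path of the modem connection file is left as an argument because it depends on your setup.

```shell
# Hypothetical helper: make sure a NetworkManager keyfile has
# autoconnect-retries=0 in its [connection] section.
set_autoconnect_retries() {
    file="$1"
    tmp="$(mktemp)"
    if grep -q '^autoconnect-retries=' "$file"; then
        # Key already present: overwrite its value.
        sed 's/^autoconnect-retries=.*/autoconnect-retries=0/' "$file" > "$tmp"
    else
        # Key absent: add it directly below the [connection] header.
        awk '{ print } /^\[connection\]$/ { print "autoconnect-retries=0" }' "$file" > "$tmp"
    fi
    cat "$tmp" > "$file"   # keep the original file's ownership and permissions
    rm -f "$tmp"
}

# Usage (the path is a placeholder for your modem connection file):
#   set_autoconnect_retries /path/to/your-modem-connection
```

After the edit, the top of the file would start with the [connection] header followed by autoconnect-retries=0, with the existing keys unchanged below it.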