Cellular Modem connectivity issues on boot


#11

I made a small config change in the meantime because it looked like the primary DNS server being assigned through ppp is unresponsive.

Started using something like this:

[ipv4]
method=auto
dns=8.8.8.8;
ignore-auto-dns=true

so that the bogus DNS nameservers returned by ppp-manager will not be used by NetworkManager. That, however, did not help. It definitely looks like openvpn cannot resolve vpn.resin.io.

Would there be much difference if I used a resinos.io image instead? Are they fulfilling similar roles now that -dev images are available through the dashboard?


#12

More datapoints: If I provision a brand new device (with ethernet and modem connected) and put it into local mode before it has a chance to download the application (it’s at the factory build) it will list the modem in nmcli indefinitely. As soon as I go and disable local mode and the application downloads and starts, the modem connection disappears from nmcli. There are no logs about what’s happening in journalctl.
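For anyone wanting to pin down the exact moment the modem disappears, one option is to poll nmcli and log only the modem rows. This is a rough sketch, not something from the thread: it assumes the modem shows up with device type gsm or cdma in nmcli's terse output, and the helper name modem_rows is made up for illustration.

```shell
# Filter `nmcli -t -f DEVICE,TYPE,STATE device` output (read from stdin)
# down to modem rows, printed as "<device> <state>".
# Assumption: modems are reported with TYPE gsm or cdma.
modem_rows() {
    awk -F: '$2 == "gsm" || $2 == "cdma" { print $1 " " $3 }'
}

# Example polling loop (commented out; run on the host OS):
# while true; do
#     echo "$(date +%T) $(nmcli -t -f DEVICE,TYPE,STATE device | modem_rows)"
#     sleep 5
# done
```

An empty result after the timestamp would mark the point where the modem is no longer listed.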


#13

A note on resinos.io images: those are currently behind in versions, and the 2.0 images you get from the dashboard are newer. Once the final (non-beta) release is out on resin.io, we will update the open source resinos.io images as well. For now, though, I would not use them, since a lot has changed since then. The two do fill similar roles, though, which is one reason rdt is being rolled into resin-cli (and the resin local ... commands).

Thanks for the additional info, we are trying to narrow down what might be happening. It’s very strange, because there are a large number of cellular modems that work totally fine…


#14

By the way, would you consider using another modem, or does this one have specific features that you need?


#15

@liammonahan if the connection stays up indefinitely without ever running the container, then I suspect there is some kind of interaction with a service running in the container. Do you have INITSYSTEM=on set in all the projects you are testing? I am wondering if the systemd in the container is somehow killing the ModemManager service. I’ll try to do some testing on this today with some of our latest base images, because I wonder if a recent change has introduced this behaviour.


#16

Hey Liam, I spent some time trying to reproduce this today, but haven’t been able to get the same behaviour with any of the modems I have here. It’s pretty strange. I am wondering if perhaps it’s something to do with CDMA modems: I am based in Europe, so I have only ever tested GSM modems. Perhaps @mccollam has had some success with his CDMA-based modems.

It’s also interesting to see these two lines:

Mar 06 12:08:13 e5882e6 pppd[856]: Terminating on signal 15
Mar 06 12:08:13 e5882e6 NetworkManager[723]: Terminating on signal 15

These suggest that something is sending SIGTERM to pppd and NM. I suspect it is the systemd process in your container as it shuts down during an application stop while updating. So perhaps setting INITSYSTEM=off will stop the issue from happening?
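
To see every process that logged a termination signal, and not just these two, the journal can be filtered for that message. A minimal sketch, assuming the default journalctl line format shown in the log lines in this post; the helper name term_signals is hypothetical.

```shell
# Read journal lines on stdin and print "<time> <process> <signal>" for each
# "Terminating on signal N" message.
# Assumes the default short format: "Mon DD HH:MM:SS host proc[pid]: message".
term_signals() {
    awk '/Terminating on signal/ {
        proc = $5
        sub(/\[.*/, "", proc)   # strip "[pid]:" from the process field
        print $3, proc, $NF     # $NF is the signal number (15 is SIGTERM)
    }'
}

# Example (commented out; run on the host OS):
# journalctl --no-pager | term_signals
```

This only shows which processes received a signal and when; it does not identify the sender.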


#17

@imrehg We’ve got some different modems we ordered that should get activated in the next couple of days to test out. We are not wedded to this modem, but it seems robust and has worked for us with NetworkManager in the past.


#18

I didn’t have any luck unfortunately using INITSYSTEM=off. The main difference this time is pppd exiting much more quickly (0.1 minutes instead of 1.5?) and the following log messages at the very end of the log. TERM signal is still being sent to MM/NM. Hmmm…

ModemManager[616]: (ttyUSB0): port attributes not fully set
ModemManager[616]: Couldn’t create modem for device at ‘/sys/devices/platform/soc/3f980000.usb/usb1/1-1/1-1.2/1-1.2.4’: Failed to find primary AT port

Log with INITSYSTEM=off: https://gist.githubusercontent.com/spicybits/4e58292d998cc3c6fc73f5126b18dc16/raw/ae285771d8d053bf314c8f1c12bf1d1ac924dc03/log


#19

The next thing I’m going to try is to see if I can confirm what’s sending a TERM signal through strace and auditd.


#20

Any luck w/ different CDMA modems @mccollam? Any issues/gotchas you ran into along the way? Thanks.


#21

@shaunmulligan If I have the device in local mode when it’s starting up and wait a minute for everything to start up, the modem shows as connected. And then when I take the device out of local mode so that the application container starts, I get logs like these: https://gist.githubusercontent.com/spicybits/2f2620604b998d2fe0c33c62a8c4610b/raw/gistfile1.txt

There is a systemd-udevd event being triggered, which ModemManager picks up on, scans all ports, and drops the modem. Is that full rescan in ModemManager what you see also, but your modem doesn’t get released?

I got debug logs out of ModemManager like so:
mount -o remount,rw /
vi /etc/systemd/system/dbus-org.freedesktop.ModemManager1.service
# change ExecStart line to: ExecStart=/usr/sbin/ModemManager --log-level=debug
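
The manual edit can also be scripted, which helps when repeating it on several devices. This is only a sketch under the same assumptions as the steps in this post (same unit file path and ModemManager binary path); the function name enable_mm_debug is made up, and the restart commands in the comments are my assumption about what is needed for the change to take effect.

```shell
# Rewrite the ExecStart line of a ModemManager unit file so it logs at debug level.
# $1 is the unit file path; it defaults to the path used in the post.
# The result is written back with `cat >` so a symlinked unit file stays a symlink.
enable_mm_debug() {
    unit="${1:-/etc/systemd/system/dbus-org.freedesktop.ModemManager1.service}"
    tmp=$(mktemp) || return 1
    sed 's|^ExecStart=.*|ExecStart=/usr/sbin/ModemManager --log-level=debug|' "$unit" > "$tmp" &&
        cat "$tmp" > "$unit"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# On the device (commented out; needs root):
# mount -o remount,rw /
# enable_mm_debug
# systemctl daemon-reload
# systemctl restart ModemManager
```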


#22

@liammonahan, yeah, my modem never seems to get released. This is really strange. Could you try a really simple base image, not one of the resin.io ones (we add some additional udevd stuff in the container)? Perhaps try pushing a simple Dockerfile like:

FROM armhf/busybox

CMD ["echo", "hello"]

That container shouldn’t do anything with udev, or anything at all really. If it still disconnects with this, then we will know it’s an issue in the hostOS, but if this container works fine, then we will know it’s some weird interaction within the entry point of the resin.io base images.


#23

@shaunmulligan If I use a simple docker base image like that, then everything works. Modem comes up and stays up.


#24

@liammonahan that’s interesting, now we are getting somewhere. It seems like something in our base image entrypoint is causing the modem to be killed. This entry.sh is the entrypoint script for most of our base images. Perhaps @petrosagg or @nghiant2710 know which processes would be sending a signal to kill the modem?


#25

I’m able to reproduce this and think I have it at least somewhat narrowed down.

If I use resin/raspberrypi3-node (which is an older image) as my base image I do not see this issue. The CDMA connection is disabled after a push but re-enables properly when the new image starts up. If I switch to resin/raspberry-pi3-node and do a push, the CDMA connection goes down but I get the same error you were seeing when it tries to come back up, and the modem has to be unplugged and plugged back in before the connection will resume. Both of these are with ENV INITSYSTEM on.

Interestingly, if I turn off systemd (ENV INITSYSTEM off) then I see different behavior: without systemd, the CDMA connection is never deactivated for me, and this works for both images.

@liammonahan, can you try this using resin/raspberrypi3-node (or resin/raspberrypi3-python) as a base image and see if the connection resumes after a push or container restart?


#26

Thanks for all the assistance tracking this. Using the alternate base image (resin/raspberrypi3-python) I have no issues when the container restarts. I am having trouble with it surviving a reboot, though.


#27

Good info, thanks! We’re digging in here to see what in the raspberry-pi3-* images is different and might be causing this.


#28

Hello!
I’m seeing this same issue on an Intel NUC when the device starts up. Running “mmcli --scan-modems” fixes the issue.

edit*: I’m using the resin/intel-nuc-debian base image
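
As a stopgap, the rescan could be triggered only when no modem is listed. A rough sketch, assuming `mmcli -L` lists modems by their D-Bus object path; the helper name has_modem is hypothetical.

```shell
# Succeeds (exit 0) when `mmcli -L` output on stdin lists at least one modem.
# Assumption: each modem is listed by its D-Bus object path, which contains
# "/org/freedesktop/ModemManager1/Modem/".
has_modem() {
    grep -q '/org/freedesktop/ModemManager1/Modem/'
}

# Example workaround (commented out; run on the host OS after boot):
# if ! mmcli -L | has_modem; then
#     mmcli --scan-modems
# fi
```

This only papers over the symptom; it does not explain why the modem is dropped in the first place.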


#29

Can you give us some info on which modem you are using?


#30

A Sierra Wireless MC7354. I have two devices. One with the modem in qmi mode and the other in mbim.

  • The modem is disconnected when I start the device, update the app, or restart the app.