A systemd-udevd event is being triggered; ModemManager picks up on it, rescans all ports, and drops the modem. Is that full rescan in ModemManager what you see as well, except that your modem doesn’t get released?
I got debug logs out of ModemManager like so:
mount -o remount,rw /
vi /etc/systemd/system/dbus-org.freedesktop.ModemManager1.service
# change ExecStart line to: ExecStart=/usr/sbin/ModemManager --log-level=debug
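For anyone who would rather script that edit than open vi, here is a rough sketch of the same change. The function name enable_mm_debug and its optional path argument are made up for illustration; it assumes the unit file has a single ExecStart line and a sed that supports -i (GNU or BusyBox).

```shell
# Sketch only: rewrite the ExecStart line so ModemManager logs at debug level.
# enable_mm_debug and its argument are illustrative names, not part of any tool.
enable_mm_debug() {
    # Default to the unit file mentioned above; a different path can be passed in.
    unit="${1:-/etc/systemd/system/dbus-org.freedesktop.ModemManager1.service}"
    # Replace whatever ExecStart line is present with the debug variant.
    sed -i 's|^ExecStart=.*|ExecStart=/usr/sbin/ModemManager --log-level=debug|' "$unit"
}
```

The root filesystem still has to be remounted read-write first, and ModemManager has to be restarted (or the device rebooted) before the new flag takes effect. If that path is a symlink, sed -i may replace the link with a regular file rather than editing the target, so it is worth checking with ls -l beforehand.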
@liammonahan, yeah, my modem doesn’t seem to ever get released. This is really strange. Could you try a really simple base image, not one of the resin.io ones (we add some additional udevd stuff in the container)? Perhaps you could try pushing a simple Dockerfile like:
FROM armhf/busybox
CMD ["echo", "hello"]
That container shouldn’t do anything with udev, or anything at all really. If it still disconnects with this, then we will know it’s an issue in the hostOS, but if this container works fine, then we will know it’s some weird interaction within the entrypoint of the resin.io base images.
@liammonahan that’s interesting, now we are getting somewhere. It seems like something in our base image entrypoint is causing the modem to be killed. This entry.sh is the entrypoint script for most of our base images. Perhaps @petrosagg or @nghiant2710 know what processes would be sending a signal to kill the modem?
I’m able to reproduce this and think I have it at least somewhat narrowed down.
If I use resin/raspberrypi3-node (which is an older image) as my base image I do not see this issue. The CDMA connection is disabled after a push but re-enables properly when the new image starts up. If I switch to resin/raspberry-pi3-node and do a push, the CDMA connection goes down but I get the same error you were seeing when it tries to come back up, and the modem has to be unplugged and plugged back in before the connection will resume. Both of these are with ENV INITSYSTEM on.
Interestingly, if I turn off systemd (ENV INITSYSTEM off) then I see different behavior: resin/raspberry-pi3-node without systemd never deactivates the CDMA connection for me and works for both images.
@liammonahan, can you try this using resin/raspberrypi3-node (or resin/raspberrypi3-python) as a base image and see if the connection resumes after a push or container restart?
Thanks for all the assistance tracking this. Using the alternate base image (resin/raspberrypi3-python) I have no issues when the container restarts. I am having trouble with it surviving a reboot, though.
@mccollam Have you heard anything from the folks that might have the best working knowledge of changes to the entrypoint’s handling of udev in the most recent base images?
@liammonahan I was just looking back into this last night. The mmcli command that @annymsMthd suggested didn’t seem to work reliably for me. But @nghiant2710 has been trying out various changes to the base images to narrow down what is causing this. It appears to be an interaction with systemd and either dbus or possibly udev on these images.
They’re actively testing some things right now so I will keep you updated as there’s more information. This is a tricky one as it seems to manifest only under specific conditions (specific hardware and images) which you’ve managed to hit dead on. Not as lucky as winning the lottery, though.
We have narrowed down what appears to be causing this issue. Apparently systemd is generating a config for a legacy sysvinit networking service. When the container shuts down, so does this service – and it doesn’t seem to leave all cellular modems in a good state when it does.
@petrosagg is working on a full fix for this, but in the meantime it seems that simply getting rid of that legacy service will allow the modem to come back online after a container update. To do this, just add the following somewhere before the final CMD line in your Dockerfile:
RUN rm /etc/rcS.d/S11networking
This should at least keep things humming along until we have a comprehensive fix.
It seems the udevadm trigger command is disconnecting the PPP interfaces when it replays the events at container startup, and some modems don’t handle the reconnect very well at all. We are looking for a nice way around this, but for now this is the recommended way forward.
That, however, does not seem to fix the problem for resin/intel-nuc-debian. I copied the entry.sh file and removed the mentioned line, and still, after an update or reboot, the modem is not detected or connected.
When I log in to the host OS and run mmcli --scan-modems as mentioned by @annymsMthd, the modem is detected fine and works OK.
journal -u NetworkManager
after reboot + running mmcli --scan-modems
Is there a way to keep the modem discovery going, and to ask it to keep reconnecting? It is a worrying problem to have a device in production stop responding.
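As a stopgap, the manual rescan could in principle be automated on the host OS. The sketch below is untested against this hardware and is not an official resin.io fix: the function name rescan_if_no_modem is invented, and it assumes that mmcli -L prints a /org/freedesktop/ModemManager1/Modem/ path for every modem ModemManager knows about.

```shell
# Sketch only: trigger a ModemManager rescan when no modem is currently listed.
# rescan_if_no_modem is an illustrative name, not an existing tool.
rescan_if_no_modem() {
    # mmcli -L lists known modems; each entry includes its D-Bus object path.
    if mmcli -L 2>/dev/null | grep -q '/org/freedesktop/ModemManager1/Modem/'; then
        return 0    # at least one modem is listed, nothing to do
    fi
    # No modem listed: ask ModemManager to scan again, as done by hand above.
    mmcli --scan-modems
}
```

It could be called from a simple loop such as `while true; do rescan_if_no_modem; sleep 60; done`, where the 60 second interval is an arbitrary choice. This only covers detection; whether the connection then comes back up on its own is a separate question.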