GSM device goes offline after about 8 hours

Hello,

I am testing balena on a CM3 based device with a Huawei 909 modem. The device goes offline after a couple of hours and does not recover. When I reboot, the device is online again. I had another topic opened already but created a new one to provide a compact description of the situation.

We have another edge setup without balena using pppd and wvdial and there the devices always recover from a disconnection. I have also swapped the Modem to make sure it’s not faulty hardware. That software stack is based on Raspbian. We do not use NetworkManager nor ModemManager.

In the balena setup I have persistent logging activated. I have set autoconnect-retries to 0 in the NetworkManager config.

I have now connected to the device using ethernet and I am able to connect. The device is offline for about 1 hour. This is what I see:

journalctl -u NetworkManager
-- Logs begin at Thu 2019-12-05 10:40:56 UTC, end at Thu 2019-12-05 10:59:37 UTC. --                                                                                                                            
Dec 05 10:41:14 9d52738 NetworkManager[805]: <info>  [1575542474.6706] manager: (vethf926bdf): new Veth device (/org/freedesktop/NetworkManager/Devices/288)                                                    
Dec 05 10:41:14 9d52738 NetworkManager[805]: <info>  [1575542474.7368] device (vethf9903b3): released from master device br-4a4858757201                                                                        
Dec 05 10:41:15 9d52738 NetworkManager[805]: <info>  [1575542475.0989] manager: (veth85fe8be): new Veth device (/org/freedesktop/NetworkManager/Devices/289)                                                    
Dec 05 10:41:15 9d52738 NetworkManager[805]: <info>  [1575542475.1049] manager: (veth5fb4553): new Veth device (/org/freedesktop/NetworkManager/Devices/290)                                                    
Dec 05 10:41:16 9d52738 NetworkManager[805]: <info>  [1575542476.5367] device (veth5fb4553): carrier: link connected                                                                                            
Dec 05 10:41:44 9d52738 NetworkManager[805]: <info>  [1575542504.3336] manager: (veth85fe8be): new Veth device (/org/freedesktop/NetworkManager/Devices/291)                                                    
Dec 05 10:41:44 9d52738 NetworkManager[805]: <info>  [1575542504.3782] device (veth5fb4553): released from master device br-4a4858757201                                                                        
Dec 05 10:41:44 9d52738 NetworkManager[805]: <info>  [1575542504.7430] manager: (veth61c756d): new Veth device (/org/freedesktop/NetworkManager/Devices/292)                                                    
Dec 05 10:41:44 9d52738 NetworkManager[805]: <info>  [1575542504.7477] manager: (vethae437ec): new Veth device (/org/freedesktop/NetworkManager/Devices/293)                                                    
Dec 05 10:41:46 9d52738 NetworkManager[805]: <info>  [1575542506.2366] device (vethae437ec): carrier: link connected                                                                                            
...
Dec 05 10:57:15 9d52738 NetworkManager[805]: <info>  [1575543435.0536] device (veth3e99d56): released from master device br-4a4858757201                                                                        
Dec 05 10:57:15 9d52738 NetworkManager[805]: <info>  [1575543435.4485] manager: (veth77b7824): new Veth device (/org/freedesktop/NetworkManager/Devices/385)                                                    
Dec 05 10:57:15 9d52738 NetworkManager[805]: <info>  [1575543435.4556] manager: (vethef8705c): new Veth device (/org/freedesktop/NetworkManager/Devices/386)                                                    
Dec 05 10:57:16 9d52738 NetworkManager[805]: <info>  [1575543436.9269] device (vethef8705c): carrier: link connected                                                                                            
Dec 05 10:57:37 9d52738 NetworkManager[805]: <info>  [1575543457.7945] device (eth0): carrier: link connected                                                                                                   
Dec 05 10:57:37 9d52738 NetworkManager[805]: <info>  [1575543457.7973] device (eth0): state change: unavailable -> disconnected (reason 'carrier-changed', sys-iface-state: 'managed')                          
Dec 05 10:57:37 9d52738 NetworkManager[805]: <info>  [1575543457.8011] policy: auto-activating connection 'Wired connection 1' (b7cd7ae6-6bbe-3c64-b4a7-d687b5550671)                                           
Dec 05 10:57:37 9d52738 NetworkManager[805]: <info>  [1575543457.8034] device (eth0): Activation: starting connection 'Wired connection 1' (b7cd7ae6-6bbe-3c64-b4a7-d687b5550671)                               
Dec 05 10:57:37 9d52738 NetworkManager[805]: <info>  [1575543457.8040] device (eth0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')                                         
Dec 05 10:57:37 9d52738 NetworkManager[805]: <info>  [1575543457.8056] manager: NetworkManager state is now CONNECTING                                                                                          
Dec 05 10:57:37 9d52738 NetworkManager[805]: <info>  [1575543457.8067] device (eth0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')                                               
Dec 05 10:57:37 9d52738 NetworkManager[805]: <info>  [1575543457.8099] device (eth0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')                                             
Dec 05 10:57:37 9d52738 NetworkManager[805]: <info>  [1575543457.8116] dhcp4 (eth0): activation: beginning transaction (no timeout)                                                                             
Dec 05 10:57:44 9d52738 NetworkManager[805]: <info>  [1575543464.8324] dhcp4 (eth0): state changed unknown -> bound                                                                                             
Dec 05 10:57:44 9d52738 NetworkManager[805]: <info>  [1575543464.9513] device (eth0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')                                           
Dec 05 10:57:44 9d52738 NetworkManager[805]: <info>  [1575543464.9942] device (veth77b7824): driver '(null)' does not support carrier detection.                                                                
Dec 05 10:57:44 9d52738 NetworkManager[805]: <info>  [1575543464.9947] device (veth77b7824): driver 'veth' does not support carrier detection.                                                                  
Dec 05 10:57:44 9d52738 NetworkManager[805]: <info>  [1575543464.9960] manager: (veth77b7824): new Veth device (/org/freedesktop/NetworkManager/Devices/387)                                                    
Dec 05 10:57:45 9d52738 NetworkManager[805]: <info>  [1575543465.0080] device (eth0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')                                         
Dec 05 10:57:45 9d52738 NetworkManager[805]: <info>  [1575543465.0101] device (vethef8705c): released from master device br-4a4858757201                                                                        
Dec 05 10:57:45 9d52738 NetworkManager[805]: <info>  [1575543465.0117] device (eth0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')                                        
Dec 05 10:57:45 9d52738 NetworkManager[805]: <info>  [1575543465.0138] manager: NetworkManager state is now CONNECTED_LOCAL                                                                                     
Dec 05 10:57:45 9d52738 NetworkManager[805]: <info>  [1575543465.0180] manager: NetworkManager state is now CONNECTED_SITE                                                                                      
Dec 05 10:57:45 9d52738 NetworkManager[805]: <info>  [1575543465.0186] policy: set 'Wired connection 1' (eth0) as default for IPv4 routing and DNS                                                              
Dec 05 10:57:45 9d52738 NetworkManager[805]: <info>  [1575543465.0199] dns-mgr: Writing DNS information to /sbin/resolvconf                                                                                     
Dec 05 10:57:45 9d52738 NetworkManager[805]: <info>  [1575543465.0568] device (eth0): Activation: successful, device activated.                                                                                 
Dec 05 10:57:45 9d52738 NetworkManager[805]: <info>  [1575543465.3844] manager: (veth53e8506): new Veth device (/org/freedesktop/NetworkManager/Devices/388)                                                    
Dec 05 10:57:45 9d52738 NetworkManager[805]: <info>  [1575543465.3895] manager: (vethcb320ff): new Veth device (/org/freedesktop/NetworkManager/Devices/389)                                                    
Dec 05 10:57:45 9d52738 NetworkManager[805]: <info>  [1575543465.6500] manager: NetworkManager state is now CONNECTED_GLOBAL                                                                                    
Dec 05 10:57:46 9d52738 NetworkManager[805]: <info>  [1575543466.8585] device (vethcb320ff): carrier: link connected                                                                                            
Dec 05 10:59:32 9d52738 NetworkManager[805]: <info>  [1575543572.8378] manager: (resin-vpn): new Tun device (/org/freedesktop/NetworkManager/Devices/390)

It seems the journal does only show the last 15 minutes. Same problem with the ModemManager log:

-- Logs begin at Thu 2019-12-05 10:40:56 UTC, end at Thu 2019-12-05 11:00:03 UTC. --                                                                                                                            
-- No entries --

When I check the modem, it has disappeared:

root@9d52738:~# mmcli -L
No modems were found

Output from nmcli:

root@9d52738:~# nmcli c
NAME                UUID                                  TYPE      DEVICE      
Wired connection 1  b7cd7ae6-6bbe-3c64-b4a7-d687b5550671  ethernet  eth0        
supervisor0         a0dd8e69-49ab-4946-ba7a-b2036e96f530  bridge    supervisor0 
gsm                 d8f002d1-dfc3-3cad-ac58-c7d3cc563b75  gsm       --

root@9d52738:~# nmcli c
NAME                UUID                                  TYPE      DEVICE      
Wired connection 1  b7cd7ae6-6bbe-3c64-b4a7-d687b5550671  ethernet  eth0        
supervisor0         a0dd8e69-49ab-4946-ba7a-b2036e96f530  bridge    supervisor0 
gsm                 d8f002d1-dfc3-3cad-ac58-c7d3cc563b75  gsm       --

This is the NetworkManager configuration:

[connection]
id=gsm
type=gsm
autoconnect=true
autoconnect-retries=0

[gsm]
apn=web.vodafone.de

[ppp]
no-vj-comp=true
nobsdcomp=true
nodeflate=true

[ipv4]
method=auto

[ipv6]
method=ignore

Note: I had to disable ipv6 and compression otherwise I do not get a connection via my provider.

I heard that the journals might be cut off because the journal size is limited.

I hope the get some advice for further debugging. I do not yet have a serial cable, but am able to connect via Ethernet.

Thank you
Bruno

Hi,

Thank you for the report. It’s curious that the modem does intialy work only to lose connectivity (and the device being recognised by the OS). Would you mind granting support access for us to look at this device, assuming it’s still wired in via ethernet, and providing the dashboard URL for it or the UUID.

Best regards,

Heds

Sure. I have granted support access.
Here is the UUID: 9d52738eabd9c17500c6ff1fe6ed8d86

Many thanks, we’ll have a look and get back to you when we have more information. Just to confirm, you say initially in the ticket title that it’s approximately 8 hours before the device goes offline, but in the body of the text you say a couple of hours. Is it fair to say that the timing for this is fairly random and not consistent?

Also, can you just confirm that initially, you do see the modem listed in nmcli -L, and then it vanishes when the GSM connection drops?

Best regards,

Heds

Correct. I had several occurrences of a device going offline and the time it takes seems random and not consistent. After reboot, I can see the modem using nmcli -L.

Hello again,

We’ve had a look at this issue. Whilst the ModemManager service has lost connectivity to the modem, the device is still recognised and present on the USB bus. There didn’t appear to be any reason for why the modem wasn’t recognised by ModemManager, and a restart of this service has now picked the modem up again.

It appears at this point that there’s an issue with ModemManager not refreshing the list post-startup. I’m going to contact one of our engineers who has a lot of experience in this area and see if he has some suggestions or insights to offer. We’ll obviously get back to you as soon as we have more information.

In the meantime, a shortterm fix might be to have a service container which continually tests for the present of the modem and, should it go missing, restart the ModemManager service via the DBus API.

Best regards,

Heds

I will get myself started with using the Dbus API. Regarding a similar topic, I had also issued a feature request:

I have a different device (based on amd64) having a similar issue. It went offline. When I connect it to a LAN router, I get a little different outputs:

The modem exists in ModemManager:

root@f14c40f:~# mmcli -L
    /org/freedesktop/ModemManager1/Modem/0 [Huawei Technologies Co., Ltd.] ME909s-120

ModemManager log:

-- Logs begin at Fri 2019-12-06 09:59:25 UTC, end at Fri 2019-12-06 09:59:38 UTC. --
Dec 06 09:59:25 f14c40f ModemManager[627]: Modem /org/freedesktop/ModemManager1/Modem/0: state changed (connecting -> registered)
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect started...
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect state (4/8): Wait to get fully enabled
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect state (5/8): Register
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect state (6/8): Bearer
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect state (7/8): Connect
Dec 06 09:59:25 f14c40f ModemManager[627]: Modem /org/freedesktop/ModemManager1/Modem/0: state changed (registered -> connecting)
Dec 06 09:59:25 f14c40f ModemManager[627]: Modem /org/freedesktop/ModemManager1/Modem/0: state changed (connecting -> registered)
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect started...
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect state (4/8): Wait to get fully enabled
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect state (5/8): Register
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect state (6/8): Bearer
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect state (7/8): Connect
Dec 06 09:59:25 f14c40f ModemManager[627]: Modem /org/freedesktop/ModemManager1/Modem/0: state changed (registered -> connecting)
Dec 06 09:59:25 f14c40f ModemManager[627]: Modem /org/freedesktop/ModemManager1/Modem/0: state changed (connecting -> registered)
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect started...
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect state (4/8): Wait to get fully enabled
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect state (5/8): Register
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect state (6/8): Bearer
Dec 06 09:59:25 f14c40f ModemManager[627]: Simple connect state (7/8): Connect
...
Dec 06 09:59:31 f14c40f ModemManager[627]: Modem /org/freedesktop/ModemManager1/Modem/0: state changed (connecting -> registered)
Dec 06 09:59:31 f14c40f ModemManager[627]: Simple connect started...
Dec 06 09:59:31 f14c40f ModemManager[627]: Simple connect state (4/8): Wait to get fully enabled
Dec 06 09:59:31 f14c40f ModemManager[627]: Simple connect state (5/8): Register
Dec 06 09:59:31 f14c40f ModemManager[627]: Simple connect state (6/8): Bearer
Dec 06 09:59:31 f14c40f ModemManager[627]: Simple connect state (7/8): Connect
Dec 06 09:59:31 f14c40f ModemManager[627]: Modem /org/freedesktop/ModemManager1/Modem/0: state changed (registered -> connecting)
Dec 06 09:59:31 f14c40f ModemManager[627]: Modem /org/freedesktop/ModemManager1/Modem/0: state changed (connecting -> registered)

NetworkManager log:

-- Logs begin at Fri 2019-12-06 10:00:38 UTC, end at Fri 2019-12-06 10:00:51 UTC. --
Dec 06 10:00:38 f14c40f NetworkManager[716]: <warn>  [1575626438.4044] device (ttyUSB0): Activation: failed for connection 'gsm'
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.4060] device (ttyUSB0): state change: failed -> disconnected (reason 'none', sys-iface-state: 'manage>
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.4071] policy: auto-activating connection 'gsm' (50e657ec-3340-33f3-8c90-5190e0ccd2ec)
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.4081] device (ttyUSB0): Activation: starting connection 'gsm' (50e657ec-3340-33f3-8c90-5190e0ccd2ec)
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.4084] device (ttyUSB0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'manag>
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.4113] modem["ttyUSB0"]: modem state changed, 'registered' --> 'connecting' (reason: user-requested)
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.4541] modem["ttyUSB0"]: modem state changed, 'connecting' --> 'registered' (reason: user-requested)
Dec 06 10:00:38 f14c40f NetworkManager[716]: <warn>  [1575626438.4543] modem-broadband[ttyUSB0]: failed to connect modem: No cause information available
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.4544] device (ttyUSB0): state change: prepare -> failed (reason 'unknown', sys-iface-state: 'managed')
Dec 06 10:00:38 f14c40f NetworkManager[716]: <warn>  [1575626438.4564] device (ttyUSB0): Activation: failed for connection 'gsm'
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.4573] device (ttyUSB0): state change: failed -> disconnected (reason 'none', sys-iface-state: 'manage>
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.4590] policy: auto-activating connection 'gsm' (50e657ec-3340-33f3-8c90-5190e0ccd2ec)
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.4605] device (ttyUSB0): Activation: starting connection 'gsm' (50e657ec-3340-33f3-8c90-5190e0ccd2ec)
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.4609] device (ttyUSB0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'manag>
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.4650] modem["ttyUSB0"]: modem state changed, 'registered' --> 'connecting' (reason: user-requested)
Dec 06 10:00:38 f14c40f NetworkManager[716]: <info>  [1575626438.5071] modem["ttyUSB0"]: modem state changed, 'connecting' --> 'registered' (reason: user-requested)
Dec 06 10:00:38 f14c40f NetworkManager[716]: <warn>  [1575626438.5073] modem-broadband[ttyUSB0]: failed to connect modem: No cause information available
...
Dec 06 10:00:41 f14c40f NetworkManager[716]: <warn>  [1575626441.0631] device (ttyUSB0): Activation: failed for connection 'gsm'
Dec 06 10:00:41 f14c40f NetworkManager[716]: <info>  [1575626441.0638] device (ttyUSB0): state change: failed -> disconnected (reason 'none', sys-iface-state: 'manage>
Dec 06 10:00:41 f14c40f NetworkManager[716]: <info>  [1575626441.0650] policy: auto-activating connection 'gsm' (50e657ec-3340-33f3-8c90-5190e0ccd2ec)
Dec 06 10:00:41 f14c40f NetworkManager[716]: <info>  [1575626441.0665] device (ttyUSB0): Activation: starting connection 'gsm' (50e657ec-3340-33f3-8c90-5190e0ccd2ec)
Dec 06 10:00:41 f14c40f NetworkManager[716]: <info>  [1575626441.0668] device (ttyUSB0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'manag>
Dec 06 10:00:41 f14c40f NetworkManager[716]: <info>  [1575626441.0707] modem["ttyUSB0"]: modem state changed, 'registered' --> 'connecting' (reason: user-requested)
Dec 06 10:00:41 f14c40f NetworkManager[716]: <info>  [1575626441.1140] modem["ttyUSB0"]: modem state changed, 'connecting' --> 'registered' (reason: user-requested)
root@f14c40f:~# nmcli d
DEVICE           TYPE      STATE                 CONNECTION         
eno1             ethernet  connected             Wired connection 1 
supervisor0      bridge    connected             supervisor0        
ttyUSB0          gsm       connecting (prepare)  gsm                
enp1s0           ethernet  unavailable           --                 
balena0          bridge    unmanaged             --                 
br-58d3440f331a  bridge    unmanaged             --                 
resin-dns        bridge    unmanaged             --                 
veth42e9ac2      ethernet  unmanaged             --                 
veth74d0269      ethernet  unmanaged             --                 
vetha0892ad      ethernet  unmanaged             --                 
vethc5c5a5a      ethernet  unmanaged             --                 
vethf880a90      ethernet  unmanaged             --                 
sit0             iptunnel  unmanaged             --                 
lo               loopback  unmanaged             --                 
resin-vpn        tun       unmanaged             --
root@f14c40f:~# nmcli c

(process:26293): GLib-GIO-WARNING **: 10:03:20.384: ../glib-2.58.3/gio/gdbusobjectmanagerclient.c:1589: Processing InterfaceRemoved signal for path /org/freedesktop/NetworkManager/ActiveConnection/832596 but no object proxy exists
NAME                UUID                                  TYPE      DEVICE      
gsm                 50e657ec-3340-33f3-8c90-5190e0ccd2ec  gsm       ttyUSB0     
Wired connection 1  3772cbf4-02e4-35c5-81f0-a1ebec348327  ethernet  eno1        
supervisor0         1907a628-d681-47b1-a1a3-99473bf571f3  bridge    supervisor0 
Wired connection 2  a9183783-6bf9-3e31-bab6-44e481c812ba  ethernet  --

When trying to reconnect:

root@f14c40f:~# nmcli c down gsm
Connection 'gsm' deactivation failed: The connection was not active.
root@f14c40f:~# nmcli c up gsm

(process:26630): GLib-GIO-WARNING **: 10:04:33.012: ../glib-2.58.3/gio/gdbusobjectmanagerclient.c:1589: Processing InterfaceRemoved signal for path /org/freedesktop/NetworkManager/ActiveConnection/833997 but no object proxy exists
Error: Connection activation failed: Unknown error
Hint: use 'journalctl -xe NM_CONNECTION=50e657ec-3340-33f3-8c90-5190e0ccd2ec + NM_DEVICE=ttyUSB0' to get more details.

Trying to restart ModemManager:

root@f14c40f:~# systemctl restart ModemManager.service
root@f14c40f:~# nmcli c
NAME                UUID                                  TYPE      DEVICE
Wired connection 1  3772cbf4-02e4-35c5-81f0-a1ebec348327  ethernet  eno1
supervisor0         1907a628-d681-47b1-a1a3-99473bf571f3  bridge    supervisor0
Wired connection 2  a9183783-6bf9-3e31-bab6-44e481c812ba  ethernet  --
gsm                 50e657ec-3340-33f3-8c90-5190e0ccd2ec  gsm       --
root@f14c40f:~# nmcli d
DEVICE           TYPE      STATE        CONNECTION         
eno1             ethernet  connected    Wired connection 1 
supervisor0      bridge    connected    supervisor0        
enp1s0           ethernet  unavailable  --                 
balena0          bridge    unmanaged    --                 
br-58d3440f331a  bridge    unmanaged    --
resin-dns        bridge    unmanaged    --
veth42e9ac2      ethernet  unmanaged    --
veth74d0269      ethernet  unmanaged    --
vetha0892ad      ethernet  unmanaged    --
vethc5c5a5a      ethernet  unmanaged    --
vethf880a90      ethernet  unmanaged    --
sit0             iptunnel  unmanaged    --
lo               loopback  unmanaged    --
resin-vpn        tun       unmanaged    --
root@f14c40f:~# nmcli c down gsm
Connection 'gsm' successfully deactivated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/834910)
root@f14c40f:~# nmcli c up gsm
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/834911)
root@f14c40f:~# nmcli c
NAME                UUID                                  TYPE      DEVICE
Wired connection 1  3772cbf4-02e4-35c5-81f0-a1ebec348327  ethernet  eno1
gsm                 50e657ec-3340-33f3-8c90-5190e0ccd2ec  gsm       ttyUSB0
supervisor0         1907a628-d681-47b1-a1a3-99473bf571f3  bridge    supervisor0
Wired connection 2  a9183783-6bf9-3e31-bab6-44e481c812ba  ethernet  --
root@f14c40f:~# nmcli d
DEVICE           TYPE      STATE         CONNECTION
eno1             ethernet  connected     Wired connection 1
supervisor0      bridge    connected     supervisor0
ttyUSB0          gsm       connected     gsm
ppp0             ppp       disconnected  --
enp1s0           ethernet  unavailable   --
balena0          bridge    unmanaged     --
br-58d3440f331a  bridge    unmanaged     --
resin-dns        bridge    unmanaged     --
veth42e9ac2      ethernet  unmanaged     --
veth74d0269      ethernet  unmanaged     --
vetha0892ad      ethernet  unmanaged     --
vethc5c5a5a      ethernet  unmanaged     --
vethf880a90      ethernet  unmanaged     --
sit0             iptunnel  unmanaged     --
lo               loopback  unmanaged     --
resin-vpn        tun       unmanaged     --

… now GSM is online again.

So in this case, restarting ModemManager alone is not enough, I had to restart the connection manually.

The first (CM3) device went offline again.

In this case, I was able to manually recover GSM according to your workaround suggestion, after obtaining access via LAN:

root@9d52738:~# mmcli -L
No modems were found
root@9d52738:~# systemctl restart ModemManager.service
root@9d52738:~# mmcli -L
No modems were found
root@9d52738:~# mmcli -L
No modems were found
root@9d52738:~# mmcli -L
No modems were found
root@9d52738:~# mmcli -L
    /org/freedesktop/ModemManager1/Modem/0 [Huawei Technologies Co., Ltd.] ME909s-120
root@9d52738:~# nmcli c 
NAME                UUID                                  TYPE      DEVICE      
Wired connection 1  b7cd7ae6-6bbe-3c64-b4a7-d687b5550671  ethernet  eth0        
gsm                 d8f002d1-dfc3-3cad-ac58-c7d3cc563b75  gsm       ttyUSB1     
supervisor0         a0dd8e69-49ab-4946-ba7a-b2036e96f530  bridge    supervisor0 
root@9d52738:~# nmcli d
DEVICE           TYPE      STATE         CONNECTION         
eth0             ethernet  connected     Wired connection 1 
supervisor0      bridge    connected     supervisor0        
ttyUSB1          gsm       connected     gsm                
ppp0             ppp       disconnected  --                 
balena0          bridge    unmanaged     --                 
br-4a4858757201  bridge    unmanaged     --                 
resin-dns        bridge    unmanaged     --                 
veth607d783      ethernet  unmanaged     --                 
veth856174e      ethernet  unmanaged     --                 
vethd462048      ethernet  unmanaged     --                 
vethe6faf8c      ethernet  unmanaged     --                 
vethf0bd237      ethernet  unmanaged     --                 
lo               loopback  unmanaged     --                 
resin-vpn        tun       unmanaged     --

Hi again,

Thank you very much for the detailed information and extra information on the case where a restart of ModemManager is not working. The relevant engineer is hoping to look at these cases soon, and we’ll get back to you as soon as possible with some more information when we have it.

Best regards,

Heds

Hi Bruno,

There are a couple of steps I would recommend:

Get in touch with the ModemManager developers on the ModemManager-devel mailing list. Describe the issue you face, post the NetworkManager connection profile and the journal logs. They may ask you to provide the necessary diagnostics information if needed. They will most certainly look at your email and respond to you.

If they are not able to come up with a resolution for this or it will take too much time, then you may try other modem models.

You may also implement some code in your application that could workaround the problem: e.g. restart ModemManager, brings the NetworkManager connection up and down, but that should be more like a last resort option if you do not have a way to use a different modem model.

The fourth most difficult option is to debug this in ModemManager and coordinate with the ModemManager developers a fix for it.

Please let us know if you have any questions.

Thanks,
Zahari

Hi Zahari,

are the ModemManager developers aware of balena and the way balena handles network setup? I am not sure if I am able to explain the setup appropriately. I have almost no knowledge of NetworkManager and ModemManager.

We re running a very standard Linux setup with NetworkManager and ModemManager. Nothing out of the ordinary compared to the more popular Linux distributions like Ubuntu or Arch. Things should be fine in this regard. You may post the ModemManager and NetworkManager versions to them - mmcli --version and nmcli --version.

I am waiting for a response by the MM team. Is it possible that having udev activated in one of my application containers interferes with the modem detection?

Edit: Probably not. Even after stopping all my services the modem still disappears.

Meanwhile I have managed to capture logs from ModemManager, presumably during the connection drop.

-- Logs begin at Sun 2019-12-08 01:46:57 UTC, end at Sun 2019-12-08 08:04:07 UTC. --

Dec 08 03:05:11 9d52738 ModemManager[19463]: Modem /org/freedesktop/ModemManager1/Modem/0: 3GPP Registration state changed (home -> idle)

Dec 08 03:05:12 9d52738 ModemManager[19463]: Modem /org/freedesktop/ModemManager1/Modem/0: state changed (connected -> disconnecting)

Dec 08 03:05:12 9d52738 ModemManager[19463]: Simple connect started...

Dec 08 03:05:12 9d52738 ModemManager[19463]: Simple connect state (4/8): Wait to get fully enabled

Dec 08 03:05:12 9d52738 ModemManager[19463]: Modem /org/freedesktop/ModemManager1/Modem/0: 3GPP Registration state changed (idle -> searching)

Dec 08 03:05:13 9d52738 ModemManager[19463]: Modem /org/freedesktop/ModemManager1/Modem/0: 3GPP Registration state changed (searching -> registering)

Dec 08 03:05:14 9d52738 ModemManager[19463]: Modem /org/freedesktop/ModemManager1/Modem/0: state changed (disconnecting -> searching)

Dec 08 03:05:14 9d52738 ModemManager[19463]: Simple connect state (5/8): Register

Dec 08 03:05:16 9d52738 ModemManager[19463]: checking if connected failed: Couldn't check current list of active PDP contexts: Sending command failed: 'Resource temporarily unavailable'

Dec 08 03:05:20 9d52738 ModemManager[19463]: (tty/ttyUSB1) at port timed out 2 consecutive times

Dec 08 03:05:23 9d52738 ModemManager[19463]: (tty/ttyUSB1) at port timed out 3 consecutive times

Dec 08 03:05:23 9d52738 ModemManager[19463]: Couldn't load Operator Code: 'Sending command failed: 'Resource temporarily unavailable''

Dec 08 03:05:26 9d52738 ModemManager[19463]: (tty/ttyUSB1) at port timed out 4 consecutive times

Dec 08 03:05:30 9d52738 ModemManager[19463]: (tty/ttyUSB1) at port timed out 5 consecutive times

Dec 08 03:05:33 9d52738 ModemManager[19463]: (tty/ttyUSB1) at port timed out 6 consecutive times

Dec 08 03:05:36 9d52738 ModemManager[19463]: (tty/ttyUSB1) at port timed out 7 consecutive times

Dec 08 03:05:40 9d52738 ModemManager[19463]: (tty/ttyUSB1) at port timed out 8 consecutive times

Dec 08 03:05:43 9d52738 ModemManager[19463]: (tty/ttyUSB1) at port timed out 9 consecutive times

Dec 08 03:05:43 9d52738 ModemManager[19463]: Modem /org/freedesktop/ModemManager1/Modem/0: 3GPP Registration state changed (searching -> idle)

Dec 08 03:05:43 9d52738 ModemManager[19463]: Simple connect started...

Dec 08 03:05:43 9d52738 ModemManager[19463]: Simple connect state (4/8): Wait to get fully enabled

Dec 08 03:05:43 9d52738 ModemManager[19463]: Simple connect state (5/8): Register

Dec 08 03:05:46 9d52738 ModemManager[19463]: (tty/ttyUSB1) at port timed out 10 consecutive times, marking modem '/org/freedesktop/ModemManager1/Modem/0' as invalid

Dec 08 03:05:50 9d52738 ModemManager[19463]: (tty/ttyUSB1) at port timed out 11 consecutive times, marking modem '/org/freedesktop/ModemManager1/Modem/0' as invalid

Dec 08 03:05:50 9d52738 ModemManager[19463]: Couldn't load Operator Name: 'AT command was cancelled'

Dec 08 03:05:50 9d52738 ModemManager[19463]: Modem /org/freedesktop/ModemManager1/Modem/0: 3GPP Registration state changed (registering -> home)

Dec 08 03:05:53 9d52738 ModemManager[19463]: (tty/ttyUSB1) at port timed out 12 consecutive times, marking modem '/org/freedesktop/ModemManager1/Modem/0' as invalid

Dec 08 03:05:53 9d52738 ModemManager[19463]: Modem /org/freedesktop/ModemManager1/Modem/0: 3GPP Registration state changed (home -> idle)

And also the NetworkManager log

-- Logs begin at Sun 2019-12-08 01:46:57 UTC, end at Sun 2019-12-08 08:04:07 UTC. --

Dec 08 02:32:17 9d52738 NetworkManager[805]: <info>  [1575772337.6999] dhcp4 (eth0): state changed bound -> bound

Dec 08 03:05:11 9d52738 NetworkManager[805]: LCP terminated by peer

Dec 08 03:05:11 9d52738 NetworkManager[805]: nm-pppd-plugin-Message: 03:05:11.926: nm-ppp-plugin: status 8 / phase 'network'

Dec 08 03:05:11 9d52738 NetworkManager[805]: Connect time 357.1 minutes.

Dec 08 03:05:11 9d52738 NetworkManager[805]: Sent 0 bytes, received 4 bytes.

Dec 08 03:05:11 9d52738 NetworkManager[805]: nm-pppd-plugin-Message: 03:05:11.947: nm-ppp-plugin: status 5 / phase 'establish'

Dec 08 03:05:11 9d52738 NetworkManager[805]: <info>  [1575774311.9490] device (ppp0): state change: disconnected -> unmanaged (reason 'connection-assumed', sys-iface-state: 'external')

Dec 08 03:05:11 9d52738 NetworkManager[805]: Modem hangup

Dec 08 03:05:11 9d52738 NetworkManager[805]: nm-pppd-plugin-Message: 03:05:11.980: nm-ppp-plugin: status 11 / phase 'disconnect'

Dec 08 03:05:11 9d52738 NetworkManager[805]: Connection terminated.

Dec 08 03:05:11 9d52738 NetworkManager[805]: <info>  [1575774311.9897] device (ttyUSB1): state change: activated -> failed (reason 'ip-config-unavailable', sys-iface-state: 'managed')

Dec 08 03:05:12 9d52738 NetworkManager[805]: <warn>  [1575774312.0031] device (ttyUSB1): Activation: failed for connection 'gsm'

Dec 08 03:05:12 9d52738 NetworkManager[805]: <info>  [1575774312.0073] modem["ttyUSB1"]: modem state changed, 'connected' --> 'disconnecting' (reason: user-requested)

Dec 08 03:05:12 9d52738 NetworkManager[805]: <info>  [1575774312.0104] device (ttyUSB1): state change: failed -> disconnected (reason 'none', sys-iface-state: 'managed')

Dec 08 03:05:12 9d52738 NetworkManager[805]: <info>  [1575774312.0139] dns-mgr: Writing DNS information to /sbin/resolvconf

Dec 08 03:05:12 9d52738 NetworkManager[805]: <info>  [1575774312.0603] policy: auto-activating connection 'gsm' (d8f002d1-dfc3-3cad-ac58-c7d3cc563b75)

Dec 08 03:05:12 9d52738 NetworkManager[805]: <info>  [1575774312.0635] device (ttyUSB1): Activation: starting connection 'gsm' (d8f002d1-dfc3-3cad-ac58-c7d3cc563b75)

Dec 08 03:05:12 9d52738 NetworkManager[805]: <info>  [1575774312.0650] device (ttyUSB1): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')

Dec 08 03:05:13 9d52738 NetworkManager[805]: nm-pppd-plugin-Message: 03:05:13.043: nm-ppp-plugin: status 1 / phase 'dead'

Dec 08 03:05:13 9d52738 NetworkManager[805]: nm-pppd-plugin-Message: 03:05:13.044: nm-ppp-plugin: cleaning up

Dec 08 03:05:14 9d52738 NetworkManager[805]: <info>  [1575774314.0246] modem["ttyUSB1"]: modem state changed, 'disconnecting' --> 'searching' (reason: user-requested)

Dec 08 03:05:43 9d52738 NetworkManager[805]: <warn>  [1575774343.4300] modem-broadband[ttyUSB1]: failed to connect modem: Network timeout

Dec 08 03:05:43 9d52738 NetworkManager[805]: <info>  [1575774343.4303] device (ttyUSB1): state change: prepare -> failed (reason 'gsm-registration-timeout', sys-iface-state: 'managed')

Dec 08 03:05:43 9d52738 NetworkManager[805]: <warn>  [1575774343.4398] device (ttyUSB1): Activation: failed for connection 'gsm'

Dec 08 03:05:43 9d52738 NetworkManager[805]: <info>  [1575774343.4426] device (ttyUSB1): state change: failed -> disconnected (reason 'none', sys-iface-state: 'managed')

Dec 08 03:05:43 9d52738 NetworkManager[805]: <info>  [1575774343.4461] policy: auto-activating connection 'gsm' (d8f002d1-dfc3-3cad-ac58-c7d3cc563b75)

Dec 08 03:05:43 9d52738 NetworkManager[805]: <info>  [1575774343.4503] device (ttyUSB1): Activation: starting connection 'gsm' (d8f002d1-dfc3-3cad-ac58-c7d3cc563b75)

Dec 08 03:05:43 9d52738 NetworkManager[805]: <info>  [1575774343.4513] device (ttyUSB1): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')

Dec 08 03:05:46 9d52738 NetworkManager[805]: <info>  [1575774346.7768] device (ttyUSB1): state change: prepare -> unmanaged (reason 'removed', sys-iface-state: 'removed')

Dec 08 03:05:46 9d52738 NetworkManager[805]: <warn>  [1575774346.7915] modem-broadband[ttyUSB1]: failed to disconnect modem: GDBus.Error:org.freedesktop.DBus.Error.UnknownMethod: No such interface “org.freedesktop.ModemManager1.Modem.Simple” on object at path /org/freedesktop/ModemManager1/Modem/0

Dec 08 07:54:10 9d52738 NetworkManager[805]: <info>  [1575791650.4080] dhcp4 (eth0): state changed bound -> Bound

This issue shows the same symptoms:
https://lists.freedesktop.org/archives/modemmanager-devel/2019-May/007268.html

The modem is disabled after 10 timeouts. After that, connection is never retried.

Unfortunately, there is no proposed fix.

Since there is no proposed fixed on the MM side, I think the 2 options left are changing the modem or implementing a workaround on the application side, for instance, to restart the ModemManager with some period of time.
Please let us know which one you’ll go with.
Thanks!

Well I hope they will pick up the issue after all. It is absolutely crucial to have a recovery for a lost GSM connection. Getting hold of remotely installed field devices that went offline is not an option.

We will not change the modem because, as I already said, they work 100% stable in our non balena setup where we use only pppd and wvdial.

Running a watchdog container would be a workaround I might give a try, but I am not completely happy with that.

Anyway, thanks for the support so far. I will let you know if there are any new insights.