Cell Modem Enabling State

Hello,

We have an application running on the Balena Fin v1.1.1 that uses a Quectel EC25 modem, and we have deployed it successfully in multiple countries. We recently tried replacing the EC25 with an EG21-G and an EG25-G in two separate products. They connected and ran fine for about 24 hours, but then disconnected and have remained disconnected ever since (4 days). I was able to connect serially to determine the state, and here are some things I found.

The “mmcli -m 0” command shows the Status state is “enabling”, but no “bearer” is listed in the output, which is unusual. Over the course of an hour or two, and through various commands I have tried, the state has always been “enabling”.
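The check described above (state stuck at "enabling" with no bearer listed) could be scripted roughly as follows. This is only a sketch and not something from the original posts: it assumes the installed mmcli supports key-value output (`-K`) and that the keys are named `modem.generic.state` and `modem.generic.bearers.length`, so both should be verified against your ModemManager version. The helper names are hypothetical.

```python
import subprocess


def parse_keyvalue(text):
    """Parse 'key : value' lines (assumed mmcli -K format) into a dict."""
    result = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing ':' survive.
        key, sep, value = line.partition(":")
        if sep:
            result[key.strip()] = value.strip()
    return result


def looks_stuck(info):
    """True when the modem reports 'enabling' and lists no bearers."""
    state = info.get("modem.generic.state", "")
    bearers = info.get("modem.generic.bearers.length", "0")
    return state == "enabling" and bearers in ("0", "--", "")


def query_modem(index=0):
    """Run mmcli on the device itself; requires ModemManager to be present."""
    out = subprocess.run(
        ["mmcli", "-m", str(index), "-K"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_keyvalue(out)
```

A single sample only shows the state at one moment. Calling `looks_stuck(query_modem())` repeatedly over several minutes would approximate the manual observation that the state never left "enabling".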

Here’s a snippet from the output of “journalctl -u ModemManager --no-pager”:

Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] state changed (enabling -> disabled)
Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] simple connect started...
Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] simple connect state (3/8): enable
Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] state changed (disabled -> enabling)
Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] simple connect started...
Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] simple connect state (4/8): wait to get fully enabled
Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] state changed (enabling -> disabled)
Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] simple connect started...
Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] simple connect state (3/8): enable
Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] state changed (disabled -> enabling)
Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] simple connect started...
Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] simple connect state (4/8): wait to get fully enabled
Jun 07 15:52:19 db1f1bd ModemManager[1400]: [modem0] state changed (enabling -> disabled)

This cycle (from “simple connect started” to “simple connect state (4/8): wait to get fully enabled”) repeats about four times per second.
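To quantify how often the loop occurs, the journal output can be scanned for "enabling -> disabled" transitions. The following is a minimal sketch, assuming log lines have the same shape as the snippet above; the function name `count_flaps` is hypothetical and the regular expression has only been matched against that snippet's format.

```python
import re
from collections import Counter

# Matches lines such as:
# Jun 07 15:52:18 db1f1bd ModemManager[1400]: [modem0] state changed (enabling -> disabled)
TRANSITION = re.compile(
    r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) \S+ ModemManager\[\d+\]: "
    r"\[modem\d+\] state changed \((\S+) -> (\S+)\)"
)


def count_flaps(log_text, src="enabling", dst="disabled"):
    """Count src -> dst transitions per timestamp (one-second resolution)."""
    per_second = Counter()
    for line in log_text.splitlines():
        match = TRANSITION.match(line.strip())
        if match and (match.group(2), match.group(3)) == (src, dst):
            per_second[match.group(1)] += 1
    return per_second
```

Feeding it the text produced by `journalctl -u ModemManager --no-pager` gives a per-second count, which makes it easier to compare a misbehaving device against a healthy one.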

I tried issuing the following commands, and got the following responses:

mmcli -m 0 --set-power-state-low

"error: couldn't set new power state in the modem: 'GDBus.Error:org.freedesktop.ModemManager1.Error.Core.WrongState: Cannot set power state: not in disabled state'"

mmcli -m 0 --simple-connect="apn={our_wireless_apn}"

"error: couldn't connect the modem: 'GDBus.Error:org.freedesktop.libqmi.Error.Protocol.InvalidTransition: Couldn't set operating mode: QMI protocol error (60): 'InvalidTransition''" 

mmcli -m 0 -d

"successfully disabled the modem"

mmcli -m 0 -e

"error: couldn't enable the modem: 'GDBus.Error:org.freedesktop.libqmi.Error.Protocol.InvalidTransition: Couldn't set operating mode: QMI protocol error (60): 'InvalidTransition''"

mmcli -m 0 -r

"successfully reseted the modem"

Although the reset appeared to work, when I query “mmcli -m 0” again, after the reset, the Status state is still “enabling”.

I have looked at our /etc/NetworkManager/system-connections/ cell connection file and it looks the same as it is in any of our working EC25 products.

The cell modem appears to be in a funky state, and even the reset command (-r) cannot seem to recover it. If I physically remove the power cord and plug it in again, the cell modem connects and comes online again just fine.

I’m not sure how it got into this state, or how to recover it once it does. (Power cycling is not an option for us.) Has anyone seen anything like this? Any suggestions on anything else I could try?

I will also contact the modem vendor to see if we should update the modem firmware.

Hello! Could you please confirm your balenaOS version and supervisor version? Thanks

Hello,

Thank you for all of the good diagnostic information - nicely done. And from a serial connection no less. It would be great to get some additional info. I normally wouldn’t ask for quite this much information all at once. But since you clearly know your way around modems…

  • balenaOS version and supervisor version (the request from mpous a bit earlier today).
  • The name of the SIM that you are using, and whether or not this is the same type of SIM that you have used previously.
  • Your country (I’m assuming US) and cellular provider.
  • The APN in your cellular file in system-connections.
  • The same really good diagnostic commands that you used above, but on a successfully connected device e.g. just after reboot.
  • On that connected device, three more requests:
    1. mmcli -m <ModemNumber> . You should get a bearer number as the last line. You made an interesting observation above about the lack of a bearer number. With a connected device, let’s see if you get a bearer number. If you do, please wait a minute, then run the command again and see if you get the same bearer number.
    2. mmcli -b <BearerNumber> .
    3. qmicli -d /dev/cdc-wdm0 -p --wds-get-profile-list=3gpp . In particular to see if the APN shown matches your cellular file. I’m guessing that it will match, given that you get a connection for 24 hours. But it can’t hurt to check.

Also, I saw your last line above about checking with the modem manufacturer for firmware updates. Good plan.

Thank you very much for the speedy replies! Here are the answers to your questions.

On the device with a problematic cell modem (not connected to Balena):

  1. Versions - balena=2.77.0+rev1, sup=12.5.10.
  2. We are using Twilio SIMs on this device and all devices.
  3. Country=US, cellProvider=Twilio
  4. APN in sys-connections cell file: wireless.twilio.com, which is the expected value.

On a good device:
journalctl -u ModemManager --no-pager

-- The output is large, and there is no "enable" connect state (3/8) message in this log.  Here is a relevant snippet:

Jun 08 01:14:17 c8101c7 ModemManager[1315]: [modem0] state changed (unknown -> disabled)                      
Jun 08 01:14:17 c8101c7 ModemManager[1315]: [modem0] state changed (disabled -> enabling)          
Jun 08 01:14:17 c8101c7 ModemManager[1315]: [modem0] simple connect started...                               
Jun 08 01:14:17 c8101c7 ModemManager[1315]: [modem0] simple connect state (4/8): wait to get fully enabled                                                
Jun 08 01:14:17 c8101c7 ModemManager[1315]: [/dev/cdc-wdm0] Allocating new client ID...                        
Jun 08 01:14:17 c8101c7 ModemManager[1315]: [/dev/cdc-wdm0] Registered 'wds' (version 1.67) client with ID '20'                                       
Jun 08 01:14:17 c8101c7 ModemManager[1315]: [/dev/cdc-wdm0] Releasing 'wds' client with flags 'release-cid'...                  
Jun 08 01:14:17 c8101c7 ModemManager[1315]: [/dev/cdc-wdm0] Unregistered 'wds' client with ID '20'                                                    
Jun 08 01:14:18 c8101c7 ModemManager[1315]: [modem0] state changed (enabling -> enabled)                                        
Jun 08 01:14:18 c8101c7 ModemManager[1315]: [modem0] simple connect state (5/8): register
Jun 08 01:14:18 c8101c7 ModemManager[1315]: [modem0] simple connect state (6/8): bearer                   
Jun 08 01:14:18 c8101c7 ModemManager[1315]: [modem0] simple connect state (7/8): connect 
Jun 08 01:14:18 c8101c7 ModemManager[1315]: [modem0] state changed (enabled -> connecting)                     
Jun 08 01:14:18 c8101c7 ModemManager[1315]: [/dev/cdc-wdm0] Allocating new client ID...                       
Jun 08 01:14:18 c8101c7 ModemManager[1315]: [/dev/cdc-wdm0] Registered 'wds' (version 1.67) client with ID '20'
Jun 08 01:14:18 c8101c7 ModemManager[1315]: [modem0] 3GPP Registration state changed (unknown -> registering) 
Jun 08 01:14:18 c8101c7 ModemManager[1315]: [modem0] 3GPP Registration state changed (registering -> roaming)
Jun 08 01:14:18 c8101c7 ModemManager[1315]: [modem0] state changed (connecting -> registered)
Jun 08 01:14:22 c8101c7 ModemManager[1315]: [modem0] 3GPP Registration state changed (roaming -> registering)
Jun 08 01:14:22 c8101c7 ModemManager[1315]: [modem0] 3GPP Registration state changed (registering -> home)

mmcli -m 0 --set-power-state-low

"error: couldn't set new power state in the modem: 'GDBus.Error:org.freedesktop.ModemManager1.Error.Core.WrongState: Cannot set power state: not in disabled state'"

mmcli -m 0 -d
After a long time:

"error: couldn't disable the modem: 'Timeout was reached'"

3 more requests

  1. Yes, on a working device, I get a bearer number. I got the same bearer number 20 minutes later.

mmcli -b BearerNumber

  ------------------------------------
  General            |      dbus path: /org/freedesktop/ModemManager1/Bearer/2
                     |           type: default
  ------------------------------------
  Status             |      connected: yes
                     |      suspended: no
                     |      interface: wwan0
                     |     ip timeout: 20
  ------------------------------------
  Properties         |            apn: wireless.twilio.com
                     |        roaming: allowed
                     |        ip type: ipv4v6
  ------------------------------------
  IPv6 configuration |         method: static
                     |        address: ---
                     |         prefix: 64
                     |        gateway: ---
                     |            dns: ---
                     |            mtu: 1500

qmicli -d /dev/cdc-wdm0 -p --wds-get-profile-list=3gpp
The APNs in the output from this command do not match the cellular file. Here are the APNs listed.

Profile list retrieved:
        [1] 3gpp - 
                APN: 'fast.t-mobile.com'
        [2] 3gpp - 
                APN: 'ims'
        [3] 3gpp - 
                APN: 'sos'
        [4] 3gpp - 
                APN: 'tmus'

If this needs to be fixed, how can I fix it?
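The comparison between the profiles stored on the modem and the APN in the cellular file can be automated. This is a sketch only: it parses text shaped like the profile listing above and reports whether an expected APN appears. It does not say whether a mismatch is actually the cause of the problem, which this thread does not establish. The function names are hypothetical.

```python
import re


def parse_profile_apns(qmicli_output):
    """Return {profile_index: apn} from text shaped like the listing above."""
    profiles = {}
    current = None
    for line in qmicli_output.splitlines():
        # Header lines look like "[1] 3gpp - "
        header = re.match(r"\s*\[(\d+)\]\s+3gpp", line)
        if header:
            current = int(header.group(1))
            continue
        # APN lines look like "APN: 'fast.t-mobile.com'"
        apn = re.match(r"\s*APN:\s*'(.*)'", line)
        if apn and current is not None:
            profiles[current] = apn.group(1)
    return profiles


def apn_is_provisioned(qmicli_output, expected_apn):
    """True if any stored profile carries the expected APN."""
    return expected_apn in parse_profile_apns(qmicli_output).values()
```

With the listing above as input and `wireless.twilio.com` as the expected APN, the check would report that no stored profile matches, which is the same conclusion reached by reading the output manually.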

Hi @dstewart,

Thanks for the additional diagnostic information. When you say “working device” I’m assuming that to mean a balenaFin + Quectel EC25-G. That is, I’m assuming that it is one of your test devices that you recently rebooted and that successfully connected via cellular. Did I make the correct assumption?

Regarding Quectel EG25-G firmware, a colleague reported improved performance with version EG25GGBR07A08M2G_01.001.01.001. That was in March, so you may already have that version or a later one. As you probably know, you can check the version number with mmcli -m 0 --command="ATI". It’s certainly possible that this is a firmware issue.

Unfortunately, nothing jumps out at me from the diagnostic information that you helpfully provided. I also reached out to two colleagues to see if they spot anything. The first colleague didn’t see anything. I’ll let you know what the 2nd colleague says.

I also asked a colleague that’s on the balenaFin team. He reports that the Quectel EC25-G is a good choice. In fact, he has one, and it works reliably.

I have a balenaFin and a Quectel EC25-AF, which has worked well for me. I just powered it on and I will let it run for a few days to see if it reproduces your issue.

I’ll let you know what I find out from colleagues and from the test.

Hi @dstewart,

Here is an update and a question for you. First the question:

  • Any news on the firmware for your EG25-G? As you noted, older firmware can be an issue.

The updates:

  • My colleagues with modem expertise are unfortunately puzzled by the symptoms, partly because the EG25-G is generally thought to be a good modem.

  • There is a newer version of ModemManager. We will be creating a Fin dev image with an updated ModemManager. I’ll be testing that here with my Fin + Quectel EC25-A. Would you like to try it as well?

Hello @dstewart ,

You can take a look at the end of this post for information on how to clear the modem profiles. It might be a good idea to try that and see if it helps solve the issue.

Cheers,

Hi again @dstewart

Have you made any progress with your tests of the EG25-G modem? Any luck on upgrading the firmware or cleaning up the modem profiles as per Nico’s suggestion?

Let us know if you have any breakthroughs. We are still trying to replicate on our side.

Hi @dstewart,

Any luck on upgrading the EG25-G firmware or clearing the modem profiles as per Nico’s suggestion above?

Sorry for the late reply. Actually, after resetting the EG25-G a second time a couple weeks ago, it connected again and has stayed connected ever since. The EG21-G also connected on its own without any known power cycles. I’m not sure what caused the initial connection instability, but it seems to be working very well now.

@dstewart, thank you for the good news! Glad to hear that it is working well now.