QEMU devices with management issues

ab1 · October 29, 2017, 1:18pm

Hi, we have a couple of devices, listed below, with containers running, but one of them doesn’t seem to be logging anything and neither is accessible via the Resin.io TTY:

apps/307727/devices/763172
apps/307727/devices/519904

They are both on Resin OS 2.3.0+rev1 (prod) sup: 6.1.3.

Is it possible to check from your end what’s happening in the host OS?

Both devices are running on QEMU network bridges and the iptables config is container default.

– ab1

izavits · October 30, 2017, 10:44am

Hello,
The first device appears to be offline. Is it offline on purpose?

Are those devices in production? Is it ok if we try to access the device?

ab1 · October 30, 2017, 11:04am

They are live, but the user must have taken the first one offline.

Perhaps you could take a look at TTY access on apps/307727/devices/519904?

izavits · October 30, 2017, 12:04pm

I see that the session is disconnected as soon as the terminal tries to start. I wonder if there’s something in that network blocking the access to the web terminal.

ab1 · October 30, 2017, 12:15pm

The container has full access to the network as far as I can see, but no traffic is going across the resin-vpn interface.

ab1 · October 30, 2017, 9:58pm

OK, I’ve managed to fix one of the devices (it was the firewall).

The remaining one, /apps/307727/devices/763172/summary, seems to have a problem with the supervisor not running (nothing is listening on port 48484):

# ifconfig docker0
docker0   Link encap:Ethernet  HWaddr 02:42:cd:e0:aa:0c
          inet addr:10.114.101.1  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

# ping 10.114.101.1 -c 1
PING 10.114.101.1 (10.114.101.1) 56(84) bytes of data.
64 bytes from 10.114.101.1: icmp_seq=1 ttl=64 time=0.134 ms
...

# telnet 10.114.101.1 48484
Trying 10.114.101.1...
telnet: Unable to connect to remote host: Connection refused

# telnet 127.0.0.1 48484
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused
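
The telnet checks above can also be reproduced with a short script, which helps when telnet is not installed. This is only a minimal sketch and not part of the original post: `probe_port` is a hypothetical helper name, and the address and port are the ones quoted above. A "refused" result means the host answered but nothing accepted the connection, which matches the telnet output; a "timeout" would point more towards packets being dropped along the way.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connect and classify the outcome.

    Returns 'open', 'refused', 'timeout', or 'error: <reason>'.
    """
    try:
        # create_connection performs the full TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Host reachable, but the connection was actively refused
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. network unreachable
        return f"error: {exc}"


# Example call (hypothetical usage): probe_port("10.114.101.1", 48484)
```

Running it against both 10.114.101.1 and 127.0.0.1 on port 48484 should give the same picture as the two telnet attempts.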

The firewall rules seem to be in order:

*filter
:INPUT ACCEPT [469:80190]
:FORWARD ACCEPT [17:1023]
:OUTPUT ACCEPT [450:48818]
:DOCKER - [0:0]
:DOCKER-ISOLATION - [0:0]
-A INPUT -i resin-vpn -p tcp -m tcp --dport 48484 -j ACCEPT
-A INPUT -i tun0 -p tcp -m tcp --dport 48484 -j ACCEPT
-A INPUT -i docker0 -p tcp -m tcp --dport 48484 -j ACCEPT
-A INPUT -i lo -p tcp -m tcp --dport 48484 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 48484 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -j DOCKER-ISOLATION
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER-ISOLATION -j RETURN
COMMIT

Any ideas?
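
For anyone reading along, the first-match logic of the INPUT chain above can be sketched in a few lines. This is an illustrative sketch only, not a real iptables parser: `input_verdict` is a hypothetical helper, it understands just the `-i`, `-p`, `--dport` and `-j` options used in the dump, and it ignores negation (`!`) and every other match. The rules are copied from the post.

```python
RULES = """\
:INPUT ACCEPT [469:80190]
-A INPUT -i resin-vpn -p tcp -m tcp --dport 48484 -j ACCEPT
-A INPUT -i tun0 -p tcp -m tcp --dport 48484 -j ACCEPT
-A INPUT -i docker0 -p tcp -m tcp --dport 48484 -j ACCEPT
-A INPUT -i lo -p tcp -m tcp --dport 48484 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 48484 -j REJECT --reject-with icmp-port-unreachable
"""


def input_verdict(rules: str, iface: str, dport: int, proto: str = "tcp") -> str:
    """Return the target of the first INPUT rule that matches, else the chain policy."""
    policy = "ACCEPT"
    for line in rules.splitlines():
        tokens = line.split()
        if tokens[:1] == [":INPUT"]:
            policy = tokens[1]  # default policy of the chain
            continue
        if tokens[:2] != ["-A", "INPUT"]:
            continue
        # Collect the option/value pairs this sketch understands
        opts = {}
        for flag, value in zip(tokens[2:], tokens[3:]):
            if flag in ("-i", "-p", "--dport", "-j"):
                opts[flag] = value
        if "-i" in opts and opts["-i"] != iface:
            continue
        if "-p" in opts and opts["-p"] != proto:
            continue
        if "--dport" in opts and int(opts["--dport"]) != dport:
            continue
        if "-j" in opts:
            return opts["-j"]  # first matching rule wins
    return policy


for iface in ("resin-vpn", "tun0", "docker0", "lo", "eth0"):
    print(iface, input_verdict(RULES, iface, 48484))
```

Under these assumptions, port 48484 is accepted on resin-vpn, tun0, docker0 and lo and rejected on any other interface (eth0 here is just an example name). On an accepted interface, a refused connection therefore suggests that nothing is listening, which fits the suspicion that the supervisor is not running.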

izavits · October 31, 2017, 8:37am

We are taking a look and will let you know. I may have to reach out to the team for more suggestions on this matter.
Thanks,
ilias

izavits · October 31, 2017, 10:11am

How much RAM are you assigning to this device?

ab1 · October 31, 2017, 11:10am

             total       used       free     shared    buffers     cached
Mem:           992        687        305         16         12        529
-/+ buffers/cache:        144        847
Swap:            0          0          0

Not enough?
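
When reading output like this, note that the "used" column on the Mem row includes buffers and page cache, which the kernel can reclaim. A rough sketch of the arithmetic follows, assuming the figures are in MB (the output looks like `free -m`, but the post does not say); `effective_memory` is a hypothetical helper name.

```python
def effective_memory(used: int, free: int, buffers: int, cached: int):
    """Return (used_by_programs, available), in the same unit as the inputs."""
    reclaimable = buffers + cached  # memory the kernel can give back under pressure
    return used - reclaimable, free + reclaimable


# Values taken from the Mem row above
used_by_programs, available = effective_memory(used=687, free=305, buffers=12, cached=529)
print(used_by_programs, available)  # 146 846
```

These come out within a couple of MB of the "-/+ buffers/cache" row (144 and 847); the small difference is presumably rounding. Whether that is enough for this device is a separate question that the sketch does not answer.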

izavits · October 31, 2017, 11:17am

Do the other devices that you have in the same application have the same configurations?

ab1 · October 31, 2017, 11:22am

This one /apps/307727/devices/826833/summary is more or less the same as far as I can see and is working.

Are you able to get onto the host OS on the problematic device?

Is the supervisor running?

izavits · October 31, 2017, 11:26am

The supervisor is not running, but it is not a supervisor-related problem. The fact that it is not running is probably due to other factors, but we are not sure yet. We suspect filesystem corruption because we see lots of I/O errors, but we are still investigating.

ab1 · October 31, 2017, 11:27am

It is possible - I have no direct access to the device and the person who does isn’t responding. So if you can’t do anything from your end, let’s just forget about it and when (if) they get in touch with me, I’ll tell them to rebuild.
