Docker-compose network service names not resolving correctly

Hi,

I’m having some issues that seem to be specific to the way Balena interprets and deploys docker-compose services.

I basically have 2 services, on the same bridge network, and I try to access one container from another by using its service name:

version: '2.1'

networks:
  localnet:
    driver: bridge

services:
  cloud-interface:
    build: .
    networks:
      - localnet

  local-broker:
    build: localmqtt
    restart: always
    ports:
      - 1883:1883
    networks:
      - localnet

In the example above, my cloud-interface service, a simple Go application that tries to connect to the local MQTT broker, uses tcp://local-broker:1883 as the broker endpoint.

The error I get is:
Network Error : dial tcp: lookup local-broker on 127.0.0.11:53: no such host

If I deploy the exact same stack on my laptop, everything works as expected.

Any ideas?

thank you,
nelson

Hi @nelson, have you tried deploying this without the networks configuration? IIRC it should work fine in the scenario you’ve described without that.

hi @chrisys,

yes, it works without the networks configuration, and it also works if I use network_mode: host on both services.
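
For reference, a minimal sketch of the variant without networks, assuming nothing else in my compose file changes. The top-level networks block and the per-service networks keys are removed, so both services end up on the default network created for the application, and local-broker resolves by service name there. The quotes around the port mapping are my addition here, not part of the original file.

```yaml
version: '2.1'

services:
  cloud-interface:
    build: .
    # no "networks" key: the service joins the default network

  local-broker:
    build: localmqtt
    restart: always
    ports:
      # quoted so YAML always reads the mapping as a string
      - "1883:1883"
```

With this layout the Go application keeps using tcp://local-broker:1883, but there is no isolation between groups of services.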

What I want is really to use networks though! I have many services running, multiple brokers, etc… and want to use networks to keep services isolated.

Are networks not supported by Balena at the moment?

thank you,
nelson

Hi @nelson
It looks like there might have been an issue in earlier versions of Balena. Which OS version are you experiencing this problem on?
Regards
Thomas

hi @samothx,

the devices are running the latest OS for the RPi: balenaOS 2.29.2+rev2

thanks,
nelson

Hi there,

Just to keep you updated, one of our engineers will be looking at the issue soon (probably this week), and we will inform you once the problem has been investigated and resolved.

Regards,
Steve

Thanks for the update @sradevski, looking forward to feedback on this!

cheers,
nelson

This issue is not resolved yet, but I want to share what I believe to be the relevant GitHub issue and pull request so they can be followed:

We’ll also post an update here when the work is concluded.

Hey,

I seem to have the same issue with BalenaOS64 for the Raspberry Pi 3. I tried changing the network mode to host, but the error persists.

Any idea how to solve it, even if only temporarily? @nelson

During the setup phase of consul, I get this error:

ERROR: 2019/04/07 17:28:36 Get http://edgex-core-consul:8500/v1/agent/self: dial tcp: lookup edgex-core-consul on 127.0.0.11:53: no such host

Thank you for your time guys!

Best,
Odysseas

The docker-compose file that I use is the following:

# /*******************************************************************************
#  * Copyright 2018 Dell Inc.
#  *
#  * Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
#  * in compliance with the License. You may obtain a copy of the License at
#  *
#  * http://www.apache.org/licenses/LICENSE-2.0
#  *
#  * Unless required by applicable law or agreed to in writing, software distributed under the License
#  * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
#  * or implied. See the License for the specific language governing permissions and limitations under
#  * the License.
#  *
#  * @author: Jim White, Dell
#  * EdgeX Foundry, Delhi, version 0.7.1
#  * added: Dec 10, 2018
#  *******************************************************************************/

version: '2.1'
volumes:
  db-data:
  log-data:
  consul-config:
  consul-data:
  # portainer_data:

services:

  volume:
    image: edgexfoundry/docker-edgex-volume-arm64:0.8.0
    #   container_name: edgex-files
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
      
  consul:
    image: consul:1.4.0
    ports:
      - "8400:8400"
      - "8500:8500"
      - "8600:8600"
#   container_name: edgex-core-consul
    hostname: edgex-core-consul
    networks:
      edgex-network:
        aliases:
            - edgex-core-consul
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - volume  

  config-seed:
    image: edgexfoundry/docker-core-config-seed-go-arm64:0.7.1
#   container_name: edgex-config-seed
    hostname: edgex-core-config-seed
    networks:
      edgex-network:
        aliases:
            - edgex-core-config-seed
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - volume
      - consul
      
  mongo:
    image: edgexfoundry/docker-edgex-mongo-arm64:0.8.0
    ports:
      - "27017:27017"
#    container_name: edgex-mongo
    hostname: edgex-mongo
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - volume

  logging:
    image: edgexfoundry/docker-support-logging-go-arm64:0.7.1
    ports:
      - "48061:48061"
#    container_name: edgex-support-logging
    hostname: edgex-support-logging
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - config-seed
      - mongo
      - volume

  notifications:
    image: edgexfoundry/docker-support-notifications-go-arm64:0.7.1
    ports:
      - "48060:48060"
#    container_name: edgex-support-notifications
    hostname: edgex-support-notifications
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - logging

  metadata:
    image: edgexfoundry/docker-core-metadata-go-arm64:0.7.1
    ports:
      - "48081:48081"
#    container_name: edgex-core-metadata
    hostname: edgex-core-metadata
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - logging

  data:
    image: edgexfoundry/docker-core-data-go-arm64:0.7.1
    ports:
      - "48080:48080"
      - "5563:5563"
#    container_name: edgex-core-data
    hostname: edgex-core-data
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - logging

  command:
    image: edgexfoundry/docker-core-command-go-arm64:0.7.1
    ports:
      - "48082:48082"
#    container_name: edgex-core-command
    hostname: edgex-core-command
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - metadata

  scheduler:
    image: edgexfoundry/docker-support-scheduler-go-arm64:0.7.1
    ports:
      - "48085:48085"
#    container_name: edgex-support-scheduler
    hostname: edgex-support-scheduler
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - metadata

  export-client:
    image: edgexfoundry/docker-export-client-go-arm64:0.7.1
    ports:
      - "48071:48071"
#    container_name: edgex-export-client
    hostname: edgex-export-client
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - data
    environment:
      - EXPORT_CLIENT_MONGO_URL=edgex-mongo
      - EXPORT_CLIENT_DISTRO_HOST=export-distro
      - EXPORT_CLIENT_CONSUL_HOST=edgex-config-seed

  export-distro:
    image: edgexfoundry/docker-export-distro-go-arm64:0.7.1
    ports:
      - "48070:48070"
      - "5566:5566"
 #   container_name: edgex-export-distro
    hostname: edgex-export-distro
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - export-client
    environment:
      - EXPORT_DISTRO_CLIENT_HOST=export-client
      - EXPORT_DISTRO_DATA_HOST=edgex-core-data
      - EXPORT_DISTRO_CONSUL_HOST=edgex-config-seed
      - EXPORT_DISTRO_MQTTS_CERT_FILE=none
      - EXPORT_DISTRO_MQTTS_KEY_FILE=none

  rulesengine:
    image: edgexfoundry/docker-support-rulesengine:0.7.0
    ports:
      - "48075:48075"
 #   container_name: edgex-support-rulesengine
    hostname: edgex-support-rulesengine
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data


#################################################################
# Device Services
#################################################################

  device-virtual:
    image: edgexfoundry/docker-device-virtual:0.6.0
    ports:
      - "49990:49990"
#    container_name: edgex-device-virtual
    hostname: edgex-device-virtual
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - data
      - command

  device-random:
    image: edgexfoundry/docker-device-random-go-arm64:0.7.1
    ports:
      - "49988:49988"
#    container_name: edgex-device-random
    hostname: edgex-device-random
    networks:
      - edgex-network
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - data
      - command

#  device-mqtt:
#    image: edgexfoundry/docker-device-mqtt-go-arm64:0.7.1
#    ports:
#      - "49982:49982"
#    container_name: edgex-device-mqtt
#    hostname: edgex-device-mqtt
#    networks:
#      - edgex-network
#    volumes:
#      - db-data:/data/db
#      - log-data:/edgex/logs
#      - consul-config:/consul/config
#      - consul-data:/consul/data
#    depends_on:
#      - data
#      - command

#  device-modbus:
#    image: edgexfoundry/docker-device-modbus-go-arm64:0.7.1
#    ports:
#      - "49991:49991"
#    container_name: edgex-device-modbus
#    hostname: edgex-device-modbus
#    networks:
#      - edgex-network
#    volumes:
#      - db-data:/data/db
#      - log-data:/edgex/logs
#      - consul-config:/consul/config
#      - consul-data:/consul/data
#    depends_on:
#      - data
#      - command

#   device-bluetooth:
#     image: nexus3.edgexfoundry.org:10004/docker-device-bluetooth:0.6.0
#     ports:
#       - "49988:49988"
#       - "5000:5000"
#     container_name: edgex-device-bluetooth
#     hostname: edgex-device-bluetooth
#     privileged: true  
#     network_mode: "host"
#     cap_add:
#       - NET_ADMIN
# #    networks:
# #      - edgex-network
#    volumes:
#      - db-data:/data/db
#      - log-data:/edgex/logs
#      - consul-config:/consul/config
#      - consul-data:/consul/data
#     depends_on:
#       - data
#       - command

#   device-snmp:
#     image: nexus3.edgexfoundry.org:10004/docker-device-snmp:0.6.0
#     ports:
#       - "49989:49989"
#     container_name: edgex-device-snmp
#     hostname: edgex-device-snmp
#     networks:
#       - edgex-network
#    volumes:
#      - db-data:/data/db
#      - log-data:/edgex/logs
#      - consul-config:/consul/config
#      - consul-data:/consul/data
#     depends_on:
#       - data
#       - command

#   device-fischertechnik:
#     image: nexus3.edgexfoundry.org:10004/docker-device-fischertechnik:0.6.0
#     ports:
#       - "49985:49985"
#     container_name: edgex-device-fischertechnik
#     networks:
#       - edgex-network
#    volumes:
#      - db-data:/data/db
#      - log-data:/edgex/logs
#      - consul-config:/consul/config
#      - consul-data:/consul/data
#     privileged: true
#     depends_on:
#       - data
#       - command

#   device-bacnet:
#     image: nexus3.edgexfoundry.org:10004/docker-device-bacnet:0.6.0
#     ports:
#       - "49986:49986"
#       - "5002:5002"
#     container_name: edgex-device-bacnet
#     hostname: edgex-device-bacnet
#     networks:
#       - edgex-network
#    volumes:
#      - db-data:/data/db
#      - log-data:/edgex/logs
#      - consul-config:/consul/config
#      - consul-data:/consul/data
#    depends_on:
#       - data
#       - command    

#################################################################
# UIs
#################################################################
#  ui:
#    image: edgexfoundry/docker-edgex-ui-go:0.1.1
#    ports:
#      - "4000:4000"
#    container_name: edgex-ui-go
#    hostname: edgex-ui-go
#    networks:
#      - edgex-network
#    volumes:
#      - db-data:/data/db
#      - log-data:/edgex/logs
#      - consul-config:/consul/config
#      - consul-data:/consul/data
#    depends_on:
#      - data
#      - command

#################################################################
# Tooling
#################################################################

  # portainer:
  #   image:  portainer/portainer
  #   ports:
  #     - "9000:9000"
  #   command: -H unix:///var/run/docker.sock
  #   volumes:
  #     - /var/run/docker.sock:/var/run/docker.sock
  #     - portainer_data:/data
  #   depends_on:
  #     - volume  
  
networks:
  edgex-network:
    driver: "host"

...

Hi @odys,

Have you tried deploying without the networks configuration, and adding network_mode: host to all services that require network access?

Chris and Nelson confirmed that this works fine for them as a workaround.
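
A minimal sketch of that workaround, based on the two-service example at the top of this thread rather than on your EdgeX file. Two assumptions on my side: the ports mapping is dropped because published ports are not used with host networking, and the client then reaches the broker through the device itself (for example tcp://localhost:1883), since as far as I know service-name lookup does not apply in host mode.

```yaml
version: '2.1'

services:
  cloud-interface:
    build: .
    # shares the device's network stack instead of a compose network
    network_mode: host

  local-broker:
    build: localmqtt
    restart: always
    network_mode: host
    # no "ports" entry: with host networking the broker listens
    # directly on port 1883 of the device
```

The trade-off is that every service in host mode shares the device's ports, so two brokers cannot both listen on 1883.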

And as Paulo noted, we plan to release the actual fix soon (it needs to go through some more internal tests first).

Hey @gelbal,

So I should both remove the networks configuration and set network_mode: host on all services?

Will both bridge mode and custom networks be usable after this bug fix? And when can we expect it for the Balena64 BETA version?

Thanks!

The fix is already merged so it’s a matter of putting together other fixes to bundle everything in our next OS release. I will tag this forum thread in the relevant issue, so this forum gets pinged when we make the release:

I’d expect the fix to work with bridge mode as well.

Honestly, I have not tried to reproduce the issue myself, but judging from the response from @chrisys, yes, what you describe should work.

Hey,

Just wanted a second opinion on whether I am diagnosing the problem correctly. My diagnosis is that, due to a bug related to container networking, containers can’t find each other. The docker-compose file is posted above and the suite is the EdgeX IoT Platform.

Thanks everybody for your time!

Proof (from the logs):

10.04.19 10:06:58 (+0300)  logging  ERROR: 2019/04/10 07:06:58 connection to Consul could not be made: Put http://edgex-core-consul:8500/v1/agent/service/register: dial tcp: lookup edgex-core-consul on 10.114.102.1:53: no such host
10.04.19 10:06:58 (+0300)  command  ERROR: 2019/04/10 07:06:58 connection to Consul could not be made: Put http://edgex-core-consul:8500/v1/agent/service/register: dial tcp: lookup edgex-core-consul on 10.114.102.1:53: no such host
10.04.19 10:06:58 (+0300)  scheduler  ERROR: 2019/04/10 07:06:58 connection to Consul could not be made: Put http://edgex-core-consul:8500/v1/agent/service/register: dial tcp: lookup edgex-core-consul on 10.114.102.1:53: no such host
10.04.19 10:06:58 (+0300)  export-client  ERROR: 2019/04/10 07:06:58 connection to Consul could not be made: Put http://edgex-core-consul:8500/v1/agent/service/register: dial tcp: lookup edgex-core-consul on 10.114.102.1:53: no such host
10.04.19 10:06:58 (+0300)  metadata  ERROR: 2019/04/10 07:06:58 connection to Consul could not be made: Put http://edgex-core-consul:8500/v1/agent/service/register: dial tcp: lookup edgex-core-consul on 10.114.102.1:53: no such host
10.04.19 10:06:59 (+0300)  data  ERROR: 2019/04/10 07:06:59 connection to Consul could not be made: Put http://edgex-core-consul:8500/v1/agent/service/register: dial tcp: lookup edgex-core-consul on 10.114.102.1:53: no such host
10.04.19 10:06:59 (+0300)  export-distro  ERROR: 2019/04/10 07:06:59 connection to Consul could not be made: Put http://edgex-core-consul:8500/v1/agent/service/register: dial tcp: lookup edgex-core-consul on 10.114.102.1:53: no such host
10.04.19 10:06:59 (+0300)  notifications  ERROR: 2019/04/10 07:06:59 connection to Consul could not be made: Put http://edgex-core-consul:8500/v1/agent/service/register: dial tcp: lookup edgex-core-consul on 10.114.102.1:53: no such host

Hi @odys,

Default networking between containers works fine (at least in my use case: RPi + balenaOS v2.29.2+rev2). The only issue I had, and what seems to be an issue in general, is using custom networks to create isolated networking groups.

Like @gelbal said, just use the default bridge network. Remove your custom network from the compose file and get all your containers running on the same default network until the fix is available.

cheers,
nelson

Hey @nelson,

I removed the networks bit as well as the network mode. My services still can’t find consul. I suspect hostname might be bugged as well, because I use different service names and hostnames.

What exactly do you mean by "get all your containers running on the same default network"? Just want to be 100% sure.

What do you think?

QUICK UPDATE: The services are discoverable using the service-name in lieu of the hostname in the address.
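
One way to apply this without touching the EdgeX configuration, as an untested sketch and an assumption on my part rather than anything confirmed by the balena team: name the service after the host the other containers already look up (edgex-core-consul). As far as I understand, hostname: only sets the name inside the container itself, while the other containers on the default network resolve the service name. Only the consul service and the one depends_on entry that refers to it are shown; the rest of the file stays as posted above, minus the networks keys.

```yaml
services:
  # was "consul"; renamed so the service name matches the host that
  # the other EdgeX services try to reach
  edgex-core-consul:
    image: consul:1.4.0
    ports:
      - "8400:8400"
      - "8500:8500"
      - "8600:8600"
    hostname: edgex-core-consul
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - volume

  config-seed:
    image: edgexfoundry/docker-core-config-seed-go-arm64:0.7.1
    hostname: edgex-core-config-seed
    volumes:
      - db-data:/data/db
      - log-data:/edgex/logs
      - consul-config:/consul/config
      - consul-data:/consul/data
    depends_on:
      - volume
      # depends_on has to use the new service name as well
      - edgex-core-consul
```

The same renaming would apply to any other service whose hostname differs from its service name, for example mongo and edgex-mongo.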

Cheers!