Error: grpc: the connection is unavailable

Hello, today I got a builder error during resin push.

[Error] Some services failed to build:
[Error] Service: main
[Error] Error: grpc: the connection is unavailable
[Error] Not deploying release.

But after a few tries it was successfully built and uploaded.
Again, as mentioned in my last post (Full app download after "Error: Builder disconnected" message), the device downloaded the full app, not only the upgraded part.

I am getting this with both my own projects and the sample project https://github.com/influxdata/resin-influx

Seems like there are some issues, although nothing is showing on https://status.resin.io/

Thanks for the reports, we’re looking into it

Thanks again for the reports.

Our team has been able to locate and look into the transient gRPC communication issue, which occurred on one of our builder instances from 22:00 to 13:00 BST. Because it occurred on only one of our build instances, it affected only a percentage of all builds, and fortunately repeat builds seem to have gone through during that timeframe.

We have resolved the issue, and will continue to work to prevent repetition.
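
For anyone scripting their pushes in the meantime, the "repeat builds go through" behaviour could be wrapped in a small retry helper. This is only a rough sketch and not an official tool: `push_with_retry`, `run_command`, the attempt count, and the delay are all made-up names and values. It also assumes the error text appears in the command output, as it does in the logs in this thread.

```python
import subprocess
import time
from typing import Callable, Sequence, Tuple

# Error text copied from the build output in this thread.
TRANSIENT_MARKER = "grpc: the connection is unavailable"


def run_command(cmd: Sequence[str]) -> Tuple[int, str]:
    """Run cmd and return (exit code, combined stdout and stderr)."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.returncode, proc.stdout


def push_with_retry(
    cmd: Sequence[str],
    attempts: int = 3,
    delay: float = 30.0,
    runner: Callable[[Sequence[str]], Tuple[int, str]] = run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Re-run cmd while its output contains the transient gRPC error.

    Returns True on a clean run, False on any other failure or once
    the attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        code, output = runner(cmd)
        if TRANSIENT_MARKER in output:
            # Looks like the builder-side problem: wait, then try again.
            if attempt < attempts:
                sleep(delay)
            continue
        # No transient marker: success or a real build error, so stop here.
        return code == 0
    return False


# Hypothetical usage (the command and app name are assumptions):
# push_with_retry(["balena", "push", "myApp"])
```

Retrying only on the known error text keeps genuine Dockerfile errors from being retried pointlessly.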

Hey chrischabot, the same issue is happening to me today

[main] grpc: the connection is unavailable

Broken for me too on arm01.
Works ok when arm02 builds it.

I am currently encountering this error when trying to connect to a container via SSH on one of our devices. I can’t say for sure yet whether it is device specific. But I suspect it’s a different issue, since the device is generally doing weird stuff at the moment, like randomly restarting all containers.

Happening to me today. Interesting that the first post was exactly a year ago - is there some pattern there?

I am getting the same issue repeatedly when building with arm01

Today I got this message again. Is April 18 a “magic date”?
Looking at the messages in this thread, it seems that the 18th day of each month could be a problem.

UPD:
Yesterday I got a failed build with this message:
“E: Some index files failed to download. They have been ignored, or old ones used instead.”
I tried building the same app on another balena account, and it was successful.

Today I have the grpc error on my account.

I’m also having this issue right now. Here are my logs, including the Dockerfile as sent to me by Balena engineering:

[Info]     Starting build for wifi-eus-test, user XXXXXXXXXXXXXXXX
[Info]     Dashboard link: https://dashboard.balena-cloud.com/apps/XXXXXXX/devices
[Info]     Building on arm03
[Info]     Pulling previous images for caching purposes...
[Success]  Successfully pulled cache images
[main]     Step 1/6 : FROM balenalib/raspberrypi3-debian
[main]      ---> 6dafa0cd68d7
[main]     Step 2/6 : RUN apt-get update && apt-get install -y curl wget build-essential libelf-dev awscli bc flex libssl-dev python pkg-config git hostapd libnl-3-dev libnl-genl-3-dev
[main]      ---> Running in 8509fc6ac33a
[main]     Removing intermediate container 8509fc6ac33a
[Info]     Uploading images
[main]     grpc: the connection is unavailable
[Success]  Successfully uploaded images
[Error]    Some services failed to build:
[Error]      Service: main
[Error]        Error: grpc: the connection is unavailable
[Error]    Not deploying release.

UPDATE: If I comment out step 2 (RUN apt-get update && apt-get install), the build will complete. I added another RUN statement to the Dockerfile, and the build failed on that step, too. Perhaps there is something wrong with the RUN statements?

Having it all night now as well. Docker files haven’t changed, but something on the platform has definitely gone bad…

Yep, also running into build problems with our containers. Was able to build some of our services locally though.

Hey @mpfluger, @letrich, @jayatrbt, @ejohnso49

I’m so sorry for the inconvenience. Our team has been able to locate and look into the issue. On our builders, the docker-containerd process fails, and its parent process (dockerd) fails to detect the failure or respawn the child process, resulting in this rpc error when trying to reach it.

We have resolved the issue, and will continue to work to prevent repetition.
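
To illustrate the kind of supervision that went missing here (a parent noticing that its child has exited and starting it again), here is a generic sketch. It is not how dockerd actually manages containerd. `supervise`, `max_restarts`, and `poll_interval` are made-up names, and the child command is a stand-in that simply exits with an error.

```python
import subprocess
import sys
import time
from typing import List, Sequence


def supervise(
    cmd: Sequence[str], max_restarts: int = 3, poll_interval: float = 0.1
) -> List[int]:
    """Start cmd and start it again each time it exits.

    Gives up after max_restarts restarts and returns every exit code seen.
    """
    exit_codes: List[int] = []
    child = subprocess.Popen(cmd)
    while True:
        code = child.poll()  # None while the child is still running
        if code is None:
            time.sleep(poll_interval)
            continue
        # The child has exited: record the exit code, then respawn or give up.
        exit_codes.append(code)
        if len(exit_codes) > max_restarts:
            return exit_codes
        child = subprocess.Popen(cmd)


# Hypothetical usage with a child that always fails immediately:
# supervise([sys.executable, "-c", "import sys; sys.exit(1)"], max_restarts=2)
```

The point of the sketch is the `poll()` check: if the parent never looks at the child's state, requests to the dead child fail in the way described above.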

I am seeing this currently.

...
[broker]   Successfully built a6eaa9e4610c
[gateway]   ---> Running in 23ba3eb2f154
[gateway]  Removing intermediate container 23ba3eb2f154
[gateway]  grpc: the connection is unavailable
[app]      Step 1/9 : FROM wlisac/generic-armv7ahf-swift:4.2.3
[app]       ---> 6957cfcc32ff
[app]      Step 2/9 : RUN install_packages libmosquitto-dev
[app]       ---> Running in 1fa301ec83ea
[app]      Removing intermediate container 1fa301ec83ea
[Info]     Uploading images
[app]      grpc: the connection is unavailable
[Success]  Successfully uploaded images
[Error]    Some services failed to build:
[Error]      Service: app
[Error]        Error: grpc: the connection is unavailable
[Error]      Service: undefined
[Error]        Error: Information not available
[Error]      Service: gateway
[Error]        Error: grpc: the connection is unavailable
[Error]    Not deploying release.
Remote build failed
...

I’m seeing this now, too.

…
[Info]     Building on arm01
[Info]     Pulling previous images for caching purposes...
[Success]  Successfully pulled cache images
[main]     Step 1/8 : FROM wlisac/raspberrypi3-swift:5.0.1-build AS build
[main]      ---> 9a965840315a
[main]     Step 2/8 : WORKDIR /app
[main]      ---> 6732b373b67d
[main]     Removing intermediate container 4cfd938ec67d
[main]     Step 3/8 : COPY . ./
[main]      ---> 399a6fb4e975
[main]     Removing intermediate container b20299fea4a1
[main]     Step 4/8 : RUN swift build
[main]      ---> Running in fb49cb372a74
[main]     Removing intermediate container fb49cb372a74
[Info]     Uploading images
[main]     grpc: the connection is unavailable
[Success]  Successfully uploaded images
[Error]    Some services failed to build:
[Error]      Service: main
[Error]        Error: grpc: the connection is unavailable
[Error]    Not deploying release.

I just had a successful build of the same project on arm03. Maybe it’s specific to arm01?
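
In case it helps others compare builders, here is a rough sketch that pulls the builder name and the failed services out of a saved build log. It assumes the log looks like the output pasted in this thread (`Building on ...`, `Service: ...`, `Error: ...`). `summarize_build_log` is a made-up helper name, not part of any CLI.

```python
import re
from typing import Dict, Optional

# Patterns based on the log lines pasted in this thread.
BUILDER_RE = re.compile(r"^\[Info\]\s+Building on (\S+)")
SERVICE_RE = re.compile(r"^\[Error\]\s+Service: (\S+)")
ERROR_RE = re.compile(r"^\[Error\]\s+Error: (.+)$")


def summarize_build_log(log: str) -> Dict[str, object]:
    """Return the builder name (or None) and a {service: error} mapping."""
    builder: Optional[str] = None
    failures: Dict[str, str] = {}
    current_service: Optional[str] = None
    for raw_line in log.splitlines():
        line = raw_line.strip()
        match = BUILDER_RE.match(line)
        if match:
            builder = match.group(1)
            continue
        match = SERVICE_RE.match(line)
        if match:
            current_service = match.group(1)
            continue
        match = ERROR_RE.match(line)
        if match and current_service is not None:
            # Pair each error with the service line that preceded it.
            failures[current_service] = match.group(1).strip()
            current_service = None
    return {"builder": builder, "failures": failures}


SAMPLE = """\
[Info]     Building on arm01
[Info]     Uploading images
[main]     grpc: the connection is unavailable
[Error]    Some services failed to build:
[Error]      Service: main
[Error]        Error: grpc: the connection is unavailable
[Error]    Not deploying release.
"""

print(summarize_build_log(SAMPLE))
```

Running this over a few failed and successful logs would show quickly whether the failures cluster on one builder.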

Hey @wlisac thanks for reporting, I’m investigating arm01 now, but if you retry and get a different builder it should be working ok.

Taking a quick look, it seems to be a bug in the docker daemon (or one of its supporting modules, such as containerd). A restart of the daemon does fix things, but we’re investigating the root cause too. You should be able to push to any arm host now. Thanks again for the report!

Thanks @CameronDiver – looks good now. And thanks to @incanus for noticing it first. :)
