Running check-update returns a 502 Bad Gateway

Hello,

Our device runs a Yocto Kirkstone build with MENDER_UPDATE_POLL_INTERVAL_SECONDS set to 315569520 (about 10 years), so updates only run when we trigger them manually. We are also running Mender server 3.4 on GKE Autopilot and hosting the artifacts in a Cloud Storage bucket.

After a device is accepted, running bootstrap and send-inventory from the command line works fine. However, running check-update returns a 502 Bad Gateway error.

The full log is:

Dec 07 07:36:16 k1272 mender[1111]: time="2022-12-07T07:36:16Z" level=info msg="Forced wake-up from sleep"
Dec 07 07:36:16 k1272 mender[1111]: time="2022-12-07T07:36:16Z" level=info msg="Forced wake-up from sleep"
Dec 07 07:36:16 k1272 mender[1111]: time="2022-12-07T07:36:16Z" level=info msg="Forcing state machine to: update-check"
Dec 07 07:36:16 k1272 mender[1111]: time="2022-12-07T07:36:16Z" level=info msg="State transition: check-wait [Idle] -> update-check [Sync]"
Dec 07 07:36:16 k1272 mender[1111]: time="2022-12-07T07:36:16Z" level=warning msg="Returning artifact name from /etc/mender/artifact_info file. This is a fallback, in case the information can not be retrieved from the database, and is only expected when an update has never been installed before."
Dec 07 07:36:16 k1272 mender[1111]: time="2022-12-07T07:36:16Z" level=info msg="Forcing state machine to: update-check"
Dec 07 07:36:16 k1272 mender[1111]: time="2022-12-07T07:36:16Z" level=info msg="State transition: check-wait [Idle] -> update-check [Sync]"
Dec 07 07:36:16 k1272 mender[1111]: time="2022-12-07T07:36:16Z" level=warning msg="Returning artifact name from /etc/mender/artifact_info file. This is a fallback, in case the information can not be retrieved from the database, and is only expected when an update has never been installed before."
Dec 07 07:36:47 k1272 mender[1111]: time="2022-12-07T07:36:47Z" level=error msg="Error receiving scheduled update data: failed to check update info on the server. Response: &{502 Bad Gateway 502 HTTP/1.1 1 1 map[Alt-Svc:[h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000] Content-Length:[332] Content-Type:[text/html; charset=UTF-8] Date:[Wed, 07 Dec 2022 07:36:47 GMT] Referrer-Policy:[no-referrer]] 0x1526790 332 [] false false map[] 0x1624280 <nil>}"
Dec 07 07:36:47 k1272 mender[1111]: time="2022-12-07T07:36:47Z" level=error msg="Update check failed: transient error: failed to check update info on the server. Response: &{502 Bad Gateway 502 HTTP/1.1 1 1 map[Alt-Svc:[h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000] Content-Length:[332] Content-Type:[text/html; charset=UTF-8] Date:[Wed, 07 Dec 2022 07:36:47 GMT] Referrer-Policy:[no-referrer]] 0x1526790 332 [] false false map[] 0x1624280 <nil>}"
Dec 07 07:36:47 k1272 mender[1111]: time="2022-12-07T07:36:47Z" level=error msg="Error receiving scheduled update data: failed to check update info on the server. Response: &{502 Bad Gateway 502 HTTP/1.1 1 1 map[Alt-Svc:[h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000] Content-Length:[332] Content-Type:[text/html; charset=UTF-8] Date:[Wed, 07 Dec 2022 07:36:47 GMT] Referrer-Policy:[no-referrer]] 0x1526790 332 [] false false map[] 0x1624280 <nil>}"
Dec 07 07:36:47 k1272 mender[1111]: time="2022-12-07T07:36:47Z" level=error msg="Update check failed: transient error: failed to check update info on the server. Response: &{502 Bad Gateway 502 HTTP/1.1 1 1 map[Alt-Svc:[h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000] Content-Length:[332] Content-Type:[text/html; charset=UTF-8] Date:[Wed, 07 Dec 2022 07:36:47 GMT] Referrer-Policy:[no-referrer]] 0x1526790 332 [] false false map[] 0x1624280 <nil>}"
Dec 07 07:36:47 k1272 mender[1111]: time="2022-12-07T07:36:47Z" level=info msg="State transition: update-check [Sync] -> error [Error]"
Dec 07 07:36:47 k1272 mender[1111]: time="2022-12-07T07:36:47Z" level=info msg="State transition: update-check [Sync] -> error [Error]"
Dec 07 07:36:47 k1272 mender[1111]: time="2022-12-07T07:36:47Z" level=info msg="Handling error state, current error: transient error: failed to check update info on the server. Response: &{502 Bad Gateway 502 HTTP/1.1 1 1 map[Alt-Svc:[h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000] Content-Length:[332] Content-Type:[text/html; charset=UTF-8] Date:[Wed, 07 Dec 2022 07:36:47 GMT] Referrer-Policy:[no-referrer]] 0x1526790 332 [] false false map[] 0x1624280 <nil>}"
Dec 07 07:36:47 k1272 mender[1111]: time="2022-12-07T07:36:47Z" level=info msg="State transition: error [Error] -> idle [Idle]"
Dec 07 07:36:47 k1272 mender[1111]: time="2022-12-07T07:36:47Z" level=info msg="Handling error state, current error: transient error: failed to check update info on the server. Response: &{502 Bad Gateway 502 HTTP/1.1 1 1 map[Alt-Svc:[h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000] Content-Length:[332] Content-Type:[text/html; charset=UTF-8] Date:[Wed, 07 Dec 2022 07:36:47 GMT] Referrer-Policy:[no-referrer]] 0x1526790 332 [] false false map[] 0x1624280 <nil>}"
Dec 07 07:36:47 k1272 mender[1111]: time="2022-12-07T07:36:47Z" level=info msg="State transition: idle [Idle] -> check-wait [Idle]"
Dec 07 07:36:47 k1272 mender[1111]: time="2022-12-07T07:36:47Z" level=info msg="State transition: error [Error] -> idle [Idle]"
Dec 07 07:36:47 k1272 mender[1111]: time="2022-12-07T07:36:47Z" level=info msg="State transition: idle [Idle] -> check-wait [Idle]"

Thanks and best regards,
Amged

Hello Amged :wave:

Have you verified that all pods are healthy? I suspect that the deployments pods might not be running. If so, could you inspect their logs and see if you can find any hints?
Could you also check that the ingress is properly configured and that the server URL used by the client is correctly routed to the Traefik gateway service?
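
In case it helps with the pod health check, here is a minimal Python sketch that lists pods that are not fully ready, given the JSON printed by `kubectl get pods -o json`. The function name `unhealthy_pods` and the wording of the reasons are made up for illustration; only the standard Pod fields (`metadata.name`, `status.phase`, `status.containerStatuses`) are relied on.

```python
import json


def unhealthy_pods(pod_list_json: str) -> list[tuple[str, str]]:
    """Return (pod name, reason) for every pod that is not fully ready.

    Expects the JSON text produced by `kubectl get pods -o json`.
    The function name and reason strings are illustrative only.
    """
    problems = []
    for pod in json.loads(pod_list_json).get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        if phase == "Succeeded":
            # Completed one-off pods (e.g. jobs) are not a problem.
            continue
        if phase != "Running":
            problems.append((name, f"phase is {phase}"))
            continue
        # A Running pod can still have containers that fail their readiness probe.
        for cs in status.get("containerStatuses", []):
            if not cs.get("ready", False):
                restarts = cs.get("restartCount", 0)
                problems.append(
                    (name, f"container {cs['name']} not ready (restarts: {restarts})")
                )
    return problems
```

To use it, save the output of `kubectl get pods -n <namespace> -o json` to a file and pass the file contents to `unhealthy_pods`; the namespace depends on how the server was installed. Any pod it reports is a good candidate for `kubectl logs`.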

Best regards,
Alf-Rune