[BUG] mender-update hangs after download-resume gives up, never reports Failure, never returns to poll loop (5.0.3)

Summary

On mender 5.0.3 (C++ client, `mender-updated`), a network outage **during the artifact download phase** leads to a permanent wedge. After the HTTP download resumer exhausts its retries and raises `DownloadResumerError`, the client **hangs**: it does not report the deployment as `Failure`, does not return to the poll loop, and stops emitting any log output. The device looks “stuck updating” indefinitely and does **not** self-recover even after the network returns. Only `systemctl restart mender-updated` (or a reboot) clears it.

This is related to, but distinct from, the existing report “Client does not resume download and hangs indefinitely when server connection is severed”. In that report the TCP socket stays “established”, so the read never fails and resume never triggers. In our case the read *does* fail, resume *does* trigger and retries normally, **then gives up**, and the hang happens *after* the give-up.

Environment

  • Client: mender 5.0.3 (`Running Mender client 5.0.3`), C++ client (`mender-update` / `mender-updated`)
  • Platform: embedded Linux (Yocto-based), eth0
  • Config: `UpdatePollIntervalSeconds=30`, `RetryPollIntervalSeconds=300`, `RetryDownloadCount` left at default (10)

Steps to reproduce

  1. Start a deployment and let it enter the download phase (`Download_Enter`).
  2. ~10 s to 30 s after download starts, cut the network so that DNS resolution and connections fail (physically unplug eth0). A real link-down makes DNS fast-fail every 60 s.
  3. Leave the network down long enough to exhaust the resume retries.
  4. Restore the network.

Observed behaviour

  • Resume retries log normally (`name="http_resumer:client"`), then: `Giving up on resuming the download: Tried maximum number of times: Exponential backoff`
  • **After that line, `mender-updated` logs nothing.** In our test this meant ~66 min of total silence, including ~3 min after the link was restored.
  • The service stays `active (running)` but is wedged: no Failure report to the server, no further polling, no `Sync_Enter`. No self-recovery when the network returns.
  • `systemctl restart mender-updated` recovers it **instantly**: it immediately sends a status update, logs `Deployment … finished with status: Failure`, and resumes `No update available` polling.

The fact that a plain restart fixes it (with no on-disk state change) indicates only the **in-memory deployment state machine** is stuck, not any persistent state.

Expected behaviour

When the download resumer gives up (`DownloadResumerError`), the client should follow the same path as any other download failure: transition to ArtifactFailure, report the deployment with `status: Failure`, and return to the Idle/Sync poll loop rather than hanging.

Likely location

The resumer itself behaves correctly: `src/common/http/http_resumer.cpp` (`DownloadResumerClient::ScheduleNextResumeRequest()`) raises `http::DownloadResumerError` once `ExponentialBackoff(chrono::minutes(1), config.retry_download_count)` is exhausted. The defect appears to be in the **caller** (the deployment state machine’s `Download` / `ArtifactDownload` handler), which does not consume that error to drive a state transition, so the deployment dead-ends instead of failing cleanly.
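
To illustrate the suspected failure mode, here is a minimal, self-contained sketch of a download handler that either drops or consumes a give-up error. It is not the mender source: `State`, `Error`, `StartDownload`, `Deployment`, `drop_error` and `RunScenario` are all hypothetical names invented for this example, and the real state machine is asynchronous and considerably more involved. The only point is that a completion handler which returns without posting a transition leaves the machine parked in the download state, which would match the silence we observe.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is NOT the mender implementation.
enum class State { Download, ReportFailure, Install, Idle };

struct Error {
    std::string message;
};

// Completion callback: an empty optional means success, otherwise the error.
using DownloadDone = std::function<void(std::optional<Error>)>;

// Stand-in for a resuming downloader. When `give_up` is true it completes with
// an error, as a resumer would after exhausting its retries.
void StartDownload(bool give_up, const DownloadDone &done) {
    if (give_up) {
        done(Error{"Giving up on resuming the download"});
    } else {
        done(std::nullopt);
    }
}

struct Deployment {
    State state = State::Download;
    std::vector<std::string> reports;  // statuses that would be sent to the server
    bool drop_error = false;           // true models the suspected defect

    void OnDownloadDone(std::optional<Error> err) {
        if (!err) {
            state = State::Install;
            return;
        }
        if (drop_error) {
            return;  // no transition posted: the machine stays in Download
        }
        state = State::ReportFailure;  // same path as any other download failure
    }

    void Run(bool give_up) {
        StartDownload(give_up, [this](std::optional<Error> err) {
            OnDownloadDone(std::move(err));
        });
        if (state == State::ReportFailure) {
            reports.push_back("Failure");
            state = State::Idle;  // back to the poll loop
        }
    }
};

// Runs one simulated deployment and returns the state it ends up in.
State RunScenario(bool give_up, bool drop_error) {
    Deployment d;
    d.drop_error = drop_error;
    d.Run(give_up);
    return d.state;
}
```

With `drop_error` set, `RunScenario(true, true)` ends in `State::Download` with no report sent, mirroring the wedge; with it cleared, the same give-up ends in `State::Idle` after a `Failure` report, which is the expected behaviour described above.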