Questions about heartbeat/offline status

#1

Hello, I’m seeing an alert that 2 devices are offline and that 3 were added in the last 24 hours. In reality, 2 devices were added last week, 1 was added today, and only 1 should be offline. Is the bug that the last-updated timestamp on one of the other 2 devices is wrong, is the offline alert itself inaccurate, or am I misunderstanding something? My concern is that I might legitimately have a second offline device and not realize it, but I don’t think that’s the case.

Additionally, how often is the device expected to heartbeat in production mode? It looks like one device is updating hourly while another can go many hours (8+) between heartbeats, if the last-updated timestamp is to be believed.

Thanks!

#2

Hi Neil,
As far as I know, the timestamp should be updated based on the UpdatePollIntervalSeconds value. Unless you use different settings for different devices, I would expect them all to check in at the same frequency. Of course, that can fail depending on the networking environment of the devices, so that would be the first thing I would check: do all the devices have always-on connections, or is it possible that they simply cannot reach the server for some reason?

@michaelatmender may have more input on what the displayed values mean and if there are issues.

Drew

#3

Thanks for the response, Drew. I’m using mender-convert, which I assume uses the default 1800 seconds for UpdatePollIntervalSeconds.
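
To confirm the interval rather than assume it, something like the following could be run against the device's configuration. This is only a rough sketch: it assumes the config is a JSON file (commonly /etc/mender/mender.conf, though the path may differ per image), the function name `poll_interval_seconds` is made up for illustration, and the 1800-second fallback is simply the default I mentioned above.

```python
import json

# Fallback taken from this thread; treat it as an assumption, not a verified default.
DEFAULT_POLL_SECONDS = 1800


def poll_interval_seconds(conf_text, default=DEFAULT_POLL_SECONDS):
    """Return UpdatePollIntervalSeconds from config JSON text, or the fallback if the key is absent."""
    conf = json.loads(conf_text)
    return int(conf.get("UpdatePollIntervalSeconds", default))


def poll_interval_from_file(path="/etc/mender/mender.conf"):
    """Read the config file at the given (assumed) path and return the poll interval."""
    with open(path) as f:
        return poll_interval_seconds(f.read())
```

Calling `poll_interval_from_file()` on each device would show whether they really share the same interval.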

Looking at my dashboard, it now says all “3 devices may be offline” (it still says 2 were added in the last 24 hours). The last-updated timestamps on the 3 devices are 4/25 19:42, 4/19 17:00 (expected offline), and 4/25 20:11 (all times in CT), meaning two of them are within the last 24 hours but still over 6 hours old. If I expand the device details, the last-updated timestamp changes to a much earlier value and then changes back when the details are collapsed. It looks like this is a caching bug in the UI that is probably propagating into the warnings, but it makes it hard to trust any of the information on the dashboard.
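
To put numbers on how stale those timestamps would be, here is a small sketch of the arithmetic. It is not how the dashboard decides that a device "may be offline" (I don't know that rule); the `tolerance` of 2 missed polls, the year 2020, and the "current" time of 4/26 02:30 are all placeholder assumptions, and the function names are made up.

```python
from datetime import datetime


def missed_polls(last_seen, now, poll_seconds=1800):
    """Count whole poll intervals that have elapsed since last_seen."""
    elapsed = (now - last_seen).total_seconds()
    return max(0, int(elapsed // poll_seconds))


def looks_offline(last_seen, now, poll_seconds=1800, tolerance=2):
    """Hypothetical rule: flag a device once it has missed more than `tolerance` polls."""
    return missed_polls(last_seen, now, poll_seconds) > tolerance


# Placeholder "current" time; the year and time of day are assumptions.
now = datetime(2020, 4, 26, 2, 30)
last_updated = {
    "device A": datetime(2020, 4, 25, 19, 42),
    "device B": datetime(2020, 4, 19, 17, 0),   # the one expected to be offline
    "device C": datetime(2020, 4, 25, 20, 11),
}

for name, seen in last_updated.items():
    print(name, missed_polls(seen, now), looks_offline(seen, now))
```

With a 1800-second interval, even the two "recent" devices would have missed a dozen or so check-ins if the displayed timestamps were accurate, which is why I suspect the display rather than the devices.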

#4

Hi @neil, there’s a known issue with timestamps not displaying reliably when clicking to expand. I think a fix should be in https://github.com/mendersoftware/gui/pull/457, so it shouldn’t be long before it’s merged.
We’ll take a look at the discrepancies between ‘added in 24 hrs’ and ‘may be offline’. Which version are you using? Are you on Hosted Mender?

#5

Yup, I’m using Hosted Mender. Thanks for taking a look at this!