Losing metrics with the Prometheus Pushgateway

How to properly push metrics to the Prometheus Pushgateway

2 minute read

At work, we use Prometheus together with its Pushgateway to monitor and alert on backup job execution. The other day we noticed a failing backup job that had not triggered an alert. Debugging quickly revealed that the Pushgateway was losing metrics. Our metrics look like this: backup_last_success_unixtime{instance_name="some_hostname", job="backup_job"}. We have several jobs with the same job label running on different hosts, so I assumed they would be recorded as separate time series.
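One likely explanation, going by how the Pushgateway documents its grouping behaviour: metrics are grouped by the labels in the push URL path, not by the labels in the pushed body. If every host pushes to the same /metrics/job/backup_job path, each push can replace what another host pushed before. The sketch below puts instance_name into the URL path so each host gets its own group. It is a minimal illustration, not necessarily the fix this post arrives at. The gateway address http://localhost:9091 and the helper names push_url, metric_body and push are assumptions made for the example.

```python
import time
import urllib.parse
import urllib.request


def push_url(gateway, job, grouping):
    """Build a push URL whose path carries the grouping key.

    Hypothetical helper: every label in `grouping` becomes a
    /<label>/<value> pair after /metrics/job/<job>.
    Values containing "/" need extra handling and are not covered here.
    """
    parts = ["metrics", "job", urllib.parse.quote(job, safe="")]
    for label, value in sorted(grouping.items()):
        parts.append(urllib.parse.quote(label, safe=""))
        parts.append(urllib.parse.quote(value, safe=""))
    return gateway.rstrip("/") + "/" + "/".join(parts)


def metric_body(name, value):
    """Render a single sample in the text exposition format.

    The trailing newline is required by the Pushgateway.
    """
    return f"{name} {value}\n"


def push(gateway, job, grouping, name, value):
    """PUT one sample; PUT replaces all metrics in that group only."""
    request = urllib.request.Request(
        push_url(gateway, job, grouping),
        data=metric_body(name, value).encode("utf-8"),
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    # Assumed gateway address; adjust to your own setup.
    push(
        "http://localhost:9091",
        "backup_job",
        {"instance_name": "some_hostname"},
        "backup_last_success_unixtime",
        int(time.time()),
    )
```

With the host name in the path, a push from one host only replaces that host's own group, so the backup timestamps of the other hosts stay in place.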

1 minute read

After more than 3 years, I am reviving my old blog. As my older posts are hardly relevant these days and I don’t have many readers anyway, I am simply starting from scratch. If anyone is interested in any of my older posts, drop me a line and I can republish them.