Losing metrics with the Prometheus Pushgateway

How to properly push metrics to the Prometheus Pushgateway

At work, we use Prometheus together with its Pushgateway to monitor and alert on backup job execution. The other day we noticed a failing backup job that did not trigger an alert. Debugging quickly revealed that the Pushgateway was losing metrics. Our metrics look like this: backup_last_success_unixtime{instance_name="some_hostname", job="backup_job"}. We have several jobs with the same job label running on different hosts, so I assumed each host's metric would be recorded as a separate time series.
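To make that assumption concrete, here is a toy model of how the Pushgateway groups pushed metrics, as far as its documentation describes it: metrics are stored per grouping key, which is taken from the push URL (/metrics/job/<job>/<label>/<value>), and a PUT replaces everything stored under that key. Labels that appear only inside the pushed metric body, such as instance_name above, are not part of the grouping key. The sketch below is not the real Pushgateway or any client library; FakePushgateway, push_path, the host names, and the timestamp are all made up for illustration.

```python
from urllib.parse import quote


def push_path(job, grouping_labels=None):
    """Build a Pushgateway-style URL path for a grouping key.

    Assumes label values contain no '/' (the real Pushgateway documents
    a separate base64 variant for such values).
    """
    parts = ["metrics", "job", quote(job, safe="")]
    for name, value in (grouping_labels or {}).items():
        parts += [name, quote(value, safe="")]
    return "/" + "/".join(parts)


class FakePushgateway:
    """Hypothetical in-memory stand-in: one metric group per URL path."""

    def __init__(self):
        self.groups = {}

    def put(self, path, metrics):
        # Modeled PUT semantics: the new push replaces the whole group.
        self.groups[path] = dict(metrics)

    def series_count(self):
        return sum(len(group) for group in self.groups.values())


METRIC = 'backup_last_success_unixtime{instance_name="%s"}'
TIMESTAMP = 1700000000  # arbitrary example value

# Case 1: every host pushes to the same path (job label only).
job_only = FakePushgateway()
for host in ("host_a", "host_b"):
    job_only.put(push_path("backup_job"), {METRIC % host: TIMESTAMP})

# Case 2: every host puts its name into the grouping key.
per_host = FakePushgateway()
for host in ("host_a", "host_b"):
    path = push_path("backup_job", {"instance_name": host})
    per_host.put(path, {METRIC % host: TIMESTAMP})

print(job_only.series_count())  # 1: host_b's push replaced host_a's
print(per_host.series_count())  # 2: one group per host
```

In this model, the first case keeps only the last host's metric, which matches the symptom of a failing backup job going unnoticed. The second case keeps one group per host because the host name is part of the URL path.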