Manage Prometheus TSDB in a better way!
amirreza valizade
Posted on March 28, 2023
Prometheus is a powerful monitoring system that provides a simple solution for expiring old data: the --storage.tsdb.retention.size and --storage.tsdb.retention.time flags. These flags define the maximum size and age of the data retained in the Prometheus database, and they apply to every series alike. In some cases, however, you may need to keep certain metrics for the long term and delete the ones you no longer need much sooner. In this article, we will discuss how to label Prometheus targets and delete old data using the Admin API to meet such requirements.
Labeling Prometheus Targets
To retain specific metrics for a longer period, you need to label the targets from which Prometheus scrapes data. You can add a new label, such as retention_time, to the scrape configuration of each job. The value of this label should represent how long you want to retain the data. For example, you can set the label to "one-month", "three-month", "twelve-month", or any other value that suits your needs. The label itself does not expire anything; it only marks the series so that they can be selected for deletion later.
Here is an example job configuration that adds the retention_time label. Because the relabel rule has no source_labels, it applies to every target in the job:
- job_name: 'node_exporter'
  file_sd_configs:
    - files:
        - node_exporter.yml
  relabel_configs:
    # Attach retention_time="one-month" to every target of this job.
    - target_label: retention_time
      replacement: "one-month"
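To check that the label actually reaches the scraped series, one option is to count targets per retention_time value through the query API. The following is only a minimal sketch: the function name count_targets_by_retention is made up for illustration, and it assumes Prometheus is reachable at the base URL you pass in (http://localhost:9090 in this article's examples).

```shell
# count_targets_by_retention BASE_URL
# Runs an instant query that counts scraped targets per retention_time value.
# The "up" series carries the target labels, so the new label shows up there.
count_targets_by_retention() {
  curl -sS -G \
    --data-urlencode 'query=count by (retention_time) (up)' \
    "$1/api/v1/query"
}
```

Calling count_targets_by_retention http://localhost:9090 after a scrape should return one result per label value in use. Targets that come back without a retention_time label were not covered by any relabel rule.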
Deleting Labeled Data Using the Admin API
Once you have labeled the targets, you can use the Prometheus Admin API to delete old data that is no longer required. The DeleteSeries endpoint (delete_series) deletes data for a selection of series in a time range. The deleted data still exists on disk until it is cleaned up in a future compaction, or you can clean it up explicitly using the CleanTombstones endpoint (clean_tombstones). The Admin API is disabled by default; enable it by starting Prometheus with the --web.enable-admin-api flag.
To delete data for a particular time range, use the match[] URL query parameter to select the series to delete, along with the start and end timestamps. Here is an example of using DeleteSeries to delete data older than one month for series with the retention_time="one-month" label. The -g option stops curl from interpreting the brackets and braces in the URL, and the date invocation assumes GNU date:
$ curl -X PUT \
  -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={retention_time="one-month"}&end='"$(date +%s -d '1 month ago')"
URL query parameters:
- match[]=<series_selector>: Repeated label matcher argument that selects the series to delete. At least one match[] argument must be provided.
- start=<rfc3339 | unix_timestamp>: Start timestamp. Optional and defaults to minimum possible time.
- end=<rfc3339 | unix_timestamp>: End timestamp. Optional and defaults to maximum possible time.
Not mentioning both start and end times would clear all the data for the matched series in the database.
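Since start and end accept either format, the same request can also be built with RFC 3339 timestamps and an explicit start. The sketch below only assembles the URL and sends nothing. The helper names cutoff_rfc3339 and delete_series_url are hypothetical, and the date call assumes GNU date.

```shell
# cutoff_rfc3339 AGE
# Prints the UTC time AGE ago (e.g. "1 month") as an RFC 3339 timestamp.
cutoff_rfc3339() {
  date -u -d "$1 ago" '+%Y-%m-%dT%H:%M:%SZ'
}

# delete_series_url BASE_URL SELECTOR END [START]
# Prints the delete_series URL; start is appended only when given.
delete_series_url() {
  url="$1/api/v1/admin/tsdb/delete_series?match[]=$2&end=$3"
  if [ -n "${4:-}" ]; then
    url="${url}&start=$4"
  fi
  printf '%s\n' "$url"
}
```

For example, delete_series_url http://localhost:9090 '{retention_time="one-month"}' "$(cutoff_rfc3339 '1 month')" prints a URL that can be passed to curl -X PUT -g. Printing the URL first makes it easy to review the selector and the time range before anything is deleted.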
Note that these endpoints mark the samples from the selected series as deleted, but they do not prevent the associated series metadata from still being returned in metadata queries for the affected time range. You can use the CleanTombstones endpoint to remove the deleted data from disk and clean up the existing tombstones.
$ curl -X POST http://localhost:9090/api/v1/admin/tsdb/clean_tombstones
We now have labels that define how long each group of metrics is needed, and requests that remove data based on those labels. Putting them together gives a script like this:
#!/bin/sh
# Delete samples older than each retention tier. Only end is set, so start
# defaults to the minimum possible time. The -d option requires GNU date.
curl -X PUT -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={retention_time="one-month"}&end='"$(date +%s -d '1 month ago')"
curl -X PUT -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={retention_time="three-month"}&end='"$(date +%s -d '3 months ago')"
curl -X PUT -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={retention_time="twelve-month"}&end='"$(date +%s -d '12 months ago')"
# Remove the deleted data from disk and clean up the tombstones.
curl -X PUT 'http://localhost:9090/api/v1/admin/tsdb/clean_tombstones'
Don't forget to automate this script, for example with a cron job, so that old metrics are deleted regularly without any manual intervention.
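For unattended runs it helps if the script stops on the first failed request instead of carrying on silently. The following is one possible sketch rather than a definitive implementation: the function names age_for_label and purge_expired are invented for illustration, the tiers are the three label values used in this article, and GNU date is assumed. It uses POST, which the Admin API accepts alongside PUT.

```shell
#!/bin/sh
# age_for_label LABEL
# Maps a retention_time label value to a GNU date offset; fails if unknown.
age_for_label() {
  case "$1" in
    one-month)    echo '1 month ago' ;;
    three-month)  echo '3 months ago' ;;
    twelve-month) echo '12 months ago' ;;
    *)            return 1 ;;
  esac
}

# purge_expired BASE_URL
# Deletes expired samples for each tier, then cleans the tombstones.
# Returns non-zero as soon as one request fails (curl --fail).
purge_expired() {
  base="$1"
  for label in one-month three-month twelve-month; do
    age="$(age_for_label "$label")" || return 1
    end="$(date +%s -d "$age")" || return 1
    curl -sS --fail -X POST -g \
      "${base}/api/v1/admin/tsdb/delete_series?match[]={retention_time=\"${label}\"}&end=${end}" \
      || return 1
  done
  curl -sS --fail -X POST "${base}/api/v1/admin/tsdb/clean_tombstones"
}
```

Adding the line purge_expired http://localhost:9090 at the end of the file turns it into a complete script. Saved under a path of your choosing (for instance /usr/local/bin/prometheus-retention.sh, a hypothetical location), it could then be scheduled with a crontab entry such as 0 3 * * * /usr/local/bin/prometheus-retention.sh to run every day at 03:00.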