Data Retention

A data retention policy aims at removing data older than a certain duration from the database.

Example

Below an example of a retention policy which applies on all hypertables. It is possible to customize how it works by changing the format function according to your data or add filter using WHERE clause in the PERFORM query.

CREATE OR REPLACE PROCEDURE generic_retention (config jsonb)
LANGUAGE PLPGSQL
AS $$
DECLARE
  drop_after interval;
	schema varchar;
	name varchar;
BEGIN
  SELECT jsonb_object_field_text (config, 'drop_after')::interval INTO STRICT drop_after;

  IF drop_after IS NULL THEN
    RAISE EXCEPTION 'Config must have drop_after';
  END IF;

  -- You can modify the following query to add a more precise retention policy.
	FOR schema, name IN SELECT hypertable_schema, hypertable_name FROM timescaledb_information.hypertables
	LOOP
		RAISE NOTICE '%', format('%I.%I', schema, name);
		PERFORM drop_chunks(format('%I.%I', schema, name), older_than => drop_after);
		COMMIT;
	END LOOP;
END
$$;

Execute Procedure

You can schedule the execution of the previous procedure by using the Scalingo scheduler. To do so, add a cron.json file at the root of your application’s source code, containing the code below:

{
  "jobs": [
    {
      "command": "0 0 * * * dbclient-fetcher psql && psql $SCALINGO_POSTGRESQL_URL -f data_retention_procedure.sql",
      "size": "S"
    }
  ]
}

The scheduled task will be executed every day at midnight. And execute a psql command that invokes a SQL script called data_retention_procedure.sql.

Create this SQL script file at the root of your application’s source code containing the code below:

call generic_retention(config => '{"drop_after":"12 month"}');

This script invokes the created procedure generic_retention to remove the data older than twelve months.


Suggest edits

Data Retention

©2023 Scalingo