Time-based table partitioning using pg partman
Table partitioning is splitting of a large table in smaller tables based on values of some attribute in the table. This post is about partitioning tables using pg_partman. It provides the following two types of table partitioning,
- Time-based table partitioning
- Serial id based partitioning. Follow below sections to do time-based table partitioning. I will be using the postgres database to describe the example
Installing and setting up pg_partman
- run following commands to install pg_partman
git clone https://github.com/keithf4/pg_partman
cd pg_partman
make install
make NO_BGW=1 install
After running above commands successfully, pg_partman should be installed on your machine. running the command in step 4 will provide plain PL/PGSQL functions only. These functions can be used to implement partitioning logic.
- create schema and extension for partman by logging into the postgreSQL database. I am using PostgreSQL interactive terminal to run following commands
create database test-partman;
\c test-partman;
create schema partman;
create extension pg_partman schema partman;
Creating a database table
Create a table in the database test-partman by running the following query in PostgreSQL interactive terminal.
create table users(
id bigserial primary key,
first_name varchar not null,
last_name varchar,
email varchar not null,
created_at timestamp without time zone not null,
updated_at timestamp without time zone not null
);
Creating a partition set
A partition set includes a parent table and inherited child tables. To create a partition set _create_parent()_ is used. During running this function an ACCESS EXCLUSIVE lock is taken on the parent table. On running following function a record is added in the part_config table under the partman schema.
Let me demystify arguments passed to the above function.
- p_parent_table is the name of the parent table, here users table is under the public schema. This argument is of type text.
- p_control is the name of the column in the parent table. Partitioning is based on this column. As partman provides time/serial id based partitioning so data type of this column in the table must of time or integer type. This argument is of type text.
- p_type is about the type of partitioning. Type of partitioning can of native or partman type. In this example, I have used the partman type. This argument is of type text.
- p_interval is the time interval of each partition. In this example, I used ‘daily’ interval. so the child table will be created for each date. This argument is of type text.
- p_automatic_maintenance is used to set the automatic maintenance when run_maintenance() is called. Maintenance is about the creation of new child tables and deleting older child tables based on retention period config. This argument is of type text.
- p_debug turns on the printing the additional debugging information in the console. This argument is of type boolean
Setting the retention period
To set the retention policy it is required to update the retention column in partman.part_config table for the record created as a result of the above function. run following query to set the retention policy for child tables.
UPDATE partman.part_config
SET retention = '5 days', retention_keep_table=false
WHERE parent_table='public.users';
Running maintenance
run_maintenance() function is used to run the maintenance on all the tables for which _automaticmaintenance is set to ‘on’ or on the specific table whose name is passed as an argument to this function. Following function is used to run maintenance on public.users table.
select partman.run_maintenance(
p_parent_table := 'public.users',
p_debug := TRUE);
After following above sections successfully you should have implemented time-based partitioning. If you have any suggestions please let me know.