How to Use PostgreSQL Partitions With Django
Partitions in PostgreSQL(and MySQL too :)) are very useful when we have to store large data dividing them by attribute for future works. In my case I have logs of working on projects, which are stored in partition for each project.
Using partitions in django project is not very difficult, but there is one issue with “returning id” for insert statement in database. Function, which will insert a record to right table, must return trigger, but not id of last record as django expects.
Let’s make a support of partitions step by step:
step 1: create models
class Project(models.Model):
title = models.CharField(max_length=150)
class ProjectLog(models.Model):
project = models.ForeignKey(Project)
user = models.ForeignKey(User)
content = models.CharField(max_length=500)
step 2: create migration to put partition logic to database. We will create two functions: one for creating separate table for each project and second for inserting record to right partition table
def make_partitions_operants(apps, schema_editor):
Project = apps.get_model('project', 'Project')
ProjectLog = apps.get_model('project', 'ProjectLog')
sql = '''
create or replace function insert_into_projectlog() returns trigger as $$
begin
execute 'insert into %(projectlog_db_table)s_' || NEW.project_id || ' values(($1).*)' using NEW;
return NULL;
end $$ language 'plpgsql';
create or replace function create_projectlog_partition() returns trigger as $$
begin
execute 'create table %(projectlog_db_table)s_' || NEW.id || ' (
check (project_id = ' || NEW.id || ')
) inherits (%(projectlog_db_table)s);';
return NEW;
end $$ language 'plpgsql';
create trigger trigger_create_projectlog_partition
after insert on %(project_db_table)s
for each row execute procedure create_projectlog_partition();
create trigger trigger_insert_into_projectlog
before insert on %(projectlog_db_table)s
for each row execute procedure insert_into_projectlog();
''' % {
'projectlog_db_table': ProjectLog._meta.db_table,
'project_db_table': Project._meta.db_table,
'rowtype': '%%ROWTYPE',
}
schema_editor.execute(sql)
class Migration(migrations.Migration):
dependencies = [
('project', '0001_initial'),
]
operations = [
migrations.RunPython(make_partitions_operants),
]
I have used native Django 1.7 migrations
step 3: we need to overwrite save method of ProjectLog
model
def save(self, *args, **kwargs):
from django.db.backends.postgresql_psycopg2.base import DatabaseFeatures
if not self.pk:
DatabaseFeatures.can_return_id_from_insert = False
super(ProjectLog, self).save(*args, **kwargs)
DatabaseFeatures.can_return_id_from_insert = True
Here we overwrite can_return_id_from_insert
to prevent using “returning id” statement in query to database.
And that’s all what we need.
Each time when we create a new project, a new partition for ProjectLog
table will be created in database.
Example:
test=# \dt
List of relations
Schema | Name | Type | Owner
--------+--------------------+-------+-------
public | project_project | table | quard
public | project_projectlog | table | quard
(2 rows)
test=# INSERT INTO project_project(title) VALUES('blog');
INSERT 0 1
test=# \dt
List of relations
Schema | Name | Type | Owner
--------+----------------------+-------+----------------
public | project_project | table | quard
public | project_projectlog | table | quard
public | project_projectlog_1 | table | vadymzakovinko
(3 rows)
test=# INSERT INTO project_project(title) VALUES('education');
INSERT 0 1
test=# \dt
List of relations
Schema | Name | Type | Owner
--------+----------------------+-------+----------------
public | project_project | table | quard
public | project_projectlog | table | quard
public | project_projectlog_1 | table | vadymzakovinko
public | project_projectlog_2 | table | vadymzakovinko
(4 rows)
test=# INSERT INTO project_projectlog(user_id, project_id, content) VALUES(1, 1, 'new article');
INSERT 0 0
test=# SELECT * FROM project_projectlog;
id | user_id | project_id | content
----+---------+------------+-------------
1 | 1 | 1 | new article
(1 row)
test=# SELECT * FROM project_projectlog_1;
id | user_id | project_id | content
----+---------+------------+-------------
1 | 1 | 1 | new article
(1 row)
test=# SELECT * FROM project_projectlog_2;
id | user_id | project_id | content
----+---------+------------+---------
(0 rows)
test=# INSERT INTO project_projectlog(user_id, project_id, content) VALUES(1, 2, 'django 1.7 migrations');
INSERT 0 0
test=# SELECT * FROM project_projectlog_2;
id | user_id | project_id | content
----+---------+------------+-----------------------
2 | 1 | 2 | django 1.7 migrations
(1 row)
test=# SELECT * FROM project_projectlog_1;
id | user_id | project_id | content
----+---------+------------+-------------
1 | 1 | 1 | new article
(1 row)
test=# SELECT * FROM project_projectlog;
id | user_id | project_id | content
----+---------+------------+-----------------------
1 | 1 | 1 | new article
2 | 1 | 2 | django 1.7 migrations
(2 rows)