Using the manager
django-postgres-extra
provides the psqlextra.manager.PostgresManager
which exposes a lot of functionality. Your model must use this manager in order to use most of this package's functionality.
There are four ways to do this:
- Inherit your model from
psqlextra.models.PostgresModel
:
from psqlextra.models import PostgresModel
class MyModel(PostgresModel):
myfield = models.CharField(max_length=255)
- Override default manager with
psqlextra.manager.PostgresManager
:
from django.db import models
from psqlextra.manager import PostgresManager
class MyModel(models.Model):
# override default django manager
objects = PostgresManager()
myfield = models.CharField(max_length=255)
- Provide
psqlextra.manager.PostgresManager
as a custom manager:
from django.db import models
from psqlextra.manager import PostgresManager
class MyModel(models.Model):
# custom mananger name
beer = PostgresManager()
myfield = models.CharField(max_length=255)
# use like this:
MyModel.beer.upsert(..)
# not like this:
MyModel.objects.upsert(..) # error!
-
Use the
psqlextra.util.postgres_manager
on the fly:This allows the manager to be used anywhere on any model, but only within the context. This is especially useful if you want to do upserts into Django's
ManyToManyField
's generatedthrough
table:
from django.db import models
from psqlextra.util import postgres_manager
class MyModel(models.Model):
myself = models.ManyToManyField('self')
# within the context, you can access psqlextra features
with postgres_manager(MyModel.myself.through) as manager:
manager.upsert(...)
Conflict handling
The PostgresManager
comes with full support for PostgreSQL's ON CONFLICT DO ...
. This is an extremely useful feature for doing concurrency safe inserts. Often, when you want to insert a row, you want to overwrite it already exists, or simply leave the existing data there. This would require a SELECT
first and then possibly a INSERT
. Within those two queries, another process might make a change to the row. The alternative of trying to insert, ignoring the error and then doing a UPDATE
is also not good. That would result in a a lot of write overhead (due to logging). Luckily, PostgreSQL offers ON CONFLICT DO ...
, which allows you to specify what PostgreSQL should do in case that row already exists.
django-postgres-extra
brings full support for PostgreSQL's ON CONFLICT DO ...
, allowing blazing fast and concurrency safe inserts:
from django.db import models
from psqlextra.models import PostgresModel
from psqlextra.query import ConflictAction
class MyModel(PostgresModel):
myfield = models.CharField(max_length=255, unique=True)
# insert or update if already exists, then fetch, all in a single query
obj2 = (
MyModel.objects
.on_conflict(['myfield'], ConflictAction.UPDATE)
.insert_and_get(myfield='beer')
)
# insert, or do nothing if it already exists, then fetch
obj1 = (
MyModel.objects
.on_conflict(['myfield'], ConflictAction.NOTHING)
.insert_and_get(myfield='beer')
)
# insert or update if already exists, then fetch only the primary key
id = (
MyModel.objects
.on_conflict(['myfield'], ConflictAction.UPDATE)
.insert(myfield='beer')
)
Constraint specification
The on_conflict
function's first parameter denotes the name of the column(s) in which the conflict might occur. Although you can specify multiple columns, these columns must somehow have a single constraint. For example, in case of a unique_together
constraint.
Multiple columns
Specifying multiple columns is necessary in case of a constraint that spans multiple columns, such as when using Django's unique_together:
from django.db import models
from psqlextra.models import PostgresModel
class MyModel(PostgresModel)
class Meta:
unique_together = ('first_name', 'last_name',)
first_name = models.CharField(max_length=255)
last_name = models.CharField(max_length=255)
obj = (
MyModel.objects
.on_conflict(['first_name', 'last_name'], ConflictAction.UPDATE)
.insert_and_get(first_name='Henk', last_name='Jansen')
)
HStore keys
Catching conflicts in columns with a UNIQUE
constraint on a hstore
key is also supported:
from django.db import models
from psqlextra.models import PostgresModel
from psqlextra.fields import HStoreField
class MyModel(PostgresModel)
name = HStoreField(uniqueness=['en'])
id = (
MyModel.objects
.on_conflict([('name', 'en')], ConflictAction.NOTHING)
.insert(name={'en': 'Swen'})
)
This also applies to "unique together" constraints in a hstore
field:
class MyModel(PostgresModel)
name = HStoreField(uniqueness=[('en', 'ar')])
id = (
MyModel.objects
.on_conflict([('name', 'en'), ('name', 'ar')], ConflictAction.NOTHING)
.insert(name={'en': 'Swen'})
)
insert vs insert_and_get
After specifying on_conflict
you can use either insert
or insert_and_get
to perform the insert.
insert
- Perform the insert, and then returns the primary key of the row that was inserted or it conflicted with.
insert_and_get
- Perform the insert, then returns the entire row that was inserted or it conflicted with, in the form of a model instance.
Pitfalls
The standard Django methods for inserting/updating are not affected by on_conflict
. It was a conscious decision to not override or change their behavior. The following completely ignores the on_conflict
:
obj = (
MyModel.objects
.on_conflict(['first_name', 'last_name'], ConflictAction.UPDATE)
.create(first_name='Henk', last_name='Jansen')
The same applies to methods such as update
, get_or_create
, update_or_create
etc.
Conflict actions
There's currently two actions that can be taken when encountering a conflict. The second parameter of on_conflict
allows you to specify that should happen.
ConflictAction.UPDATE
- If the row does not exist, insert a new one.
- If the row exists, update it.
This is also known as a "upsert".
ConflictAction.NOTHING
- If the row does not exist, insert a new one.
- If the row exists, do nothing.
This is preferable when the data you're about to insert is the same as the one that already exists. This is more performant because it avoids a write in case the row already exists.
Bulk
bulk_insert
allows your to use conflict resolution for bulk inserts:
from django.db import models
from psqlextra.models import PostgresModel
class MyModel(PostgresModel):
name = models.CharField(max_length=255, unique=True)
obj = (
MyModel.objects
.on_conflict(['name'], ConflictAction.UPDATE)
.bulk_insert([
dict(name='swen'),
dict(name='henk'),
dict(name='adela')
])
)
bulk_insert
uses a single query to insert all specified rows at once. It returns a list
of dict()
with each dict()
being a merge of the dict()
passed in along with any index returned from Postgres.
Limitations
In order to stick to the "everything in one query" principle, various, more advanced usages of bulk_insert
are impossible. It is not possible to have different rows specify different amounts of columns. The following example does not work:
from django.db import models
from psqlextra.models import PostgresModel
class MyModel(PostgresModel):
first_name = models.CharField(max_length=255, unique=True)
last_name = models.CharField(max_length=255, default='kooij')
obj = (
MyModel.objects
.on_conflict(['name'], ConflictAction.UPDATE)
.bulk_insert([
dict(name='swen'),
dict(name='henk', last_name='poepjes'), # invalid, different column configuration
dict(name='adela')
])
)
An exception is thrown if django-postgres-extra
detects this behavior.
Shorthand
The on_conflict
, insert
and insert_or_create
methods were only added in django-postgres-extra
1.6. Before that, only ConflictAction.UPDATE
was supported in the following form:
from django.db import models
from psqlextra.models import PostgresModel
class MyModel(PostgresModel):
myfield = models.CharField(max_length=255, unique=True)
obj = (
MyModel.objects
.upsert_and_get(
conflict_target=['myfield']
fields=dict(myfield='beer')
)
)
id = (
MyModel.objects
.upsert(
conflict_target=['myfield']
fields=dict(myfield='beer')
)
)
(
MyModel.objects
.bulk_upsert(
conflict_target=['myfield']
rows=[
dict(myfield='beer'),
dict(myfield='wine')
]
)
)
These two short hands still exist and are not deprecated. They behave exactly the same as ConflictAction.UPDATE
and are there for convenience. It is up to you to decide what to use.