# Postgresql – Deadlock with multi-row INSERTs despite ON CONFLICT DO NOTHING

## Setup

I have a bulk insert function set_interactions(arg_rows text) that looks like this:

with inserts as (
insert into interaction (
thing_id,
associate_id, created_time)
select t->>'thing_id', t->>'associate_id', now() from
json_array_elements(arg_rows::json)  t
ON CONFLICT (thing_id, associate_id) DO NOTHING
RETURNING thing_id, associate_id
) select into insert_count count(*) from inserts;

-- Followed by an insert in an unrelated table that has two triggers, neither of which touch any of the tables here (also not by any of their triggers, etc.)


(I wrap it that way because I need to get a count of actual inserts, without the "fake row updates" trick.)

The table interaction has:

1. Just one constraint: A multi-column primary key (thing_id, associate_id)
2. No indices
3. Just one trigger: After insert, for each row.

The trigger does this:

DECLARE associateId text;

BEGIN

-- Go out and get the associate_id for this thing_id
BEGIN
SELECT thing.associate_id INTO STRICT associateId FROM thing WHERE thing.id = NEW.thing_id;
EXCEPTION
WHEN NO_DATA_FOUND THEN
RAISE EXCEPTION 'Could not map the thing to an associate!';
WHEN TOO_MANY_ROWS THEN
RAISE EXCEPTION 'Could not map the thing to a SINGLE associate!'; -- thing PK should prevent this
END;

-- We don't want to add an association between an associate interacting with their own things
IF associateId != NEW.associate_id THEN

-- Insert the new association, if it doesn't yet exist
INSERT INTO associations ("thing_owner", "associate")
VALUES (associateId, NEW.associate_id)
ON CONFLICT DO NOTHING;

END IF;

RETURN NULL;

END;


Both interactions and associations have no more columns than you see in the above statements.

## Problem

Occasionally, I get an error deadlock detected from PostgreSQL 9.6.5 when the application calls set_interactions(). It may call it with 1-100 rows' worth of data, unsorted; "conflicting" batches may or may not have identical input (at the whole batch level or for each conflicting row).

Error details:

deadlock detected
while inserting index tuple (37605,46) in relation "associations"
SQL statement INSERT INTO associations ("thing_owner", "associate")
VALUES (associateId, NEW.associate_id)
ON CONFLICT DO NOTHING;
PL/pgSQL function aud.addfriendship() line 19 at SQL statement
SQL statement "with inserts as (
insert into interaction (
thing_id,
associate_id, created_time)
select t->>'thing_id', t->>'associate_id', now() from
json_array_elements(arg_rows::json)  t
ON CONFLICT (thing_id, associate_id) DO NOTHING
RETURNING thing_id, associate_id
) select                  count(*) from inserts"
PL/pgSQL function setinteractions(text) line 7 at SQL statement
Process 31370 waits for ShareLock on transaction 111519214; blocked by process 31418.
Process 31418 waits for ShareLock on transaction 111519211; blocked by process 31370.


## What I've tried

I thought that maybe the function was sometimes being called with duplicate data in a single call. Not so: That instead results in a guaranteed error, ON CONFLICT DO UPDATE command cannot affect row a second time.

I'm not able to reproduce the deadlock even trying 1,000 calls to set_interactions() all at once with identical parameters, or even with identical pairs of rows having (different within the pair) thing_id and associate_id values but other values too, so they don't get optimized away somehow before hitting PostgreSQL (they also shouldn't get optimized away by the database, since the function is marked volatile.) This is from a single-threaded back end; but at the same time, the application itself only runs one such back end in production, where the deadlock is occurring. I've even tried running these 1,000 calls against a full copy of the production database, and even under load from a second back end, and additionally from pgAdmin via a very-long-running query that selects from interactions. They succeed without complaint.

https://rcoh.svbtle.com/postgres-unique-constraints-can-cause-deadlock mentions trying to avoid relying on a unique index (which is what the PK amounts to, as I understand it) when inserting duplicates. However, that was before ON CONFLICT DO UPDATE, which I thought would solve this issue.

How is this query deadlocking "randomly", and how can I fix it? (Also, why can't I reproduce it with the above method?)

The ON CONFLICT clause can prevent duplicate key errors. There can still be friction with concurrent transactions trying to enter the same keys or update the same rows. So it's no insurance against deadlocks.

Most importantly, add a consistent order to input rows with ORDER BY. To make sure the order is enforced I use a CTE, which materializes the result. (I think it should work with a subquery, too; just to be sure.) Else, mutually entangled inserts trying to enter identical index tuples in the unique index can lead to the deadlock you observed. The manual:

The best defense against deadlocks is generally to avoid them by being certain that all applications using a database acquire locks on multiple objects in a consistent order.

Also, since set_interactions() is a PL/pgSQL function this is simpler & cheaper:

WITH data AS (
SELECT t->>'thing_id' AS t_id, t->>'associate_id' AS a_id
-- Or, if not type text, cast right away:
-- SELECT (t->>'thing_id')::int AS t_id, (t->>'associate_id')::int AS a_id
FROM   json_array_elements(arg_rows::json) t
ORDER  BY 1, 2  -- deterministic, stable order (!!)
)
INSERT INTO interaction (thing_id, associate_id, created_time)
SELECT t_id, a_id, now()
FROM   data
ON     CONFLICT (thing_id, associate_id) DO NOTHING;

GET DIAGNOSTICS insert_count = ROW_COUNT;


No need for another CTE, RETURNING and another count(*). More:

The trigger function looks bloated, too. No need for a nested block, since you are not catching errors, only raising exceptions that roll back the whole transaction either way. And the exceptions are pointless, too.

• The 1st EXCEPTION on NO_DATA_FOUND can never occur in a proper many-to-many design with FK constraints enforcing referential integrity.

• The 2nd is pointless as well - you are suspecting as much:

-- PK should prevent this

The trigger function boils down to:

BEGIN
-- Insert the new association, if it doesn't yet exist
INSERT INTO associations (thing_owner, associate)
SELECT t.associate_id, NEW.associate_id
FROM   thing t
WHERE  t.id = NEW.thing_id          --     -- PK guarantees 0 or 1 result
AND    t.associate_id <> NEW.associate_id  -- exclude association to self
ON     CONFLICT DO NOTHING;

RETURN NULL;
END


You might remove the trigger and the function set_interactions() altogether and just run this query, doing everything useful I can see in the question:

WITH data AS (
SELECT (t->>'thing_id')::int AS t_id, (t->>'associate_id')::int AS a_id  -- asuming int
FROM   json_array_elements(arg_rows::json) t
ORDER  BY 1, 2  -- (!!)
)
, ins_inter AS (
INSERT INTO interaction (thing_id, associate_id, created_time)
SELECT t_id, a_id, now()
FROM   data
ON     CONFLICT (thing_id, associate_id) DO NOTHING
RETURNING thing_id, associate_id
)
, ins_ass AS (
INSERT INTO associations (thing_owner, associate)
SELECT t.associate_id, i.associate_id
FROM   ins_inter i
JOIN   thing     t ON t.id = i.thing_id
AND t.associate_id <> i.associate_id  -- exclude association to self
ON     CONFLICT DO NOTHING
)
SELECT count(*) FROM ins_inter;


Now, I don't see any more chance for deadlocks. Of course, all other transactions possibly also writing to the same table concurrently must stick to the same order of rows.

If that's not possible and you are still considering SKIP LOCKED, see: