Re: Avoiding deadlocks when performing bulk update and delete operations

From: Igor Neyman <ineyman(at)perceptron(dot)com>
To: Sanjaya Vithanagama <svithanagama(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Avoiding deadlocks when performing bulk update and delete operations
Date: 2014-11-24 14:34:10
Message-ID: A76B25F2823E954C9E45E32FA49D70ECC21C9348@mail.corp.perceptron.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

From: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of Sanjaya Vithanagama
Sent: Sunday, November 23, 2014 10:52 PM
To: pgsql-general(at)postgresql(dot)org
Subject: [GENERAL] Avoiding deadlocks when performing bulk update and delete operations

Hi All,

We have a single table which does not have any foreign key references.

id_A (bigint)
id_B (bigint)
val_1 (varchar)
val_2 (varchar)

The primary key of the table is a composite of id_A and id_B.

Reads and writes of this table are highly concurrent and the table has millions of rows. We have several stored procedures which do mass updates and deletes. Those stored procedures are being called concurrently mainly by triggers and application code.

The operations usually look like the following where it could match thousands of records to update or delete:

DELETE FROM table_name t
USING (
SELECT id_A, id_B
FROM table_name
WHERE id_A = ANY(array_of_id_A)
AND id_B = ANY(array_of_id_B)
ORDER BY id_A, id_B
FOR UPDATE
) del
WHERE t.id_A = del.id_A
AND t.id_B = del.id_B;

UPDATE table_name t
SET val_1 = 'some value'
, val_2 = 'some value'
FROM (
SELECT id_A, id_B
FROM table_name
WHERE id_A = ANY(array_of_id_A)
AND id_B = ANY(array_of_id_B)
ORDER BY id_A, id_B
FOR UPDATE
) upd
WHERE t.id_A = upd.id_A
AND t.id_B = upd.id_B;

We are experiencing deadlocks and all our attempts to perform operations with locks (row level using SELECT FOR UPDATE as used in the above queries and table level locks) do not seem to solve these deadlock issues. (Note that we cannot in any way use access exclusive locking on this table because of the performance impact)

Is there another way that we could try to solve these deadlock situations? The reference manual says — "The best defense against deadlocks is generally to avoid them by being certain that all applications using a database acquire locks on multiple objects in a consistent order."

Is there a guaranteed way to do bulk update/delete operations in a particular order so that we can ensure deadlocks won't occur? Or are there any other tricks to avoid deadlocks in this situation?

Thank you in advance,
Sanjaya

May be I’m missing something here, but it seems that you make the problem worse by using :

DELETE … USING (SELECT … FOR UPDATE)…

Can’t you just do:

DELETE FROM table_name
WHERE id_A = ANY(array_of_id_A)
AND id_B = ANY(array_of_id_B);

?

Regards,
Igor Neyman

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Berend Tober 2014-11-24 15:14:15 How to individually list the DDL for all individual data base objects
Previous Message Alexis 2014-11-24 07:04:16 How to avoid a GIN recheck condition