Re: Avoiding deadlocks when performing bulk update and delete operations

From: Bill Moran <wmoran(at)potentialtech(dot)com>
To: Sanjaya Vithanagama <svithanagama(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Avoiding deadlocks when performing bulk update and delete operations
Date: 2014-11-24 17:42:15
Message-ID: 20141124124215.3dddf50028c3d4a24a3f12b4@potentialtech.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, 24 Nov 2014 14:51:42 +1100
Sanjaya Vithanagama <svithanagama(at)gmail(dot)com> wrote:

> Hi All,
>
> We have a single table which does not have any foreign key references.
>
> id_A (bigint)
> id_B (bigint)
> val_1 (varchar)
> val_2 (varchar)
>
> The primary key of the table is a composite of id_A and id_B.
>
> Reads and writes of this table are highly concurrent and the table has
> millions of rows. We have several stored procedures which do mass updates
> and deletes. Those stored procedures are being called concurrently mainly
> by triggers and application code.
>
> The operations usually look like the following where it could match
> thousands of records to update or delete:
>
> DELETE FROM table_name t
> USING (
> SELECT id_A, id_B
> FROM table_name
> WHERE id_A = ANY(array_of_id_A)
> AND id_B = ANY(array_of_id_B)
> ORDER BY id_A, id_B
> FOR UPDATE
> ) del
> WHERE t.id_A = del.id_A
> AND t.id_B = del.id_B;
>
>
> UPDATE table_name t
> SET val_1 = 'some value'
> , val_2 = 'some value'
> FROM (
> SELECT id_A, id_B
> FROM table_name
> WHERE id_A = ANY(array_of_id_A)
> AND id_B = ANY(array_of_id_B)
> ORDER BY id_A, id_B
> FOR UPDATE
> ) upd
> WHERE t.id_A = upd.id_A
> AND t.id_B = upd.id_B;
>
> We are experiencing deadlocks and all our attempts to perform operations
> with locks (row level using SELECT FOR UPDATE as used in the above queries
> and table level locks) do not seem to solve these deadlock issues. (Note
> that we cannot in any way use access exclusive locking on this table
> because of the performance impact)
>
> Is there another way that we could try to solve these deadlock situations?
> The reference manual says ? "The best defense against deadlocks is
> generally to avoid them by being certain that all applications using a
> database acquire locks on multiple objects in a consistent order."
>
> Is there a guaranteed way to do bulk update/delete operations in a
> particular order so that we can ensure deadlocks won't occur? Or are there
> any other tricks to avoid deadlocks in this situation?

Lots of stuff to say about this ...

First off, Igor has a valid point that the subselects are not helping any
and may be making the situation slightly worse. I can't see any reason not
to simiplify the queries as he suggested.

Secondly, a lot of your comments are too vague for me to understand what's
happening, so I'm going to ask a bunch of questions to clarify:

* How many UPDATE/INSERT queries are you running per second?
* How many in parallel on average?
* What's the typical execution time for an UPDATE/INSERT that might cause
this problem?
* How frequently do deadlocks occur?
* Are there other tables involved in the transactions ... i.e., have you
confirmed that these are the _only_ tables causing the deadlock?

Since you didn't include any log output, I'm fuzzy on some of those things
above ... but I'm assuming that you're unable to post specific details of
the precise problem.

I have a lot of suggestions, but instead of bombing you with all of them, I
think it would be better if you answered those questions, which will tell
me which suggestions are most likely to help.

--
Bill Moran
I need your help to succeed:
http://gamesbybill.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Deepti Sharma S 2014-11-25 08:39:47 FW: Latest Postgresql DB version
Previous Message Berend Tober 2014-11-24 17:33:31 Re: How to individually list the DDL for all individual data base objects