From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Joe Conway <mail(at)joeconway(dot)com> |
Cc: | "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: ALTER TABLE deadlock with concurrent INSERT |
Date: | 2011-03-04 07:36:42 |
Message-ID: | 20110304073642.GA953@tornado.leadboat.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Mar 02, 2011 at 12:25:16PM -0800, Joe Conway wrote:
> I'm working with a client on an application upgrade script which
> executes a function to conditionally do an:
>
> ALTER TABLE foo ALTER COLUMN bar SET DATA TYPE baz
>
> If this is run while the application is concurrently doing inserts into
> foo, we are occasionally seeing deadlocks. Aside from the fact that they
> are better off not altering the table amid concurrent inserts, I'm
> trying to understand why this is even able to happen. I expect one to
> block the other, not a deadlock.
>
> This is 8.4.1 (I know, I know, I have advised strongly that they upgrade
> to 8.4.latest).
>
> We have not been able to repeat this forcibly. Here is what the log shows:
> ------------------------------
> 2011-02-25 14:38:07 PST [31686]: [1-1] ERROR: deadlock detected
> 2011-02-25 14:38:07 PST [31686]: [2-1] DETAIL: Process 31686 waits for
> AccessExclusiveLock on relation 16896 of database 16386; blocked by
> process 31634.
> Process 31634 waits for RowExclusiveLock on relation 16902 of
> database 16386; blocked by process 31686.
> Process 31686: SELECT change_column_type('attribute_summary',
> 'sequence_number', 'numeric');
> Process 31634: insert into attribute_summary (attribute_value,
> sequence_number, attribute_id) values ($1, $2, $3)
> 2011-02-25 14:38:07 PST [31686]: [3-1] HINT: See server log for query
> details.
> 2011-02-25 14:38:07 PST [31686]: [4-1] CONTEXT: SQL statement "ALTER
> TABLE attribute_summary ALTER COLUMN sequence_number SET DATA TYPE numeric"
> PL/pgSQL function "change_column_type" line 18 at EXECUTE statement
> 2011-02-25 14:38:07 PST [31686]: [5-1] STATEMENT: SELECT
> change_column_type('attribute_summary', 'sequence_number', 'numeric');
> ------------------------------
Does relation 16902 (attribute_summary) have a foreign key constraint over the
sequence_number column, in either direction, with relation 16896? That would
explain it:
session 1: ALTER TABLE attribute_summary ... <sleeps after relation_openrv in transformAlterTableStmt>
session 2: SELECT 1 FROM rel16896 LIMIT 0;
session 2: SELECT 1 FROM attribute_summary LIMIT 0; <blocks>
session 1: <wakes up; continues ALTER TABLE: deadlock upon locking rel16896>
Off the cuff, I think you could make sure this never deadlocks with a PL/pgSQL
recipe like this:
LOOP
BEGIN
LOCK TABLE rel16896;
LOCK TABLE attribute_summary NOWAIT;
EXIT;
EXCEPTION WHEN lock_not_available THEN
END;
END LOOP;
Granted, the cure may be worse than the disease.
nm
From | Date | Subject | |
---|---|---|---|
Next Message | Fujii Masao | 2011-03-04 07:42:11 | Re: Sync Rep v19 |
Previous Message | Magnus Hagander | 2011-03-04 07:03:14 | Re: why is max standby delay only 35 minutes? |