Re: Deadlock detected after pg_repack receives SIGINT

From: Jiří Hlinka <jiri(dot)hlinka(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Deadlock detected after pg_repack receives SIGINT
Date: 2015-11-05 06:14:17
Message-ID: CADCCO5r9xLLQ3h6Rp2bk6hHN2k9nz5Btn3yK=HzG9iHg0UWfsQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I'll check if the trigger activity was moving on or waiting for a lock from
logs.

I'm doing simple UPDATES, INSERTs and DELETEs on this table, but frequency
of these DMLs is _very_ high (it is a queue table used for a one-way
selective [just part of data are replicated] replication of queries between
two instances of the database, lets say from the main DB to data warehouse
DB, therefore part of DML queries on tables of the main table is "copied"
by a trigger to this queue table which is than used as a source for
replicating changes into data warehouse DB) - this is the reason why the
table needs VACUUM FULL at least twice a day, or better - running pg_repack
on it at least twice a day.

My opinion is, that pg_repack should guarantee a consistent, deadlock-free
behaviour via proper locking policy - if it is frequently updated table or
not and the "worst" case that could appear is, that the pg_repack will wait
for very long time (and this is the reason for the timeout usage in the
script - to stop pg_repack if it is unable to receive the lock it is
waiting for). I know it is hard to achieve this and not sure if it is
possible to achieve this "clean" behaviour, but if for example deadlock
happend, it is possible to fix it as a bug to make it even better than it
is now.

Maybe I could lower the impact of the problem by checking inside the script
if, after timeout expires, the pg_repack is still waiting in "Waiting for X
transactions to finished" (= safe to SIGINT pg_repack) or if it is already
in the process of repacking the table - do not stop it here. But there
still remain the fact, that there may be a bug which I could and sholud
report to the developers.

Thanks Michael,
Jiri

On Thu, Nov 5, 2015 at 2:57 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

>
>
> On Wed, Nov 4, 2015 at 10:16 PM, Jiří Hlinka <jiri(dot)hlinka(at)gmail(dot)com>
> wrote:
>
>> I'm on pg_repack 1.3.2 (latest sable, no devel version available to check
>> if it is already fixed).
>>
>> Michael: your memories are fresh and clear :-), yes, it is part of a
>> cleanup rollback. The problem is, that the pgrepack_drop call this
>> statement:
>> DROP TABLE IF EXISTS repack.log_%u CASCADE
>>
>> ...and it collides with this trigger inserting new touples inside the [
>> frequently_updated_table] in the repack.log_[OID of
>> frequently_updated_table] routine:
>>
>> SELECT 'CREATE TRIGGER z_repack_trigger'
>> ' BEFORE INSERT OR DELETE OR UPDATE ON ' || repack.oid2text($1) || ' FOR
>> EACH ROW EXECUTE PROCEDURE repack.repack_trigger(' || '''INSERT INTO
>> repack.log_' || $1 || '(pk, row) VALUES(' || ' CASE WHEN $1 IS NULL THEN
>> NULL ELSE (ROW($1.' || repack.get_index_columns($2, ', $1.') ||
>> ')::repack.pk_' || $1 || ') END, $2)'')';
>>
>> As these two actions are both run by pg_repack, it seems like a bug to me
>> as pg_repack should not be able to deadlock itself, but not 100% sure about
>> this part...
>>
>
> Is the trigger activity moving on or is this one waiting as well for a
> lock? It sounds like pg_repack is waiting for the end of the transaction
> running this trigger to finish before being able to drop the trigger and
> this relation safely. I guess that you are running large UPDATE queries,
> right? It really looks like you should let more room to pg_repack to do its
> stuff.
> --
> Michael
>

--
Bc. Jiří Hlinka
Tel.: 725 315 263

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Victor Blomqvist 2015-11-05 07:19:37 Re: Drop or alter column under load give ERROR #42804 structure of query does not match function result type:
Previous Message Michael Paquier 2015-11-05 04:05:19 Re: pg_archivecleanup not deleting anything?