From: | Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Cc: | pgsql-bugs <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: BUG #13657: Some kind of undetected deadlock between query and "startup process" on replica. |
Date: | 2015-10-03 14:48:52 |
Message-ID: | CAK-MWwRJ7APazAfs86DEBm0AinB3Lb-EXBAzZZGQUq6wHL9+Ow@mail.gmail.com |
Lists: | pgsql-bugs |
On Fri, Oct 2, 2015 at 6:52 PM, Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com> wrote:
>
>
> On Fri, Oct 2, 2015 at 4:58 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>
>>
>>
>> On Fri, Oct 2, 2015 at 2:14 PM, Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
>> wrote:
>>
>>> >
>>> This backtrace does not indicate that the process is waiting on a
>>> relation lock. It is resolving a recovery conflict while removing tuples,
>>> and it will kill the conflicting virtual transaction once the conflict
>>> lasts longer than max_standby_streaming_delay or max_standby_archive_delay
>>> allows. Did you change those parameters from their default of 30s to -1?
>>> That would mean that the standby waits indefinitely.
>>>
>>>
>>> The problem is that the startup process has a conflict with a query that
>>> is itself blocked on the startup process: the query cannot proceed
>>> because it is waiting for a lock held by the startup process, and the
>>> startup process is waiting for that query to finish. So this is an
>>> undetected deadlock condition (as I understand the situation).
>>>
>>> PS: there is no other activity on the database during the problem
>>> except the blocked query.
>>>
>>
>> Don't you have other queries running in parallel with the one you
>> describe as "stuck" on the standby that prevent replay from moving on?
>> Like a long-running transaction working on the relation involved? Are you
>> sure that you did not set max_standby_streaming_delay to -1?
>> --
>> Michael
>>
>
> During the problem period only one query was running on the database
> (listed in the initial report) and nothing more (and this query was in a
> waiting state according to pg_stat_activity).
> pg_locks shows that the query is waiting for an AccessShareLock on relation
> 17987, while at the same time the startup process holds an
> AccessExclusiveLock on the same relation and is waiting for something. No
> other activity is going on on the replica.
> And yes, max_standby_streaming_delay is set to -1. As a result, replay
> stayed stuck behind a query from an external monitoring tool until I
> killed that query, but the situation repeated a few hours later.
>
>
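For illustration, the pg_locks observation above (one ungranted AccessShareLock request and one granted AccessExclusiveLock on the same relation) can be sketched as a small check. This is a minimal sketch, not the query that was actually run: the rows are hypothetical, simplified stand-ins for pg_locks rows, the pids are invented, and `find_blockers` is a made-up helper name. Only relation 17987 and the two lock modes come from the report.

```python
# Hypothetical, simplified rows modelled on a few pg_locks columns
# (pid, relation, mode, granted). The pids are invented for illustration;
# only relation 17987 and the two lock modes come from the report.
SAMPLE_LOCKS = [
    {"pid": 1001, "relation": 17987, "mode": "AccessExclusiveLock", "granted": True},
    {"pid": 2002, "relation": 17987, "mode": "AccessShareLock", "granted": False},
]


def find_blockers(locks):
    """Return (waiting_pid, holding_pid, relation) triples.

    Simplification: only granted AccessExclusiveLock holders are treated as
    blockers. That is sufficient here because AccessExclusiveLock conflicts
    with every other lock mode.
    """
    holders = {}
    for row in locks:
        if row["granted"] and row["mode"] == "AccessExclusiveLock":
            holders.setdefault(row["relation"], []).append(row["pid"])

    pairs = []
    for row in locks:
        if row["granted"]:
            continue  # only ungranted requests are waiting
        for holder in holders.get(row["relation"], []):
            if holder != row["pid"]:
                pairs.append((row["pid"], holder, row["relation"]))
    return pairs


if __name__ == "__main__":
    for waiter, holder, rel in find_blockers(SAMPLE_LOCKS):
        print(f"pid {waiter} waits on relation {rel}, held by pid {holder}")
```

With the sample rows, the only pair reported is the waiting query against the lock holder. This sketch shows only one direction of the wait. The other direction (the startup process waiting for the query to finish) is a recovery conflict and is not something this lock-only model represents.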
One last addition to the report. The same database replica (server), with
the same monitoring queries and everything else unchanged, does not have
this problem when running PostgreSQL 9.2.13.
It was used for a few years without such issues; the problem arose only
after the whole cluster was upgraded to version 9.4.4.
> --
> Maxim Boguk
>