From: | Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Get stuck when dropping a subscription during synchronizing table |
Date: | 2017-06-06 17:36:13 |
Message-ID: | fe072153-babd-3b5d-8052-73527a6eb657@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 03/06/17 18:53, Peter Eisentraut wrote:
> On 6/2/17 14:52, Peter Eisentraut wrote:
>> On 5/24/17 15:14, Petr Jelinek wrote:
>>> All the locking works just fine the way it is in master. The issue with
>>> deadlock with apply comes from the wrong handling of the SIGTERM in the
>>> apply (we didn't set InterruptPending). I changed the SIGTERM handler in
>>> patch 0001 just to die which is actually the correct behavior for apply
>>> workers. I also moved the connection cleanup code to the
>>> before_shmem_exit callback (similar to walreceiver) and now that part
>>> works correctly.
>>
>> I have committed this, in two separate parts. This should fix the
>> originally reported issue.
>>
>> I will continue to work through your other patches. I notice there is
>> still a bit of discussion about another patch, so please let me know if
>> there is anything else I should be looking for.
>
> I have committed the remaining two patches. I believe this fixes the
> originally reported issue.
>
So the fact that we moved workers to standard interrupt handling broke
launcher in subtle ways because it still uses it's own SIGTERM handling
but some function it calls are using CHECK_FOR_INTERRUPTS (they are used
by worker as well). I think we need to move launcher to standard
interrupt handling as well. It's not same as other processes though as
it's allowed to be terminated any time (just like autovacuum launcher)
so we just proc_exit(0) instead of FATALing out.
This is related to the nightjar failures btw.
As a side note, we are starting to have several IsSomeTypeOfProcess
functions for these kind of things. I wonder if bgworker infrastructure
should somehow provide this type of stuff (the proposed bgw_type might
help there) as well as maybe being able to register interrupt handler
for the worker (it's really hard to do it via custom SIGTERM handler as
visible on worker, launcher and walsender issues we are fixing).
Obviously this is PG11 related thought.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment | Content-Type | Size |
---|---|---|
Use-standard-interrupt-handling-in-logical-replicati.patch | text/x-patch | 4.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Regina Obe | 2017-06-06 17:52:45 | Parallel Aggregation support for aggregate functions that use transitions not implemented for array_agg |
Previous Message | Tom Lane | 2017-06-06 17:33:38 | Re: Should we standardize on a type for signal handler flags? |