Re: Get stuck when dropping a subscription during synchronizing table

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: Get stuck when dropping a subscription during synchronizing table
Date: 2017-05-24 19:14:31
Message-ID: 6a89c0dd-2380-5663-7018-2888d3b707d4@2ndquadrant.com
Lists: pgsql-hackers

Hi,

I finally had time to properly analyze this, and it turns out we've all
just been trying to fix symptoms and not the actual problems.

All the locking works just fine the way it is in master. The deadlock
with apply comes from wrong handling of SIGTERM in the apply worker (we
didn't set InterruptPending). In patch 0001 I changed the SIGTERM handler
just to die, which is actually the correct behavior for apply workers. I
also moved the connection cleanup code to the before_shmem_exit callback
(similar to walreceiver), and now that part works correctly.
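
To illustrate why a missing InterruptPending matters, here is a toy model
in Python (not the PostgreSQL C code). It assumes a two-level scheme in
which a cheap gate flag is tested at every safe point and the specific
request is only examined once the gate is set. The class, method names and
the exit-callback list are hypothetical; only the flag name mirrors
InterruptPending.

```python
class Worker:
    """Toy model of a worker using a gate flag plus a request flag."""

    def __init__(self):
        self.interrupt_pending = False  # gate flag, tested at every safe point
        self.die_pending = False        # the specific request: terminate
        self.connection_open = True
        # Cleanup registered to run at exit (stands in for an exit callback).
        self.exit_callbacks = [self.close_connection]

    def close_connection(self):
        self.connection_open = False

    def broken_sigterm_handler(self):
        # Records the request but forgets to raise the gate flag.
        self.die_pending = True

    def fixed_sigterm_handler(self):
        # Raises both flags, so the next safe point notices the request.
        self.interrupt_pending = True
        self.die_pending = True

    def check_for_interrupts(self):
        """Safe point: return True if the worker exited."""
        if not self.interrupt_pending:
            return False  # gate not set, nothing else is examined
        self.interrupt_pending = False
        if self.die_pending:
            for callback in reversed(self.exit_callbacks):
                callback()
            return True
        return False


def run(worker, handler, steps=3):
    """Deliver the 'signal', then pass through a few safe points."""
    handler()
    for _ in range(steps):
        if worker.check_for_interrupts():
            return "exited"
    return "still running"
```

With the broken handler the worker never notices the termination request
and keeps its connection open; with the fixed handler it exits at the next
safe point and the cleanup callback runs.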

The issue with orphaned sync workers is actually two separate issues.
First, due to a thinko we always searched for the sync worker in
wait_for_sync_status_change instead of searching for the opposite worker,
as was the intention (i.e. the sync worker should search for apply and
apply should search for sync). That's fixed by 0002. Second, we didn't
accept any invalidation messages until the whole sync process finished
(because it flattens all the remote transactions into a single one), so
the sync worker didn't learn about subscription changes or a drop until
it had finished, which I have now fixed in 0003.
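
The first of these two issues can be modelled in a few lines of Python.
This is only a sketch under assumptions, not the code from 0002: worker
slots are modelled as (subid, relid) pairs, an apply worker is assumed to
carry no relid (None here), and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WorkerSlot:
    subid: int
    relid: Optional[int]  # None models an apply worker, a value a sync worker


def find_worker(slots: List[WorkerSlot], subid: int,
                relid: Optional[int]) -> Optional[WorkerSlot]:
    """Return the slot matching (subid, relid), or None if there is none."""
    for slot in slots:
        if slot.subid == subid and slot.relid == relid:
            return slot
    return None


def peer_alive_buggy(slots, me, relid):
    # Always looks for the sync worker, so a sync worker finds itself.
    return find_worker(slots, me.subid, relid) is not None


def peer_alive_fixed(slots, me, relid):
    # Look for the opposite worker: apply looks for sync, sync looks for apply.
    target = relid if me.relid is None else None
    return find_worker(slots, me.subid, target) is not None
```

If the apply worker has died and only the sync worker for table 42 is
left, the buggy check still reports a live peer (the sync worker matched
itself), so it keeps waiting; the fixed check reports that the peer is
gone and the sync worker can exit.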

There is still an outstanding issue: the sync worker will keep running
inside a long COPY, because invalidation messages are also not processed
until it finishes. But all the original issues reported here disappear
for me with the attached patches applied.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0003-Receive-invalidation-messages-correctly-in-tablesync.patch binary/octet-stream 2.3 KB
0002-Make-tablesync-worker-exit-when-apply-dies-while-it-.patch binary/octet-stream 2.2 KB
0001-Fix-signal-handling-in-logical-workers.patch binary/octet-stream 12.0 KB
