From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Lakshmi Narayanan Sreethar <lakshmi(at)timescale(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: DROP DATABASE deadlocks with logical replication worker in PG 15.1 |
Date: | 2023-01-17 20:04:32 |
Message-ID: | 20230117200432.xaoenn7ni7srb2l2@awork3.anarazel.de |
Lists: | pgsql-bugs |
Hi,
On 2023-01-17 06:23:45 +0530, Amit Kapila wrote:
> As per my initial analysis, I have added this code to hold/resume
> interrupts during slot creation due to the test failure (in buildfarm)
> reported in the email [1]. It is clearly a wrong fix as per the report
> and discussion in this thread.
Yea. You really can never hold interrupts across something that could
block indefinitely. A HOLD_INTERRUPTS() while doing error recovery (as in
DisableSubscriptionAndExit()) is fine; that's basically a finite amount of
work. But doing so while issuing SQL commands to another node, or anything
else that could just block indefinitely, isn't.
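To make the point concrete, here is a minimal toy model of the holdoff mechanism. It is a sketch, not PostgreSQL source: the lower-case names (hold_interrupts, resume_interrupts, check_for_interrupts, wait_for_remote and the two cancelled_* helpers) are hypothetical stand-ins that only mimic the counter-based behaviour of HOLD_INTERRUPTS(), RESUME_INTERRUPTS() and CHECK_FOR_INTERRUPTS(), and the bounded polling loop stands in for a wait on a remote node that never answers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not PostgreSQL's actual code. */
static volatile int  holdoff_count = 0;          /* cf. InterruptHoldoffCount */
static volatile bool interrupt_pending = false;  /* normally set from a signal handler */
static int interrupts_serviced = 0;

static void hold_interrupts(void)
{
    holdoff_count++;
}

static void resume_interrupts(void)
{
    assert(holdoff_count > 0);
    holdoff_count--;
}

/* A pending interrupt is serviced only when nothing is holding it off. */
static void check_for_interrupts(void)
{
    if (interrupt_pending && holdoff_count == 0)
    {
        interrupt_pending = false;
        interrupts_serviced++;
    }
}

/*
 * Simulates waiting for a reply that never arrives, polling max_polls times.
 * Returns true if a cancel/terminate request was serviced during the wait.
 */
static bool wait_for_remote(int max_polls)
{
    int before = interrupts_serviced;

    for (int i = 0; i < max_polls; i++)
    {
        check_for_interrupts();
        if (interrupts_serviced != before)
            return true;
    }
    return false;
}

/* Waiting with interrupts held: the request stays pending for the whole wait. */
static bool cancelled_while_held(void)
{
    bool cancelled;

    interrupt_pending = true;   /* e.g. another session asks this worker to exit */
    hold_interrupts();
    cancelled = wait_for_remote(1000);
    resume_interrupts();
    return cancelled;
}

/* Waiting without a hold: the first poll services the request. */
static bool cancelled_without_hold(void)
{
    interrupt_pending = true;
    return wait_for_remote(1000);
}
```

In this model cancelled_while_held() never sees the request no matter how large the poll budget is, which is the toy equivalent of a worker that cannot be terminated while it waits on the other node. A hold around a finite amount of local work does not have that problem, because resume_interrupts() is reached after a bounded number of steps.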
> There is an analysis of the test
> failure in the email [2] which explains the race condition that leads
> to the test failure. Thinking again about the failure, I feel we can
> instead change the failed test (t/004_sync.pl) to either ensure that
> both the walsenders (corresponding to sync worker and apply worker)
> exit after dropping the subscription and before checking the
> remaining slots on publisher, or wait for slots to become zero in the
> test.
How about waiting for the table to start to be synced (and thus the slot to be
created) before issuing the drop subscription? If the slot hadn't yet been
created, the test doesn't prove that we successfully clean up...
Greetings,
Andres Freund