From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
Subject: | Re: Failure in subscription test 004_sync.pl |
Date: | 2021-06-12 12:57:02 |
Message-ID: | CAA4eK1KvjkHPSJ1uzfcVOvLBLcfsiz6gT78Jr25=YaQKKaGbZg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Jun 12, 2021 at 1:13 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> wrasse has just failed with what looks like a timing error with a
> replication slot drop:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=wrasse&dt=2021-06-12%2006%3A16%3A30
>
> Here is the error:
> error running SQL: 'psql:<stdin>:1: ERROR: could not drop replication
> slot "tap_sub" on publisher: ERROR: replication slot "tap_sub" is
> active for PID 1641'
>
> It seems to me that this just lacks a poll_query_until() doing some
> slot monitoring?
>
I think it is showing a race condition issue in the code. In
DropSubscription, we first stop the worker that is receiving the WAL,
and then in a separate connection with the publisher, it tries to drop
the slot which leads to this error. The reason is that walsender is
still active as we just wait for wal receiver (or apply worker) to
stop. Normally, as soon as the apply worker is stopped the walsender
detects it and exits but in this case, it took some time to exit, and
in the meantime, we tried to drop the slot which is still in use by
walsender.
If we want to fix this, we might want to wait till the slot is active
on the publisher before trying to drop it but not sure if it is a good
idea. In the worst case, if the user retries this operation (Drop
Subscription), it will succeed.
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Dunstan | 2021-06-12 13:05:57 | Re: Race condition in recovery? |
Previous Message | Zhihong Yu | 2021-06-12 12:55:13 | Re: Use pg_nextpower2_* in a few more places |