From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | depesz(at)depesz(dot)com, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: WAL segments removed from primary despite the fact that logical replication slot needs it. |
Date: | 2023-02-07 08:20:26 |
Message-ID: | CAD21AoBVhYnGBuW_o=wEGgTp01qiHNAx1a14b1X9kFXmuBe=sg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Tue, Feb 7, 2023 at 12:22 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2023-Feb-06, Masahiko Sawada wrote:
>
> > I've also confirmed that this issue is fixed by the attached patch,
> > which clears candidate_restart_lsn and friends during
> > ReplicationSlotRelease().
>
> Hmm, interesting -- I was studying some other bug recently involving the
> xmin of a slot that had been invalidated and I remember wondering if
> these "candidate" fields were being properly ignored when the slot is
> marked not in use; but I didn't check. Are you sure that resetting them
> when the slot is released is the appropriate thing to do? I mean,
> shouldn't they be kept set while the slot is in use, and only reset if
> we destroy it?
Thinking about the root cause more, it seems to me that the root cause
is not the fact that candidate_xxx values are not cleared when being
released.
In the scenario I reproduced, after restarting the logical decoding,
the walsender sets the restart_lsn to a candidate_restart_lsn left in
the slot upon receiving the ack from the subscriber. Then, when
decoding a RUNNING_XACTS record, the walsender updates the
restart_lsn, which actually retreats it. The patch, which clears the
candidate_xxx values at releasing, fixes this issue by not updating
the restart_lsn upon receiving the ack in this case, but I think it's
legitimate that the walsender sets the restart_lsn to the
candidate_restart_lsn left in the slot in this case. The root cause
here seems to me that it allows the restart_lsn to retreat. In
LogicalIncreaseRestartDecodingForSlot(), we have the following codes:
/* don't overwrite if have a newer restart lsn */
if (restart_lsn <= slot->data.restart_lsn)
{
}
/*
* We might have already flushed far enough to directly accept this lsn,
* in this case there is no need to check for existing candidate LSNs
*/
else if (current_lsn <= slot->data.confirmed_flush)
{
slot->candidate_restart_valid = current_lsn;
slot->candidate_restart_lsn = restart_lsn;
/* our candidate can directly be used */
updated_lsn = true;
}
/*
* Only increase if the previous values have been applied, otherwise we
* might never end up updating if the receiver acks too slowly. A missed
* value here will just cause some extra effort after reconnecting.
*/
if (slot->candidate_restart_valid == InvalidXLogRecPtr)
{
slot->candidate_restart_valid = current_lsn;
slot->candidate_restart_lsn = restart_lsn;
SpinLockRelease(&slot->mutex);
}
If slot->candidate_restart_valid is InvalidXLogRecPtr, he
candidate_restart_lsn could be set to a LSN lower than the slot's
restart_lsn in spite of the first comment saying "don't overwrite if
have a newer restart lsn". So I think the right fix here is to not
allow the restart_lsn to retreat by changing "if" to "else if" of the
third block.
>
> (But, actually, in that case, maybe we don't need to reset them: instead
> we need to make sure to ignore the values for slots that are not in_use.
> However, I don't know what the semantics are for these "candidate"
> fields, so maybe this is wrong.)
IIUC in that case, the WAL segments are removed while the slot is in_use.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | PG Bug reporting form | 2023-02-07 13:16:08 | BUG #17781: Assert in setrefs.c |
Previous Message | Qu, Mischa, Majorel China | 2023-02-07 04:05:13 | 答复: exceptional result of postres_fdw external table joining local table |