Re: Duplicated LSN in ReorderBuffer

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ildar Musin <ildar(at)adjust(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Duplicated LSN in ReorderBuffer
Date: 2019-07-28 23:09:52
Message-ID: 6ba31f29-5f93-5abb-55fa-350667b452b7@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 27. 07. 19 3:15, Andres Freund wrote:
> Hi,
>
> Petr, Simon, see the potential issue related to fast forward at the
> bottom.
>
> [..snip..]
>
> This actually made me look at the nearby changes due to
>
> commit 9c7d06d60680c7f00d931233873dee81fdb311c6
> Author: Simon Riggs <simon(at)2ndQuadrant(dot)com>
> Date: 2018-01-17 11:38:34 +0000
>
> Ability to advance replication slots
>
> and uhm, I'm not sure they're fully baked. Something like:
>
> /*
> * If we don't have snapshot or we are just fast-forwarding, there is no
> * point in decoding changes.
> */
> if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
> ctx->fast_forward)
> return;
>
> case XLOG_HEAP2_MULTI_INSERT:
> if (!ctx->fast_forward &&
> SnapBuildProcessChange(builder, xid, buf->origptr))
> DecodeMultiInsert(ctx, buf);
> break;
>
> is not very suggestive of that (note the second check).
>

You mean that it's redundant, yeah.., although given your next point,
see bellow.

>
> And related to the point of the theorizing above, I don't think skipping
> XLOG_HEAP2_NEW_CID entirely when forwarding is correct. As a NEW_CID
> record does not imply an invalidation message as discussed above, we'll
> afaict compute wrong snapshots when such transactions are encountered
> during forwarding. And we'll then log those snapshots to disk. Which
> then means the slot cannot safely be used for actual decoding anymore -
> as we'll use that snapshot when starting up decoding without fast
> forwarding.
>

Hmm, I guess that's true. I think I have convinced myself that CID does
not matter outside of this transaction, but since we might actually
share the computed snapshot via file save/restore with other slots, any
non-fast-forwarding decoding that reads the same transaction could miss
the CID thanks to the shared snapshot which does not include it.

Given that we don't process any other records in this function besides
XLOG_HEAP2_MULTI_INSERT and XLOG_HEAP2_NEW_CID, it seems like simplest
fix is to just remove the first check for fast forward and keep the one
in XLOG_HEAP2_MULTI_INSERT.

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2019-07-29 01:37:28 Re: POC: Cleaning up orphaned files using undo logs
Previous Message Thomas Munro 2019-07-28 22:59:02 Re: Commitfest 2019-07, the first of five* for PostgreSQL 13