Re: Possible bug in logical replication.

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Arseny Sher <a(dot)sher(at)postgrespro(dot)ru>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, konstantin knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Possible bug in logical replication.
Date: 2018-07-19 02:45:27
Message-ID: 20180719024527.spsjqg3ttudpfv6s@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2018-Jul-19, Michael Paquier wrote:

> On Wed, Jul 18, 2018 at 02:30:53PM -0400, Alvaro Herrera wrote:
> > In the immortal words of Julian Bream: "yeah, I didn't like any of
> > that".
>
> One wikipedia lookup later, I still don't know where this quote comes
> from, but at least I understand who the man is.

https://twitter.com/alvherre/status/1019652397306703873

OK, maybe not that "immortal" after all.

> I may be missing something, but I cannot apply your patch on HEAD so I
> have not tested it. Anyway, I read through it and the thing does not
> look logically wrong.

Sorry, I forgot to mention this -- it applies to 11.

> > I also moved some assignments from the declaration section to the code
> > section, so that I could attach proper comments to each, to improve
> > clarity of *why* we do those things.
>
> To be pedantic here, you could move the declarations of startlsn,
> old_resowner and ctx directly inside the PG_TRY block.

Good idea, thanks.

> > I then noticed that we get a XLogRecord from XLogReadRecord, but then
> > fail to do anything with it, so I changed the code to use a bool
> > instead, which I think is clearer.
>
> Matter of taste perhaps, I was fine with just manipulating the record
> pointer.

Yeah, it works out to the same thing really. Maybe it's just me being
pedantic and annoyed when I realized that logical decoding does not
operate on the record itself but on the xlogreader struct instead.

TBH what I was actually doing at first was just casting the result of
XLogReadRecord to void and not doing anything with it, until I realized
that the logical decoding call below was important because of
side-effects. I thought since fast_forward mode was used,
XLogReadRecord would not return anything anyway -- but I added an assert
and realized that was not so.

> > I think the proposed comment before the LogicalDecodingProcessRecord
> > call failed to convey the important ideas, so I rewrote that one also.
> >
> > There is no struct member called confirmed_flush_lsn anywhere.
>
> This is referring to the system catalog field in pg_replication_slots.

Yeah, I think that's a bit misleading. (I very frequently do a tag-jump
on identifiers in comments, and it's uncomfortable that in this case it
jumps to the Docbook source rather than to the struct declaration.)

> > PG_TRY();
> > {
> > - /* restart at slot's confirmed_flush */
> > + /*
> > + * Create our decoding context in fast_forward mode, passing start_lsn
> > + * as Invalid, so that we start processing from confirmed_flush.
> > + */
>
> I'd rather mention InvalidXLogRecPtr directly here. Invalid alone makes
> no real sense.

OK. I was of two minds about that.

> > + gotrecord = XLogReadRecord(ctx->reader, startlsn, &errm) != NULL;
>
> I would put parenthesis for clarity.

Or just put it back as a record pointer.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-07-19 03:14:36 Re: More consistency for some file-related error message
Previous Message Michael Paquier 2018-07-19 01:54:23 Re: Add SKIP LOCKED to VACUUM and ANALYZE