Re: pg_replication_slot_advance to return NULL instead of 0/0 if slot not advanced

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_replication_slot_advance to return NULL instead of 0/0 if slot not advanced
Date: 2018-06-06 02:04:39
Message-ID: 20180606020439.GG1442@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 05, 2018 at 01:00:30PM +0200, Petr Jelinek wrote:
> I didn't say anything about CreateDecodingContext though. I am talking
> about the fact that we then use the same variable as input to
> XLogReadRecord later in the logical slot code path. The whole point of
> having restart_lsn for logical slot is to have correct start point for
> XLogReadRecord so using the confirmed_flush there is wrong (and has been
> wrong since the original commit).

OK, I finally see the point you are coming at after a couple of hours
brainstorming about it, and indeed using confirmed_lsn is logically
incorrect so the current code is inconsistent with what
DecodingContextFindStartpoint() does, so we rely on restart_lsn to scan
for all the records and the decoder state is here to make sure that the
slot's confirmed_lsn is updated to a consistent position. I have added
your feedback in the attached (indented and such), which results in some
simplifications with a couple of routines.

I am attaching as well the patch I sent yesterday. 0001 is candidate
for a back-patch, 0002 is for HEAD to fix the slot advance stuff.

There is another open item related to slot advancing here:
https://www.postgresql.org/message-id/2840048a-1184-417a-9da8-3299d207a1d7%40postgrespro.ru
And per my tests the patch is solving this item as well. I have just
used the test mentioned in the thread which:
1) creates a slot
2) does some activity to generate a couple of WAL pages
3) advances the slot at page boundary
4) Moves again the slot.
This test crashes on HEAD at step 4, and not with the attached.

What do you think?
--
Michael

Attachment Content-Type Size
0001-Fix-and-document-lock-handling-for-in-memory-replica.patch text/x-diff 4.7 KB
0002-Fix-a-couple-of-bugs-with-replication-slot-advancing.patch text/x-diff 5.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dennis Gearon 2018-06-06 02:20:11 Re: Code of Conduct plan
Previous Message Joshua D. Drake 2018-06-06 01:49:40 Re: Code of Conduct plan