From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, fabriziomello(at)gmail(dot)com, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Minimal logical decoding on standbys |
Date: | 2021-10-29 00:44:21 |
Message-ID: | CA+TgmoZd-JqNL1-R3RJ0jQRD+-dc94X0nPJgh+dwdDF0rFuE3g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Oct 28, 2021 at 5:07 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > I think that, in order to have
> > any chance of being acceptable, this would need to be restructured so
> > that it pulls data from an existing relcache entry that is known to be
> > valid, without attempting to create a new one. That is,
> > get_rel_logical_decoding() would need to take a Relation argument, not
> > an OID.
>
> Hm? Once we have a relation we don't really need the helper function anymore.
Well, that's fine, too.
> > I also think it's super-weird that the value being logged is computed
> > using RelationIsAccessibleInLogicalDecoding(). That means that if
> > wal_level < logical, we'll set onCatalogTable = false in the xlog
> > record, regardless of whether that's true or not. Now I suppose it
> > won't matter, because presumably this field is only going to be
> > consulted for whatever purpose when logical replication is active, but
> > I object on principle to the idea of a field whose name suggests that
> > it means one thing and whose value is inconsistent with that
> > interpretation.
>
> Hm. Not sure what a good solution for this is. I don't think we should make
> the field independent of wal_level - it doesn't really mean anything with a
> lower wal_level. And it increases the illusion that the table is guaranteed to
> be a system table or something a bit. Perhaps the field name should hint at
> this being logically decoding related?
Not sure - I don't know what this is for. I did wonder if maybe it
should be testing IsCatalogRelation(relation) ||
RelationIsUsedAsCatalogTable(relation) i.e.
RelationIsAccessibleInLogicalDecoding() with the removal of the
XLogLogicalInfoActive() and RelationNeedsWAL() tests. But since I
don't know what I'm talking about, all I can say for sure right now is
that the field name and the field contents don't seem to align.
> > I also notice that 0003 deletes a comment that says "We need to force
> > hot_standby_feedback to be enabled at all times so the primary cannot
> > remove rows we need," but also that this is the only mention of
> > hot_standby_feedback in the entire patch set. If the existing comment
> > that we need to do something about that is incorrect, we should update
> > it independently of this patch set to be correct. But if the existing
> > comment is correct then there ought to be something in the patch that
> > deals with it.
>
> The patch deals with this - we'll detect the removal of row versions that
> aren't needed anymore and stop decoding. Of course you'll most of the time
> want to use hs_feedback, but sometimes it'll also just be a companion slot on
> the primary or such (think slots for failover or such).
Where and how does this happen?
> > Another part of that same deleted comment says "We need to be able to
> > correctly and quickly identify the timeline LSN belongs to," but I
> > don't see what the patch does about that, either. I'm actually not
> > sure exactly what that's talking about
>
> Hm - could you expand on what you're unclear about re LSN->timeline? It's just
> that we need to read a WAL page for a certain LSN, and for that we need the
> timeline?
I don't know - I'm trying to understand the meaning of a comment that
I think you wrote originally.
> > This function's sister, read_local_xlog_page(), contains a bunch of logic
> > that tries to make sure that we're always reading every record from the
> > right timeline, but there's nothing similar here. I think that would likely
> > have to be fixed in order for decoding to work on standbys, but maybe I'm
> > missing something.
>
> I think that part actually works, afaict they both rely on the same
> XLogReadDetermineTimeline() for that job afaict. What might be missing is
> logic to update the target timeline.
Hmm, OK, perhaps I mis-spoke, but I think we're talking about the same
thing. read_local_xlog_page() has this:
* RecoveryInProgress() will update ThisTimeLineID when it first
* notices recovery finishes, so we only have to maintain it for the
* local process until recovery ends.
*/
if (!RecoveryInProgress())
read_upto = GetFlushRecPtr();
else
read_upto = GetXLogReplayRecPtr(&ThisTimeLineID);
tli = ThisTimeLineID;
That's a bulletproof guarantee that "tli" and "ThisTimeLineID" are up
to date. The other function has nothing similar.
--
Robert Haas
EDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2021-10-29 00:47:27 | Re: Skipping logical replication transactions on subscriber side |
Previous Message | Andres Freund | 2021-10-28 23:52:48 | Re: inefficient loop in StandbyReleaseLockList() |