Re: Attempt to consolidate reading of XLOG page

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: alvherre(at)2ndquadrant(dot)com, thomas(dot)munro(at)gmail(dot)com, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Attempt to consolidate reading of XLOG page
Date: 2019-10-02 07:16:10
Message-ID: 27140.1570000570@antos
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:

> At Tue, 01 Oct 2019 08:28:03 +0200, Antonin Houska <ah(at)cybertec(dot)at> wrote in <2188(dot)1569911283(at)antos>
> > Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:

> > > > The problem of walsender.c is that its implementation of XLogRead() does not
> > > > care about the TLI of the previous read. If the behavior of the new, generic
> > > > implementation should be exactly the same, we need to tell XLogRead() that in
> > > > some cases it also should not compare the current TLI to the previous
> > > > one. That's why I tried to use the NULL pointer, or the InvalidTimeLineID
> > > > earlier.
> > >
> > > Physical wal sender doesn't switch TLI. So I don't think the
> > > behavior doesn't harm (or doesn't fire). openSegment holds the
> > > TLI set at the first call. (Even if future wal sender switches
> > > TLI, the behavior should be needed.)
> >
> > Note that walsender.c:XLogRead() has no TLI argument, however the XLogRead()
> > introduced by the patch does have one. What should be passed for TLI to the
> > new implementation if it's called from walsender.c? I f the check for a segment
> > change looks like this (here "tli" is the argument representing the desired
> > TLI)
>
> TLI is mandatory to generate a wal file name so it must be passed
> to the function anyways. In the current code it is sendTimeLine
> for the walsender.c:XLogRead(). logical_read_xlog_page sets the
> variable very time immediately before calling
> XLogRead(). CreateReplicationSlot and StartReplication set the
> variable to desired TLI immediately before calling and once it is
> set by StartReplication, it is not changed by XLogSendPhysical
> and wal sender ends at the end of the current timeline. In the
> XLogRead, the value is copied to sendSeg->ws_tli when the file
> for the new timeline is read.

Are you saying that we should pass sendTimeLine to XLogRead()? I think it's
not always correct because sendSeg->ws_tli is sometimes assigned
sendTimeLineNextTLI, so the test "tli != seg->ws_tli" in

> > if (seg->ws_file < 0 ||
> > !XLByteInSeg(recptr, seg->ws_segno, segcxt->ws_segsize) ||
> > tli != seg->ws_tli)
> > {
> > XLogSegNo nextSegNo;

could pass occasionally.

> Mmm. ws_file must be -1 in the case? tli != seg->ws_tli is true
> but seg->ws_file < 0 is also always true at the time. In other
> words, the "tli != seg->ws_tli" is not even evaluated.
>
> If wal sender had an open file (ws_file >= 0) and the new TLI is
> different from ws_tli, it would be the sign of a serious bug.

So we can probably pass ws_tli as the "new TLI" when calling the new
XLogRead() from walsender.c. Is that what you try to say? I need to think
about it more but it sounds like a good idea.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-10-02 07:17:15 Re: Inconsistent usage of BACKEND_* symbols
Previous Message Stephen Frost 2019-10-02 07:09:30 Re: some PostgreSQL 12 release notes comments