Re: Improve WALRead() to suck data directly from WAL buffers when possible

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Improve WALRead() to suck data directly from WAL buffers when possible
Date: 2023-11-04 22:45:00
Message-ID: CALj2ACV+pmifCOktKrr+nYuZSo1R-Ue2TA-iuzTGwpHGw=EQJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Nov 5, 2023 at 2:57 AM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>
> On Sat, 2023-11-04 at 20:55 +0530, Bharath Rupireddy wrote:
> > + XLogRecPtr EndPtr =
> > pg_atomic_read_u64(&XLogCtl->xlblocks[curridx]);
> > +
> > + /*
> > + * xlblocks value can be InvalidXLogRecPtr before
> > the new WAL buffer
> > + * page gets initialized in AdvanceXLInsertBuffer.
> > In such a case
> > + * re-read the xlblocks value under the lock to
> > ensure the correct
> > + * value is read.
> > + */
> > + if (unlikely(XLogRecPtrIsInvalid(EndPtr)))
> > + {
> > + LWLockAcquire(WALBufMappingLock,
> > LW_EXCLUSIVE);
> > + EndPtr = pg_atomic_read_u64(&XLogCtl-
> > >xlblocks[curridx]);
> > + LWLockRelease(WALBufMappingLock);
> > + }
> > +
> > + Assert(!XLogRecPtrIsInvalid(EndPtr));
>
> Can that really happen? If the EndPtr is invalid, that means the page
> is in the process of being cleared, so the contents of the page are
> undefined at that time, right?

My initial thoughts were this way - xlblocks is being read without
holding WALBufMappingLock in XLogWrite() and since we write
InvalidXLogRecPtr to xlblocks array elements temporarily before
MemSet-ting the page in AdvanceXLInsertBuffer(), the PANIC "xlog write
request %X/%X is past end of log %X/%X" might get hit if EndPtr read
from xlblocks is InvalidXLogRecPtr. FWIW, an Assert(false); within the
if (unlikely(XLogRecPtrIsInvalid(EndPtr))) block didn't hit in make
check-world.

It looks like my above understanding isn't correct because it can
never happen that the page that's being written to the WAL file gets
initialized in AdvanceXLInsertBuffer(). I'll remove this piece of code
in next version of the patch unless there are any other thoughts.

[1]
/*
* Within the loop, curridx is the cache block index of the page to
* consider writing. Begin at the buffer containing the next unwritten
* page, or last partially written page.
*/
curridx = XLogRecPtrToBufIdx(LogwrtResult.Write);

while (LogwrtResult.Write < WriteRqst.Write)
{
/*
* Make sure we're not ahead of the insert process. This could happen
* if we're passed a bogus WriteRqst.Write that is past the end of the
* last page that's been initialized by AdvanceXLInsertBuffer.
*/
XLogRecPtr EndPtr = pg_atomic_read_u64(&XLogCtl->xlblocks[curridx]);

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-11-04 23:56:05 Re: pg_upgrade and logical replication
Previous Message Bharath Rupireddy 2023-11-04 22:30:00 Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)