Re: Improve WALRead() to suck data directly from WAL buffers when possible

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Improve WALRead() to suck data directly from WAL buffers when possible
Date: 2022-12-25 11:25:26
Message-ID: CAFiTN-uCG-+axA4r02_9OYgKJU_GkJ1=tBCxq_5UvyWmWEBywA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 23, 2022 at 3:46 PM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Mon, Dec 12, 2022 at 8:27 AM Kyotaro Horiguchi
> <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> >
>
> Thanks for providing thoughts.
>
> > At Fri, 9 Dec 2022 14:33:39 +0530, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote in
> > > The patch introduces concurrent readers for the WAL buffers, so far
> > > only there are concurrent writers. In the patch, WALRead() takes just
> > > one lock (WALBufMappingLock) in shared mode to enable concurrent
> > > readers and does minimal things - checks if the requested WAL page is
> > > present in WAL buffers, if so, copies the page and releases the lock.
> > > I think taking just WALBufMappingLock is enough here as the concurrent
> > > writers depend on it to initialize and replace a page in WAL buffers.
> > >
> > > I'll add this to the next commitfest.
> > >
> > > Thoughts?
> >
> > This adds copying of the whole page (at least) at every WAL *record*
> > read,
>
> In the worst case yes, but that may not always be true. On a typical
> production server with decent write traffic, it happens that the
> callers of WALRead() read a full WAL page of size XLOG_BLCKSZ bytes or
> MAX_SEND_SIZE bytes.

I agree with this.

> > This patch copies the bleeding edge WAL page without recording the
> > (next) insertion point nor checking whether all in-progress insertion
> > behind the target LSN have finished. Thus the copied page may have
> > holes. That being said, the sequential-reading nature and the fact
> > that WAL buffers are zero-initialized may make it work for recovery,
> > but I don't think this also works for replication.
>
> WALRead() callers are smart enough to take the flushed bytes only.
> Although they read the whole WAL page, they calculate the valid bytes.

Right

On first read the patch looks good, although it needs some more
thoughts on 'XXX' comments in the patch.

And also I do not like that XLogReadFromBuffers() is using 3 bools
hit/partial hit/miss, instead of this we can use an enum or some
tristate variable, I think that will be cleaner.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-12-25 12:01:53 Re: [PATCH] Enable using llvm jitlink as an alternative llvm jit linker of old Rtdyld.
Previous Message Andrew Dunstan 2022-12-24 20:23:38 Re: Error-safe user functions