From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Jeff Davis <pgsql(at)j-davis(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Subject: | Re: Improve WALRead() to suck data directly from WAL buffers when possible |
Date: | 2023-11-08 07:40:34 |
Message-ID: | CALj2ACUSydTHPcucftm4XdiJ4osH8zeNENyADd-zfDzEMf6akA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Nov 4, 2023 at 6:13 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> > + cur_lsn = GetFlushRecPtr(NULL);
> > + if (unlikely(startptr > cur_lsn))
> > + elog(ERROR, "WAL start LSN %X/%X specified for reading from WAL buffers must be less than current database system WAL LSN %X/%X",
> > + LSN_FORMAT_ARGS(startptr), LSN_FORMAT_ARGS(cur_lsn));
>
> Hm, why does this check belong here? For some tools it might be legitimate to
> read the WAL before it was fully flushed.
Agreed and removed the check.
> > + /*
> > + * Holding WALBufMappingLock ensures inserters don't overwrite this value
> > + * while we are reading it. We try to acquire it in shared mode so that
> > + * the concurrent WAL readers are also allowed. We try to do as less work
> > + * as possible while holding the lock to not impact concurrent WAL writers
> > + * much. We quickly exit to not cause any contention, if the lock isn't
> > + * immediately available.
> > + */
> > + if (!LWLockConditionalAcquire(WALBufMappingLock, LW_SHARED))
> > + return 0;
>
> That seems problematic - that lock is often heavily contended. We could
> instead check IsWALRecordAvailableInXLogBuffers() once before reading the
> page, then read the page contents *without* holding a lock, and then check
> IsWALRecordAvailableInXLogBuffers() again - if the page was replaced in the
> interim we read bogus data, but that's a bit of a wasted effort.
In the new approach described upthread here
https://www.postgresql.org/message-id/c3455ab9da42e09ca9d059879b5c512b2d1f9681.camel%40j-davis.com,
there's no lock required for reading from WAL buffers. PSA patches for
more details.
> > + /*
> > + * The fact that we acquire WALBufMappingLock while reading the WAL
> > + * buffer page itself guarantees that no one else initializes it or
> > + * makes it ready for next use in AdvanceXLInsertBuffer(). However, we
> > + * need to ensure that we are not reading a page that just got
> > + * initialized. For this, we look at the needed page header.
> > + */
> > + phdr = (XLogPageHeader) page;
> > +
> > + /* Return, if WAL buffer page doesn't look valid. */
> > + if (!(phdr->xlp_magic == XLOG_PAGE_MAGIC &&
> > + phdr->xlp_pageaddr == (ptr - (ptr % XLOG_BLCKSZ)) &&
> > + phdr->xlp_tli == tli))
> > + break;
>
> I don't think this code should ever encounter a page where this is not the
> case? We particularly shouldn't do so silently, seems that could hide all
> kinds of problems.
I think it's possible to read a "just got initialized" page with the
new approach to read WAL buffer pages without WALBufMappingLock if the
page is read right after it is initialized and xlblocks is filled in
AdvanceXLInsertBuffer() but before actual WAL is written.
> > + /*
> > + * Note that we don't perform all page header checks here to avoid
> > + * extra work in production builds; callers will anyway do those
> > + * checks extensively. However, in an assert-enabled build, we perform
> > + * all the checks here and raise an error if failed.
> > + */
>
> Why?
Minimal page header checks are performed to ensure we don't read the
page that just got initialized unlike what
XLogReaderValidatePageHeader(). Are you suggesting to remove page
header checks with XLogReaderValidatePageHeader() for assert-enabled
builds? Or are you suggesting to do page header checks with
XLogReaderValidatePageHeader() for production builds too?
PSA v16 patch set. Note that 0004 patch adds support for WAL read
stats (both from WAL file and WAL buffers) to walsenders and may not
necessarily the best approach to capture WAL read stats in light of
https://www.postgresql.org/message-id/CALj2ACU_f5_c8F%2BxyNR4HURjG%3DJziiz07wCpQc%3DAqAJUFh7%2B8w%40mail.gmail.com
which adds WAL read/write/fsync stats to pg_stat_io.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
v16-0001-Use-64-bit-atomics-for-xlblocks-array-elements.patch | application/x-patch | 7.2 KB |
v16-0002-Allow-WAL-reading-from-WAL-buffers.patch | application/x-patch | 9.4 KB |
v16-0003-Add-test-module-for-verifying-WAL-read-from-WAL-.patch | application/x-patch | 8.9 KB |
v16-0004-Add-support-for-collecting-WAL-read-stats-for-wa.patch | application/octet-stream | 19.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Xiang Gao | 2023-11-08 07:44:11 | Question about the Implementation of vector32_is_highbit_set on ARM |
Previous Message | Tatsuo Ishii | 2023-11-08 07:37:05 | Re: Row pattern recognition |