From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Subject: | Re: Improve WALRead() to suck data directly from WAL buffers when possible |
Date: | 2024-01-26 14:01:24 |
Message-ID: | CALj2ACW65mqn6Ukv57SqDTMzAJgd1N_AdQtDgy+gMDqu6v618Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jan 26, 2024 at 8:31 AM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>
> > PSA v20 patch set.
>
> 0001 is very close. I have the following suggestions:
>
> * Don't just return zero. If the caller is doing something we don't
> expect, we want to fix the caller. I understand you'd like this to be
> more like a transparent optimization, and we may do that later, but I
> don't think it's a good idea to do that now.
+ if (RecoveryInProgress() ||
+ tli != GetWALInsertionTimeLine())
+ return ntotal;
+
+ Assert(!XLogRecPtrIsInvalid(startptr));
Are you suggesting to error out instead of returning 0? If yes, I
disagree with it. Because, failure to read due to unmet pre-conditions
doesn't necessarily have to be to error out. If we error out, the
immediate failure we see is in the src/bin/psql TAP test for calling
XLogReadFromBuffers when the server is in recovery. How about
returning a negative value instead of just 0 or returning true/false
just like WALRead?
> * There's currently no use for reading LSNs between Write and Insert,
> so remove the WaitXLogInsertionsToFinish() code path. That also means
> we don't need the extra emitLog parameter, so we can remove that. When
> we have a use case, we can bring it all back.
I disagree with this. I don't see anything wrong with
XLogReadFromBuffers having the capability to wait for in-progress
insertions to finish. In fact, it makes the function near-complete.
Imagine, implementing an extension (may be for fun or learning or
educational or production purposes) to read unflushed WAL directly
from WAL buffers using XLogReadFromBuffers as page_read callback with
xlogreader facility. AFAICT, I don't see a problem with
WaitXLogInsertionsToFinish logic in XLogReadFromBuffers.
FWIW, one important aspect of XLogReadFromBuffers is its ability to
read the unflushed WAL from WAL buffers. Also, see a note from Andres
here https://www.postgresql.org/message-id/20231109205836.zjoawdrn4q77yemv%40awork3.anarazel.de.
> If you agree, I can just make those adjustments (and do some final
> checking) and commit 0001. Otherwise let me know what you think.
Thanks. Please see my responses above.
> 0002: How does the test control whether the data requested is before
> the Flush pointer, the Write pointer, or the Insert pointer? What if
> the walwriter comes in and moves one of those pointers before the next
> statement is executed?
Tried to keep wal_writer quiet with wal_writer_delay=10000ms and
wal_writer_flush_after = 1GB to not to flush WAL in the background.
Also, disabled autovacuum, and set checkpoint_timeout to a higher
value. All of this is done to generate minimal WAL so that WAL buffers
don't get overwritten. Do you see any problems with it?
> Also, do you think a test module is required for
> the basic functionality in 0001, or only when we start doing more
> complex things like reading past the Flush pointer?
With WaitXLogInsertionsToFinish in XLogReadFromBuffers, we have that
capability already in. Having a separate test module ensures the code
is tested properly.
As far as the test is concerned, it verifies 2 cases:
1. Check if WAL is successfully read from WAL buffers. For this, the
test generates minimal WAL and reads from WAL buffers from the start
LSN = current insert LSN captured before the WAL generation.
2. Check with a WAL that doesn't yet exist. For this, the test reads
from WAL buffers from the start LSN = current flush LSN+16MB (a
randomly chosen higher value).
> 0003: can you explain why this is useful for wal summarizer to read
> from the buffers?
Can the WAL summarizer ever read the WAL on current TLI? I'm not so
sure about it, I haven't explored it in detail.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2024-01-26 14:01:52 | Re: Add SPLIT PARTITION/MERGE PARTITIONS commands |
Previous Message | jian he | 2024-01-26 13:44:34 | Re: Add new error_action COPY ON_ERROR "log" |