Re: Some read stream improvements

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Some read stream improvements
Date: 2025-02-17 05:57:46
Message-ID: CA+hUKGJ43PGTH=69=d=c=f8cwGeug548pMxN-jQ6_CohGayUQA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Feb 17, 2025 at 5:55 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> The solution we agreed on is to introduce a way for StartReadBuffers()
> to communicate with future calls, and "forward" pinned buffers between
> calls. The function arguments don't change, but its "buffers"
> argument becomes an in/out array: one StartReadBuffers() call can
> leave extra pinned buffers after the ones that were included in a
> short read (*nblocks), and then when you retry (or possibly extend)
> the rest of the read, you have to pass them back in. That is easy for
> the read stream code, as it can just leave them in its circular queue
> for the next call to take as input. It only needs to be aware of them
> for pin limit accounting and stream reset (including early shutdown).

BTW here's a small historical footnote about that: I had another
solution to the same problem in the original stream proposal[1], which
used a three-step bufmgr API. You had to call PrepareReadBuffer() for
each block you intended to read, and then StartReadBuffers() for each
cluster of adjacent misses it reported, and finally WaitReadBuffers().
Hits would require only the first step. That was intended to allow
the stream to manage the prepared buffers itself: short reads would
leave some of them for a later call. Based on review feedback, I
simplified to arrive at the two-step API, i.e. just start and wait, and
PrepareReadBuffer() became the private helper function
PinBufferForBlock(). I had to invent the "hide-the-trailing-hit"
trick to avoid an unpin/repin sequence in the two-step API, thinking we
might eventually want to consider the three-step API again later.
This new design achieves the same end result: buffers that can't be
part of an I/O stay pinned and in the stream's queue ready for the
next call, just like "prepared" buffers in the early prototypes, but
it keeps the two-step bufmgr API and calls them "forwarded".
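
To make the "forwarded" idea concrete, here is a rough toy model in C.
It is only a sketch under loud assumptions: ToyPool, toy_start_read()
and toy_finish_read() are hypothetical names invented for illustration,
not the bufmgr API, and the real StartReadBuffers() handles much more
(flags, I/O combining limits, pin limits). The only thing it tries to
show is the in/out array: a buffer pinned beyond a short read stays in
the array, and the next call takes it as input instead of pinning it
again.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_INVALID_BUFFER 0    /* hypothetical marker: slot holds no pinned buffer */
#define TOY_NBLOCKS 16

/* Toy buffer pool (illustrative only): block n always maps to buffer n + 1. */
typedef struct ToyPool
{
    bool cached[TOY_NBLOCKS];   /* does the buffer already hold valid data? */
    int  pins[TOY_NBLOCKS];     /* pin count per block */
} ToyPool;

static int
toy_pin(ToyPool *pool, int block)
{
    pool->pins[block]++;
    return block + 1;
}

/*
 * Toy stand-in for a start-read call.  buffers[] is in/out: entries that
 * are not TOY_INVALID_BUFFER on entry were forwarded by an earlier call
 * and are already pinned.  On return, *nblocks is the number of blocks
 * covered by this call; any valid entries at or beyond that index are
 * forwarded to the next call.  Returns true if an I/O is needed.
 */
static bool
toy_start_read(ToyPool *pool, int first_block, int *buffers, int *nblocks)
{
    int wanted = *nblocks;

    assert(wanted > 0 && first_block + wanted <= TOY_NBLOCKS);

    for (int i = 0; i < wanted; i++)
    {
        int block = first_block + i;

        /* Pin only if an earlier call did not already forward this buffer. */
        if (buffers[i] == TOY_INVALID_BUFFER)
            buffers[i] = toy_pin(pool, block);

        if (pool->cached[block])
        {
            if (i == 0)
            {
                /* A hit on the first block: one block, no I/O needed. */
                *nblocks = 1;
                return false;
            }

            /*
             * A hit ends the run of misses.  The read is short, but the
             * hit buffer stays pinned in buffers[i] for the next call.
             */
            *nblocks = i;
            return true;
        }
    }

    return true;                /* all wanted blocks were misses */
}

/* Toy stand-in for waiting: the blocks that were read are now valid. */
static void
toy_finish_read(ToyPool *pool, int first_block, int nblocks)
{
    for (int i = 0; i < nblocks; i++)
        pool->cached[first_block + i] = true;
}
```

In this model the caller retries the rest of a short read by passing
the tail of the same array back in (buffers + nblocks, starting at
first_block + nblocks), which is roughly the role the circular queue
plays for the read stream.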

[1] https://www.postgresql.org/message-id/flat/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com
