Re: Streaming read-ready sequential scan code

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Streaming read-ready sequential scan code
Date: 2024-04-07 21:36:59
Message-ID: CA+hUKGJDdbmJGCm6CsnkuRKe24jGE1L22eBS3eAG=POXM7cxxA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Apr 7, 2024 at 1:34 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> Attached v13 0001 is your fix and 0002 is a new version of the
> sequential scan streaming read user. Off-list Andres mentioned that I
> really ought to separate the parallel and serial sequential scan users
> into two different callbacks. I've done that in the attached. It
> actually makes the code used by the callbacks nicer and more readable
> anyway (putting aside performance). I was able to measure a small
> performance difference as well.

Thanks. I changed a couple of very trivial things before pushing.

+ BlockNumber (*cb) (ReadStream *pgsr, void *private_data,
+ void *per_buffer_data);

This type has a friendly name: ReadStreamBlockNumberCB.

+ scan->rs_read_stream =
read_stream_begin_relation(READ_STREAM_SEQUENTIAL,

I've been on the fence about that flag for sequential scan... Some
days I want to consider changing to READ_STREAM_DEFAULT and relying on
our "anti-heuristics" to suppress advice, which would work out the
same in most cases but might occasionally win big. It might also
hurt, I dunno, so I suspect we'd have to make it better first, which
my patch in [1] was a first swing at, but I haven't researched that
enough. So, kept this way!

- * Read the next block of the scan relation into a buffer and pin that buffer
- * before saving it in the scan descriptor.
+ * Read the next block of the scan relation from the read stream and pin that
+ * buffer before saving it in the scan descriptor.

Changed to:

* Read the next block of the scan relation from the read stream and save it
* in the scan descriptor. It is already pinned.

+static BlockNumber
+heap_scan_stream_read_next_parallel(ReadStream *pgsr, void *private_data,
+ void *per_buffer_data)

Changed argument names to match the function pointer type definition,
"stream" and "callback_private_data".

BTW looking at the branching in read-stream user patches that have an
initialisation step like yours, I wonder if it might every make sense
to be able to change the callback on the fly from inside the callback,
so that you finish up with a branchless one doing most of the work. I
have no idea if it's worth it...

[1] https://www.postgresql.org/message-id/CA%2BhUKGLLFvou5rx5FDhm-Pc9r4STQTFFmrx6SUV%2Bvk8fwMbreA%40mail.gmail.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2024-04-07 21:40:01 Re: Table AM Interface Enhancements
Previous Message Nazir Bilal Yavuz 2024-04-07 21:30:18 Re: Streaming I/O, vectored I/O (WIP)