Re: Streaming read-ready sequential scan code

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Streaming read-ready sequential scan code
Date: 2024-04-04 01:08:34
Message-ID: CAApHDvoZZ59skkkxr03ygryRR3xEoJzrApw_9vEQbzrLDV=cOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 4 Apr 2024 at 06:03, Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:
> Attached v8 is rebased over current master (which now has the
> streaming read API).

I've looked at the v8-0001 patch.

I wasn't too keen on heapbuildvis() as a function name for an external
function. Since it also does pruning work, it seemed weird to make it
sound like it only did visibility work. Per our offline discussion
about names, I've changed it to what you suggested which is
heap_page_prep().

Aside from that, there was an outdated comment: "In pageatatime mode,
heapgetpage() already did visibility checks," which was no longer true
as that's done in heapbuildvis() (now heap_page_prep()).

I also did a round of comment adjustments as there were a few things I
didn't like, e.g:

+ * heapfetchbuf - subroutine for heapgettup()

This is also used in heapgettup_pagemode(), so I thought it was a bad
idea for a function to list places it thinks it's being called. I
also thought it redundant to write "This routine" in the function head
comment. I think "this routine" is implied by the context. I ended up
with:

/*
* heapfetchbuf - read and pin the given MAIN_FORKNUM block number.
*
* Read the specified block of the scan relation into a buffer and pin that
* buffer before saving it in the scan descriptor.
*/

I'm happy with your changes to heapam_scan_sample_next_block(). I did
adjust the comment above CHECK_FOR_INTERRUPTS() so it was effectively
the same as the seqscan version, just with "seqscan" swapped for
"sample scan".

The only other thing I adjusted there was to use "blockno" in some
places where you were using hscan->rs_cblock. These all come after
the "hscan->rs_cblock = blockno;" line. My thoughts here are that this
is more likely to avoid reading the value from the struct again if the
compiler isn't happy that the two values are still equivalent for some
reason. Even if the compiler is happy today, it would only take a
code change to pass hscan to some external function for the compiler
to perhaps be uncertain if that function has made an adjustment to
rs_cblock and go with the safe option of pulling the value out the
struct again which is a little more expensive as it requires some
maths to figure out the offset.

I've attached v9-0001 and a delta of just my changes from v8.

David

Attachment Content-Type Size
v9-0001-Split-heapgetpage-into-two-parts.patch text/plain 8.5 KB
v8_to_v9-0001_delta.patch.txt text/plain 5.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-04-04 01:14:52 Re: Statistics Import and Export
Previous Message Michael Paquier 2024-04-04 01:05:25 Re: Allow non-superuser to cancel superuser tasks.