From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
Subject: | Re: BitmapHeapScan streaming read user and prelim refactoring |
Date: | 2024-06-19 16:38:35 |
Message-ID: | 07243cb2-b6e5-4bee-9fc0-8ff13e0ee58b@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 6/19/24 17:55, Melanie Plageman wrote:
> On Tue, Jun 18, 2024 at 6:02 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> I went through v22 to remind myself of what the patches do and do some
>> basic review - I have some simple questions / comments for now, nothing
>> major. I've kept the comments in separate 'review' patches, it does not
>> seem worth copying here.
>
> Thanks so much for the review!
>
> I've implemented your feedback in attached v23 except for what I
> mention below. I have not gone through each patch in the new set very
> carefully after making the changes because I think we should resolve
> the question of adding the new table scan descriptor before I do that.
> A change there will require a big rebase. Then I can go through each
> patch very carefully.
>
> From your v22b-0005-review.patch:
>
> src/backend/executor/nodeBitmapHeapscan.c | 14 ++++++++++++++
> src/include/access/tableam.h | 2 ++
> 2 files changed, 16 insertions(+)
>
> diff --git a/src/backend/executor/nodeBitmapHeapscan.c
> b/src/backend/executor/nodeBitmapHeapscan.c
> index e8b4a754434..6d7ef9ced19 100644
> --- a/src/backend/executor/nodeBitmapHeapscan.c
> +++ b/src/backend/executor/nodeBitmapHeapscan.c
> @@ -270,6 +270,20 @@ new_page:
>
> BitmapAdjustPrefetchIterator(node);
>
> + /*
> + * XXX I'm a bit unsure if this needs to be handled using
> goto. Wouldn't
> + * it be simpler / easier to understand to have two nested loops?
> + *
> + * while (true)
> + * if (!table_scan_bitmap_next_block(...)) { break; }
> + * while (table_scan_bitmap_next_tuple(...)) {
> + * ... process tuples ...
> + * }
> + *
> + * But I haven't tried implementing this.
> + */
> if (!table_scan_bitmap_next_block(scan, &node->blockno, &node->recheck,
> &node->lossy_pages,
> &node->exact_pages))
> break;
>
> We need to call table_scan_bimtap_next_block() the first time we call
> BitmapHeapNext() on each scan but all subsequent invocations of
> BitmapHeapNext() must call table_scan_bitmap_next_tuple() first each
> -- because we return from BitmapHeapNext() to yield a tuple even when
> there are more tuples on the page. I tried refactoring this a few
> different ways and personally found the goto most clear.
>
OK, I haven't tried refactoring this myself, so you're probably right.
> From your v22b-0010-review.patch:
>
> @@ -557,6 +559,8 @@ ExecReScanBitmapHeapScan(BitmapHeapScanState *node)
> table_rescan(node->ss.ss_currentScanDesc, NULL);
>
> /* release bitmaps and buffers if any */
> + /* XXX seems it should not be right after the comment, also shouldn't
> + * we still reset the prefetch_iterator field to NULL? */
> tbm_end_iterate(&node->prefetch_iterator);
> if (node->tbm)
> tbm_free(node->tbm);
>
> prefetch_iterator is a TBMIterator which is stored in the struct (as
> opposed to having a pointer to it stored in the struct).
> tbm_end_iterate() sets the actual private and shared iterator pointers
> to NULL.
>
Ah, right.
> From your v22b-0017-review.patch
>
> diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
> index 036ef29e7d5..9c711ce0eb0 100644
> --- a/src/include/access/relscan.h
> +++ b/src/include/access/relscan.h
> @@ -52,6 +52,13 @@ typedef struct TableScanDescData
> } TableScanDescData;
> typedef struct TableScanDescData *TableScanDesc;
>
> +/*
> + * XXX I don't understand why we should have this special node if we
> + * don't have special nodes for other scan types.
>
> In this case, up until the final commit (using the read stream
> interface), there are six fields required by bitmap heap scan that are
> not needed by any other user of HeapScanDescData. There are also
> several members of HeapScanDescData that are not needed by bitmap heap
> scans and all of the setup in initscan() for those fields is not
> required for bitmap heap scans.
>
> Also, because the BitmapHeapScanDesc needs information like the
> ParallelBitmapHeapState and prefetch_maximum (for use in prefetching),
> the scan_begin() callback would have to take those as parameters. I
> thought adding so much bitmap table scan-specific information to the
> generic table scan callbacks was a bad idea.
>
> Once we add the read stream API code, the number of fields required
> for bitmap heap scan that are in the scan descriptor goes down to
> three. So, perhaps we could justify having that many bitmap heap
> scan-specific fields in the HeapScanDescData.
>
> Though, I actually think we could start moving toward having
> specialized scan descriptors if the requirements for that scan type
> are appreciably different. I can't think of new code that would be
> added to the HeapScanDescData that would have to be duplicated over to
> specialized scan descriptors.
>
> With the final read stream state, I can see the argument for bloating
> the HeapScanDescData with three extra members and avoiding making new
> scan descriptors. But, for the intermediate patches which have all of
> the bitmap prefetch members pushed down into the HeapScanDescData, I
> think it is really not okay. Six members only for bitmap heap scans
> and two bitmap-specific members to begin_scan() seems bad.
>
> What I thought we plan to do is commit the refactoring patches
> sometime after the branch for 18 is cut and leave the final read
> stream patch uncommitted so we can do performance testing on it. If
> you think it is okay to have the six member bloated HeapScanDescData
> in master until we push the read stream code, I am okay with removing
> the BitmapTableScanDesc and BitmapHeapScanDesc.
>
I admit I don't have a very good idea what the ideal / desired state
look like. My comment is motivated solely by the feeling that it seems
strange to have one struct serving most scan types, and then a special
struct for one particular scan type ...
> + * XXX Also, maybe this should do the naming convention with Data at
> + * the end (mostly for consistency).
> + */
> typedef struct BitmapTableScanDesc
> {
> Relation rs_rd; /* heap relation descriptor */
>
> I really want to move away from these Data typedefs. I find them so
> confusing as a developer, but it's hard to justify ripping out the
> existing ones because of code churn. If we add new scan descriptors, I
> had really hoped to start using a different pattern.
>
Perhaps, I understand that. I'm not a huge fan of Data structs myself,
but I'm not sure it's a great idea to do both things in the same area of
code. That's guaranteed to be confusing for everyone ...
If we want to move away from that, I'd rather rename the nearby structs
and accept the code churn.
> From your v22b-0025-review.patch
>
> diff --git a/src/backend/access/table/tableamapi.c
> b/src/backend/access/table/tableamapi.c
> index a47527d490a..379b7df619e 100644
> --- a/src/backend/access/table/tableamapi.c
> +++ b/src/backend/access/table/tableamapi.c
> @@ -91,6 +91,9 @@ GetTableAmRoutine(Oid amhandler)
>
> Assert(routine->relation_estimate_size != NULL);
>
> + /* XXX shouldn't this check that _block is not set without _tuple?
> + * Also, the commit message says _block is "local helper" but then
> + * why would it be part of TableAmRoutine? */
> Assert(routine->scan_sample_next_block != NULL);
> Assert(routine->scan_sample_next_tuple != NULL);
>
> scan_bitmap_next_block() is removed as a table AM callback here, so we
> don't check if it is set. We do still check if scan_bitmap_next_tuple
> is set if amgetbitmap is not NULL. heapam_scan_bitmap_next_block() is
> now a local helper for heapam_scan_bitmap_next_tuple(). Perhaps I
> should change the name to something like heap_scan_bitmap_next_block()
> to make it clear it is not an implementation of a table AM callback?
>
I'm quite confused by this. How could it not be am AM callback when it's
in the AM routine?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2024-06-19 16:51:31 | Re: BitmapHeapScan streaming read user and prelim refactoring |
Previous Message | Tom Lane | 2024-06-19 16:36:34 | Re: What is a typical precision of gettimeofday()? |