Re: BitmapHeapScan streaming read user and prelim refactoring

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-08-26 13:37:08
Message-ID: 063e4eb4-32d9-439e-a0b1-75565a9835a8@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 19/06/2024 18:55, Melanie Plageman wrote:
> On Tue, Jun 18, 2024 at 6:02 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> I went through v22 to remind myself of what the patches do and do some
>> basic review - I have some simple questions / comments for now, nothing
>> major. I've kept the comments in separate 'review' patches, it does not
>> seem worth copying here.
>
> Thanks so much for the review!
>
> I've implemented your feedback in attached v23 except for what I
> mention below. I have not gone through each patch in the new set very
> carefully after making the changes because I think we should resolve
> the question of adding the new table scan descriptor before I do that.
> A change there will require a big rebase. Then I can go through each
> patch very carefully.
Had a quick look at this after a long pause. I only looked at the first
few, hoping that reviewing them would allow you to commit at least some
of them, making the rest easier.

v23-0001-table_scan_bitmap_next_block-counts-lossy-and-ex.patch

Looks good to me. (I'm not sure if this would be a net positive change
on its own, but it's needed by the later patch so OK)

v23-0002-Remove-table_scan_bitmap_next_tuple-parameter-tb.patch

LGTM

v23-0003-Make-table_scan_bitmap_next_block-async-friendly.patch

> @@ -1955,19 +1954,26 @@ table_relation_estimate_size(Relation rel, int32 *attr_widths,
> */
>
> /*
> - * Prepare to fetch / check / return tuples from `tbmres->blockno` as part of a
> - * bitmap table scan. `scan` needs to have been started via
> - * table_beginscan_bm(). Returns false if there are no tuples to be found on
> - * the page, true otherwise. `lossy_pages` is incremented if the block's
> - * representation in the bitmap is lossy; otherwise, `exact_pages` is
> - * incremented.
> + * Prepare to fetch / check / return tuples as part of a bitmap table scan.
> + * `scan` needs to have been started via table_beginscan_bm(). Returns false if
> + * there are no more blocks in the bitmap, true otherwise. `lossy_pages` is
> + * incremented if the block's representation in the bitmap is lossy; otherwise,
> + * `exact_pages` is incremented.
> + *
> + * `recheck` is set by the table AM to indicate whether or not the tuples
> + * from this block should be rechecked. Tuples from lossy pages will always
> + * need to be rechecked, but some non-lossy pages' tuples may also require
> + * recheck.
> + *
> + * `blockno` is only used in bitmap table scan code to validate that the
> + * prefetch block is staying ahead of the current block.
> *
> * Note, this is an optionally implemented function, therefore should only be
> * used after verifying the presence (at plan time or such).
> */
> static inline bool
> table_scan_bitmap_next_block(TableScanDesc scan,
> - struct TBMIterateResult *tbmres,
> + BlockNumber *blockno, bool *recheck,
> long *lossy_pages,
> long *exact_pages)
> {

The new comment doesn't really explain what *blockno means. Is it an
input or output parameter? How is it used with the prefetching?

v23-0004-Add-common-interface-for-TBMIterators.patch

> +/*
> + * Start iteration on a shared or private bitmap iterator. Note that tbm will
> + * only be provided by private BitmapHeapScan callers. dsa and dsp will only be
> + * provided by parallel BitmapHeapScan callers. For shared callers, one process
> + * must already have called tbm_prepare_shared_iterate() to create and set up
> + * the TBMSharedIteratorState. The TBMIterator is passed by reference to
> + * accommodate callers who would like to allocate it inside an existing struct.
> + */
> +void
> +tbm_begin_iterate(TBMIterator *iterator, TIDBitmap *tbm,
> + dsa_area *dsa, dsa_pointer dsp)
> +{
> + Assert(iterator);
> +
> + iterator->private_iterator = NULL;
> + iterator->shared_iterator = NULL;
> + iterator->exhausted = false;
> +
> + /* Allocate a private iterator and attach the shared state to it */
> + if (DsaPointerIsValid(dsp))
> + iterator->shared_iterator = tbm_attach_shared_iterate(dsa, dsp);
> + else
> + iterator->private_iterator = tbm_begin_private_iterate(tbm);
> +}

Hmm, I haven't looked at how this is used the later patches, but a
function signature where some parameters are used or not depending on
the situation seems a bit awkward. Perhaps it would be better to let the
caller call tbm_attach_shared_iterate(dsa, dsp) or
tbm_begin_private_iterate(tbm), and provide a function to turn that into
a TBMIterator? Something like:

TBMIterator *tbm_iterator_from_shared_iterator(TBMSharedIterator *);
TBMIterator *tbm_iterator_from_private_iterator(TBMPrivateIterator *);

Does tbm_iterator() really need the 'exhausted' flag? The private and
shared variants work without one.

+1 on this patch in general, and I have no objections to its current
form either, the above are just suggestions to consider.

v23-0006-BitmapHeapScan-uses-unified-iterator.patch

Makes sense. (Might be better to squash this with the previous patch)

v23-0007-BitmapHeapScan-Make-prefetch-sync-error-more-det.patch

LGTM

v23-0008-Push-current-scan-descriptor-into-specialized-sc.patch
v23-0009-Remove-ss_current-prefix-from-ss_currentScanDesc.patch

LGTM. I would squash these together.

v23-0010-Add-scan_in_progress-to-BitmapHeapScanState.patch

> --- a/src/include/nodes/execnodes.h
> +++ b/src/include/nodes/execnodes.h
> @@ -1804,6 +1804,7 @@ typedef struct ParallelBitmapHeapState
> * prefetch_target current target prefetch distance
> * prefetch_maximum maximum value for prefetch_target
> * initialized is node is ready to iterate
> + * scan_in_progress is this a rescan
> * pstate shared state for parallel bitmap scan
> * recheck do current page's tuples need recheck
> * blockno used to validate pf and current block in sync
> @@ -1824,6 +1825,7 @@ typedef struct BitmapHeapScanState
> int prefetch_target;
> int prefetch_maximum;
> bool initialized;
> + bool scan_in_progress;
> ParallelBitmapHeapState *pstate;
> bool recheck;
> BlockNumber blockno;

Hmm, the "is this a rescan" comment sounds inaccurate, because it is set
as soon as the scan is started, not only when rescanning. Other than
that, LGTM.

--
Heikki Linnakangas
Neon (https://neon.tech)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Lakhin 2024-08-26 14:00:00 Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Previous Message Amit Kapila 2024-08-26 13:35:27 Re: Add contrib/pg_logicalsnapinspect