Re: Can rs_cindex be < 0 for bitmap heap scans?

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tv(at)fuzzy(dot)cz>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Can rs_cindex be < 0 for bitmap heap scans?
Date: 2024-12-20 02:35:58
Message-ID: CAMbWs4_bE+NscChbKWzw6HZOipCUyXfA5133qvoXQ654D3B2gQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 20, 2024 at 7:50 AM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> On Wed, Dec 18, 2024 at 9:50 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> > I think this binary search code now has a risk of underflow. If 'mid'
> > is calculated as zero, the second 'if' branch will cause 'end' to
> > underflow.

> What about this instead:
>
> diff --git a/src/backend/access/heap/heapam_handler.c
> b/src/backend/access/heap/heapam_handler.c
> index 2da4e4da13e..fb90fd869c2 100644
> --- a/src/backend/access/heap/heapam_handler.c
> +++ b/src/backend/access/heap/heapam_handler.c
> @@ -2574,11 +2574,8 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
>
> if (scan->rs_flags & SO_ALLOW_PAGEMODE)
> {
> - uint32 start,
> - end;
> -
> - if (hscan->rs_ntuples == 0)
> - return false;
> + uint32 start = 0,
> + end = hscan->rs_ntuples;
>
> /*
> * In pageatatime mode, heap_prepare_pagescan() already did visibility
> @@ -2589,10 +2586,8 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
> * in increasing order, but it's not clear that there would be enough
> * gain to justify the restriction.
> */
> - start = 0;
> - end = hscan->rs_ntuples - 1;
>
> - while (start <= end)
> + while (start < end)
> {
> uint32 mid = (start + end) / 2;
> OffsetNumber curoffset = hscan->rs_vistuples[mid];
> @@ -2600,7 +2595,7 @@ SampleHeapTupleVisible(TableScanDesc scan, Buffer buffer,
> if (tupoffset == curoffset)
> return true;
> else if (tupoffset < curoffset)
> - end = mid - 1;
> + end = mid;
> else
> start = mid + 1;
> }
>
> Or to make it easier to read, here:
>
> uint32 start = 0,
> end = hscan->rs_ntuples;
>
> while (start < end)
> {
> uint32 mid = (start + end) / 2;
> OffsetNumber curoffset = hscan->rs_vistuples[mid];
>
> if (tupoffset == curoffset)
> return true;
> else if (tupoffset < curoffset)
> end = mid;
> else
> start = mid + 1;
> }
>
> I think this gets rid of any subtraction and is still the same.

Looks correct to me.

BTW, I kind of doubt that the overflow risk fixed in 28328ec87 is a
real issue in real-world scenarios. Is it realistically possible to
have more than INT_MAX tuples fit on one heap page?

If it is, then the statement below could also be prone to overflow.

uint32 mid = (start + end) / 2;

Maybe you want to change it to:

uint32 mid = start + (end - start) / 2;

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2024-12-20 02:53:06 Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
Previous Message Peter Smith 2024-12-20 02:34:45 Re: Logical Replication of sequences