From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Alexander Lakhin <exclusion(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: BitmapHeapScan streaming read user and prelim refactoring |
Date: | 2025-03-23 17:27:28 |
Message-ID: | CAAKRu_bEZPgAsMK49Kn6JP_+Zq-d6rQP7enP7nk1CYLbHKMv_A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Mar 22, 2025 at 5:04 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> The problem is that sometimes recheck is performed for pending empty
> tuples. The comment about empty tuples says:
>
> /*
> * Bitmap is exhausted. Time to emit empty tuples if relevant. We emit
> * all empty tuples at the end instead of emitting them per block we
> * skip fetching. This is necessary because the streaming read API
> * will only return TBMIterateResults for blocks actually fetched.
> * When we skip fetching a block, we keep track of how many empty
> * tuples to emit at the end of the BitmapHeapScan. We do not recheck
> * all NULL tuples.
> */
> *recheck = false;
> return bscan->rs_empty_tuples_pending > 0;
>
> But we actually emit empty tuples at each page boundary:
>
> /*
> * Out of range? If so, nothing more to look at on this page
> */
> while (hscan->rs_cindex >= hscan->rs_ntuples)
> {
> /*
> * Emit empty tuples before advancing to the next block
> */
> if (bscan->rs_empty_tuples_pending > 0)
> {
> /*
> * If we don't have to fetch the tuple, just return nulls.
> */
> ExecStoreAllNullTuple(slot);
> bscan->rs_empty_tuples_pending--;
> return true;
> }
>
> /*
> * Returns false if the bitmap is exhausted and there are no further
> * blocks we need to scan.
> */
> if (!BitmapHeapScanNextBlock(scan, recheck, lossy_pages, exact_pages))
> return false;
> }
>
> So we don't just process tuples at the end of the scan, but at each page
> boundary.
Whoops! I think an earlier version of the patch did emit all the empty
tuples at the end, but I changed the control flow without moving the
recheck = false and removing/modifying that comment. Quite
embarrassing.
> This leads to wrong results whenever there is a rs_empty_tuples_pending > 0
> after a previous page that needed rechecks, because nothing will reset
> *recheck = false.
>
> And indeed, if I add a *recheck = false in that inside the
> /*
> * Emit empty tuples before advancing to the next block
> */
> if (bscan->rs_empty_tuples_pending > 0)
> {
> the problem goes away.
>
>
> A fix should do slightly more, given that the "at the end" comment is wrong.
>
> But I'm wondering if it's worth fixing it, or if we should just rip the logic
> out alltogether, due to the whole VM checking being unsound as we learned in
> the thread I referenced earlie, without even bothering to fix the bug here.
Perhaps it is better I just fix it since ripping out the skip fetch
optimization has to be backported and even though that will look very
different on master than on backbranches, I wonder if people will look
for a "clean" commit on master which removes the feature.
- Melanie
From | Date | Subject | |
---|---|---|---|
Next Message | Melanie Plageman | 2025-03-23 17:30:07 | Re: Regression test postgres_fdw might fail due to autovacuum |
Previous Message | Fujii Masao | 2025-03-23 17:26:35 | Re: Regression test postgres_fdw might fail due to autovacuum |