Re: BitmapHeapScan streaming read user and prelim refactoring

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-04-07 11:37:58
Message-ID: 2df2fd33-20f1-41ce-951b-6fc94ff9acb7@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4/7/24 06:17, Melanie Plageman wrote:
> On Sun, Apr 07, 2024 at 02:27:43AM +0200, Tomas Vondra wrote:
>> On 4/6/24 23:34, Melanie Plageman wrote:
>>> ...
>>>>
>>>> I realized it makes more sense to add a FIXME (I used XXX. I'm not when
>>>> to use what) with a link to the message where Andres describes why he
>>>> thinks it is a bug. If we plan on fixing it, it is good to have a record
>>>> of that. And it makes it easier to put a clear and accurate comment.
>>>> Done in 0009.
>>>>
>>>>> OK, thanks. If think 0001-0008 are ready to go, with some minor tweaks
>>>>> per above (tuple vs. tuples etc.), and the question about the recheck
>>>>> flag. If you can do these tweaks, I'll get that committed today and we
>>>>> can try to get a couple more patches in tomorrow.
>>>
>>> Attached v19 rebases the rest of the commits from v17 over the first
>>> nine patches from v18. All patches 0001-0009 are unchanged from v18. I
>>> have made updates and done cleanup on 0010-0021.
>>>
>>
>> I've pushed 0001-0005, I'll get back to this tomorrow and see how much
>> more we can get in for v17.
>
> Thanks! I thought about it a bit more, and I got worried about the
>
> Assert(scan->rs_empty_tuples_pending == 0);
>
> in heap_rescan() and heap_endscan().
>
> I was worried if we don't complete the scan it could end up tripping
> incorrectly.
>
> I tried to come up with a query which didn't end up emitting all of the
> tuples on the page (using a LIMIT clause), but I struggled to come up
> with an example that qualified for the skip fetch optimization and also
> returned before completing the scan.
>
> I could work a bit harder tomorrow to try and come up with something.
> However, I think it might be safer to just change these to:
>
> scan->rs_empty_tuples_pending = 0
>

Hmmm, good point. I haven't tried, but wouldn't something like "SELECT 1
FROM t WHERE column = X LIMIT 1" do the trick? Probably in a join, as a
correlated subquery?

It seemed OK to me and the buildfarm did not turn red, so I'd leave this
until after the code freeze.

>> What bothers me on 0006-0008 is that the justification in the commit
>> messages is "future commit will do something". I think it's fine to have
>> a separate "prepareatory" patches (I really like how you structured the
>> patches this way), but it'd be good to have them right before that
>> "future" commit - I'd like not to have one in v17 and then the "future
>> commit" in v18, because that's annoying complication for backpatching,
>> (and probably also when implementing the AM?) etc.
>
> Yes, I was thinking about this also.
>

Good we're on the same page.

>> AFAICS for v19, the "future commit" for all three patches (0006-0008) is
>> 0012, which introduces the unified iterator. Is that correct?
>
> Actually, those patches (v19 0006-0008) were required for v19 0009,
> which is why I put them directly before it. 0009 eliminates use of the
> TBMIterateResult for control flow in BitmapHeapNext().
>

Ah, OK. Thanks for the clarification.

> I've rephrased the commit messages to not mention future commits and
> instead focus on what the changes in the commit are enabling.
>
> v19-0006 actually squashed very easily with v19-0009 and is actually
> probably better that way. It is still easy to understand IMO.
>
> In v20, I've attached just the functionality from v19 0006-0009 but in
> three patches instead of four.
>

Good. I'll take a look today.

>> Also, for 0008 I'm not sure we could even split it between v17 and v18,
>> because even if heapam did not use the iterator, what if some other AM
>> uses it? Without 0012 it'd be a problem for the AM, no?
>
> The iterators in the TableScanDescData were introduced in v19-0009. It
> is okay for other AMs to use it. In fact, they will want to use it. It
> is still initialized and set up in BitmapHeapNext(). They would just
> need to call tbm_iterate()/tbm_shared_iterate() on it.
>
> As for how table AMs will cope without the TBMIterateResult passed to
> table_scan_bitmap_next_tuple() (which is what v19 0008 did): they can
> save the location of the tuples to be scanned somewhere in their scan
> descriptor. Heap AM already did this and actually didn't use the
> TBMIterateResult at all.
>

The reason I feel a bit uneasy about putting this in TableScanDescData
is I see that struct as a "description of the scan" (~input parameters
defining what the scan should do), while for runtime state we have
ScanState in execnodes.h. But maybe I'm wrong - I see we have similar
runtime state in the other scan descriptors (like xs_itup/xs_hitup, kill
prior tuple in index scans etc.).

>> Would it make sense to move 0009 before these three patches? That seems
>> like a meaningful change on it's own, right?
>
> Since v19 0009 requires these patches, I don't think we could do that.
> I think up to and including 0009 would be an improvement in clarity and
> function.

Right, thanks for the correction.

>
> As for all the patches after 0009, I've dropped them from this version.
> We are out of time, and they need more thought.
>

+1

> After we decided not to pursue streaming bitmapheapscan for 17, I wanted
> to make sure we removed the prefetch code table AM violation -- since we
> weren't deleting that code. So what started out as me looking for a way
> to clean up one commit ended up becoming a much larger project. Sorry
> about that last minute code explosion! I do think there is a way to do
> it right and make it nice. Also that violation would be gone if we
> figure out how to get streaming bitmapheapscan behaving correctly.
>
> So, there's just more motivation to make streaming bitmapheapscan
> awesome for 18!
>
> Given all that, I've only included the three patches I think we are
> considering (former v19 0006-0008). They are largely the same as you saw
> them last except for squashing the two commits I mentioned above and
> updating all of the commit messages.
>

Right. I do think the patches make sensible changes in principle, but
the later parts need more refinement so let's not rush it.

>> FWIW I don't think it's very likely I'll commit the UnifiedTBMIterator
>> stuff. I do agree with the idea in general, but I think I'd need more
>> time to think about the details. Sorry about that ...
>
> Yes, that makes total sense. I 100% agree.
>
> I do think the UnifiedTBMIterator (maybe the name is not good, though)
> is a good way to simplify the BitmapHeapScan code and is applicable to
> any future TIDBitmap user with both a parallel and serial
> implementation. So, there's a nice, small patch I can register for July.
>
> Thanks again for taking time to work on this!
>

Yeah, The name seems a bit awkward ... but I don't have a better idea. I
like how it "isolates" the complexity and makes the BHS code simpler and
easier to understand.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2024-04-07 12:19:29 Re: LogwrtResult contended spinlock
Previous Message Tender Wang 2024-04-07 10:33:30 Re: Can't find not null constraint, but \d+ shows that