Re: BitmapHeapScan streaming read user and prelim refactoring

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2025-02-13 16:28:53
Message-ID: 8013b7cd-fd8a-4293-8deb-18e0508513ed@vondra.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/13/25 17:01, Melanie Plageman wrote:
> On Thu, Feb 13, 2025 at 10:46 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>>
>> I reviewed v29 today, and I think it's pretty much ready to go.
>>
>> The one part where I don't quite get is 0001, which replaces a
>> FLEXIBLE_ARRAY_MEMBER array with a fixed-length array. It's not wrong,
>> but I don't quite see the benefits / clarity. And I think Thomas might
>> be right we may want to make this dynamic, to save memory.
>>
>> Not a blocker, but I'd probably skip 0001 (unless it's required by the
>> later parts, I haven't checked/tried).
>
> So, on master, it already pallocs an array of size MAX_TUPLES_PER_PAGE
> (which is hard-coded in the tidbitmap API to MaxHeapTuplesPerPage) --
> see tbm_begin_private_iterate().
>
> So we always palloc the same amount of memory. The reason I changed it
> from a flexible sized array to a fixed size is that we weren't using
> the flexibility and having a flexible sized array in the
> TBMIterateResult meant it couldn't be nested in another struct. Since
> I have to separate the TBMIterateResult and TBMIterator to implement
> the read stream API for BHS, once I separate them, I nest the
> TBMIterateResult in the GinScanEntry. If the array of offsets is
> flexible sized, then I would have to manage that memory separately in
> GIN code for the TBMIterateResult..
>
> So, 0001 isn't a change in the amount of memory allocated.
>
> With the read stream API, these TBMIterateResults are palloc'd just
> like we palloc'd one in master. However, we have to have more than one
> at a time.
>

I know it's not changing how much memory we allocate (compared to
master). I haven't thought about the GinScanEntry - yes, flexible array
member would make this a bit more complex.

I don't want to bikeshed this - I'll leave the decision about 0001 to
you. I'm OK with both options.

regards

--
Tomas Vondra

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sébastien 2025-02-13 16:37:46 Re: [Feature Request] INSERT FROZEN to Optimize Large Cold Data Imports and Migrations
Previous Message Greg Sabino Mullane 2025-02-13 16:14:01 Re: [Feature Request] INSERT FROZEN to Optimize Large Cold Data Imports and Migrations