Re: BTScanOpaqueData size slows down tests

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: BTScanOpaqueData size slows down tests
Date: 2025-04-02 16:08:35
Message-ID: 647aitqk7ssezrl4e6mk6lyz3xuzflbgqimmrfsvjinvuxi6sw@7p6emyg3fqrj
Lists: pgsql-hackers

Hi,

On 2025-04-02 12:01:57 -0400, Peter Geoghegan wrote:
> On Wed, Apr 2, 2025 at 11:57 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > I'd assume it's extremely rare for there to be this many items on a page. I'd
> > guess that something like having BTScanPosData->items point to an in-line
> > BTScanPosItem items_inline[N] of 4-16 entries, and dynamically allocating a
> > full-length BTScanPosItem[MaxTIDsPerBTreePage] just in the cases it's needed,
> > would work.
>
> There can still only be MaxIndexTuplesPerPage items on the page (407
> if memory serves) -- deduplication didn't change that.

Sure.

> It isn't at all rare for the scan to have to return about 1350 TIDs
> from a page, though. Any low cardinality index will tend to have
> almost that many TIDs to return on any page that only stores
> duplicates. And the scan will necessarily have to return all of the
> TIDs from such a page, if it has to return any.

I'm not sure what you're arguing for/against here? Obviously we need to handle
that case. I doubt that the overhead of a once-per-scan allocation of a
MaxTIDsPerBTreePage * sizeof(BTScanPosItem) array matters when that many
tuples are returned.
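To make the scheme under discussion concrete, here is a minimal sketch in C. The names (ScanPos, scanpos_reserve, INLINE_ITEMS, etc.) are hypothetical stand-ins, not the actual PostgreSQL structures or APIs: a small inline array covers the common case, and a full-size array is allocated lazily, at most once per scan, only when a page returns more items than fit inline.

```c
#include <stdlib.h>

#define INLINE_ITEMS 16         /* common-case capacity (assumption) */
#define MAX_ITEMS_PER_PAGE 1358 /* stand-in for MaxTIDsPerBTreePage */

typedef struct ScanPosItem
{
	unsigned	heapTid;		/* placeholder for an ItemPointerData */
} ScanPosItem;

typedef struct ScanPos
{
	ScanPosItem *items;			/* points at inline_items or full_items */
	ScanPosItem inline_items[INLINE_ITEMS];
	ScanPosItem *full_items;	/* lazily allocated, kept for scan lifetime */
} ScanPos;

static void
scanpos_init(ScanPos *pos)
{
	pos->items = pos->inline_items;
	pos->full_items = NULL;
}

/* Ensure room for nitems; switch to the full-size array if needed. */
static void
scanpos_reserve(ScanPos *pos, int nitems)
{
	if (nitems <= INLINE_ITEMS)
		return;					/* inline array suffices */
	if (pos->full_items == NULL)
		pos->full_items = malloc(sizeof(ScanPosItem) * MAX_ITEMS_PER_PAGE);
	pos->items = pos->full_items;
}

static void
scanpos_end(ScanPos *pos)
{
	free(pos->full_items);
}
```

Callers would always go through pos->items, so the switch between the inline and heap-allocated arrays is invisible outside scanpos_reserve(); the allocation cost is paid once and only by scans that actually hit a TID-heavy page.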

Greetings,

Andres Freund
