Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, benoit <benoit(at)hopsandfork(dot)com>
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date: 2023-07-27 14:00:31
Message-ID: CAH2-WznprRoyeK1FmxpV6wHZESrjimmvv1J1hTwKuvMbah8LMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 27, 2023 at 7:59 AM Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
> > Basically, the patch that added that feature had to revise the index
> > AM API, in order to support a mode of operation where scans return
> > groupings rather than tuples. Whereas this patch requires none of
> > that. It makes affected index scans as similar as possible to
> > conventional index scans.
>
> Hmm, yes. I see now where my confusion started. You called it out in
> your first paragraph of the original mail, too, but that didn't help
> me then:
>
> The wiki does not distinguish "Index Skip Scans" and "Loose Index
> Scans", but these are not the same.

A lot of people (myself included) were confused on this point for
quite a while. To make matters even more confusing, one of the really
compelling cases for the MDAM design is scans that feed into
GroupAggregates -- preserving index sort order for naturally big index
scans will tend to enable it. One of my examples from the start of
this thread showed just that. (It just so happened that that example
was faster because of all the "skipping" that nbtree *wasn't* doing
with the patch.)

> Yes, that's why I asked: The MDAM paper's examples seem to materialize
> the full predicate up-front, which would require a product of all
> indexed columns' quals in size, so that materialization has a good
> chance to get really, really large. But if we're not doing that
> materialization upfront, then there is no issue with resource
> consumption (except CPU time, which can likely be improved with other
> methods)

I get why you asked. I might have asked the same question.

As I said, the MDAM paper has *surprisingly* little to say about
B-Tree executor stuff -- it's almost all just describing the
preprocessing/transformation process. It seems as if optimizations
like the one from my patch were considered too obvious to talk about
and/or out of scope by the authors. Thinking about the MDAM paper like
that was what made everything fall into place for me. Remember,
"missing key predicates" isn't all that special.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2023-07-27 14:05:52 Reducing memory consumed by RestrictInfo list translations in partitionwise join planning
Previous Message Matthias van de Meent 2023-07-27 13:59:51 Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan