Re: Adding skip scan (including MDAM style range skip scan) to nbtree

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Masahiro Ikeda <ikedamsh(at)oss(dot)nttdata(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Masahiro(dot)Ikeda(at)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, Masao(dot)Fujii(at)nttdata(dot)com
Subject: Re: Adding skip scan (including MDAM style range skip scan) to nbtree
Date: 2025-04-02 23:32:24
Message-ID: CAH2-Wznyiub+1mdf_bk+GqMaYV25v4jpUv=FhAUk6wApP37uTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 1, 2025 at 3:08 PM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> I think it would be useful to show information that we used an index scan but at the same time we skipped the "region" column and I assume we should output how many distinct values the "region" column had.
>
> For example it will look like this "Skip Scan on region (4 distinct values)":
>
> What do you think?

I don't see much value in that. We can sometimes have data skew that
makes the number of distinct values far from representative of how
many index searches were required. We can have 3 distinct prefix
column values within 90% of all leaf pages, while the remaining 10%
all have unique values. Skip scan will work quite well here (at least
compared to a traditional full index scan), but the number of distinct
values makes it look really bad.

> I see that this work is really voluminous and yes, I agree with you that optimization for skipping index scanning
> in terms of displaying information about changing quals, if any, can even be done using Oracle as an example.
> For example, if you introduce a new range ANY(<every possible 'a' value>) due to skipping the first column
> in the index key, it will be useful for users to know. Or if you define a new range that the skip array will consider
> during the index scan. Well, and other things related to this, but not specifically with your patch, for example
> in the case of conflicting conditions or defining boundaries in the case of index scanning, like, when a>1 and a>10
> we need to scan only a>10.
>
> This optimization was also introduced by you earlier even before the patch on skip optimization, but I think it also lacks
> some output in the explain.

I would also like this kind of stuff to appear in EXPLAIN. It's partly
hard because so much stuff happens outside of planning.

> * The availability of opclass skipsupport, which makes skip arrays
> generate their element values by addition/subtraction from the current
> array element, rather than using NEXT/PRIOR sentinel keys.
>
> The sentinel keys act as probes that get the next real (non-sentinel)
> value that we need to look up next. Whereas skip support can often
> successfully guess that (for example) the next value in the index
> after 268 is 269, saving a primitive scan each time (this might not
> happen at all, or it might work only some of the time, or it might
> work all of the time).
>
> To be honest, I missed your point here. If possible, could you explain it in more detail?

So, many types don't (and probably can't) offer skip support. Skip
scan still works there. The most common example of this is "text".

We're still using skip arrays when skipping using (say) a text column.
Conceptually, it's still "WHERE a = ANY(<every possible 'a' value>)",
even with these continuous data types. It is useful to "pretend that
we're using a discrete data type", so that everything can work in the
usual way (remember, I hate special cases). We need to invent another
way to "increment" a text datum, that works in the same way (but
doesn't really require understanding the semantics of text, or
whatever the data type may be).

See my explanation about this here:

https://postgr.es/m/CAH2-WznKyHq_W7heu87z80EHyZepQeWbGuAZebcxZHvOXWCU-w@mail.gmail.com

See the part of my email that begins with "I think that you're
probably still a bit confused because the terminology in this area is
a little confusing. There are two ways of explaining the situation
with types like text and numeric (types that lack skip support)...."

> I agree with you, this is an improvement. "Index Searches: N" shows the number of individual index searches, but
> it is still not clear enough. Here you can additionally determine what happened based on the information about
> the number of scanned pages, but with large amounts of data this is difficult.

The benefit of using skip scan comes from all of the pages that we
*aren't* reading, that we'd usually have to read. Hard to show that.

> By the way, are you planning to commit additional output to the explain about skipped pages? I think, together with
> the previous ones, the information about index scanning would be sufficient for analysis.

Not for Postgres 18.

> Although I am still learning to understand correctly this information in the explain.
> By the way, I have long wanted to ask, maybe you can advise something else to read on this topic?
> Maybe not in this thread, so as not to overload this.

Let's talk about it off list.

> I think it is worth adding "skip scan" information, without it it is difficult in my opinion to evaluate
> whether this index is effective in comparison with another, looking only at the information on
> scanned blocks or Index search or I missed something?

I think that it's particularly worth adding something to EXPLAIN
ANALYZE that makes it obvious that the index in question might not be
what the user thinks that it is. It might be an index that does some
skipping, but is far from optimal. They might have simply overlooked
the fact that there is an "extra" column between the columns that
their query predicate actually uses, which is far slower than what is
possible with a separate index that also omits that column.

Basically, I think that the most important goal should be to try to
help the user to understand when they have completely the wrong idea
about the index. I think that it's much less important to help users
to understand exactly how well a skip scan performs, relative to some
theoretical ideal. The theoretical ideal is just too complicated.

> I'm not sure that I correctly understood about "separation optimizer index path/executor stage". Do you mean that it's better to do all the optimizations during index execution rather than during index execution?

Yes. In general it's good if we can delay decisions about the scan's
behavior until runtime, when we have a more complete picture of what
the index actually looks like.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2025-04-02 23:36:58 Re: Statistics Import and Export
Previous Message Andres Freund 2025-04-02 23:26:08 Re: Question -- why is there no errhint_internal function?