Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
Date: 2024-08-15 21:45:02
Message-ID: CAH2-Wz=t=gNKj-7in1q05ZnPSyaKnmMnD67=avSocPsG_mANuA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 15, 2024 at 4:58 PM Alena Rybakina
<a(dot)rybakina(at)postgrespro(dot)ru> wrote:
> I think that it is enough to pass the IndexScanDesc parameter to the function - this saves us from having to define the planstate type twice.
>
> For this reason, I suggest some changes that I think may improve your patch.

Perhaps it's a little better that way. I'll consider it.

> To be honest, I don't quite understand how information in explain analyze about the number of used primitive indexes
> will help me improve my database system as a user. Perhaps I'm missing something.

There is probably no typical case. The patch shows implementation
details, which need to be interpreted in the context of a particular
problem.

Maybe the problem is that some of the heuristics added by one of my
nbtree patches interact relatively badly with some real world query.
It would be presumptuous of me to say that that will never happen.

> Maybe it can tell me which columns are best to create an index on or something like that?

That's definitely going to be important in the case of skip scan.
Simply showing the user that the index scan skips at all will make
them aware that there are missing index columns. That could be a sign
that they'd be better off not using skip scan at all, by creating a
new index that suits the particular query (by not having the extra
skipped column).

It's almost always possible to beat skip scan by creating a new index
-- whether or not it's worth the trouble/expense of maintaining a
whole new index is the important question. Is this particular query
the most important query *to the business*, for whatever reason? Or is
having merely adequate performance acceptable?

Your OR-to-SAOP-rewrite patch effectively makes two or more bitmap
index scans into one single continuous index scan. Or...does it? The
true number of (primitive) index scans might be "the same" as it was
before (without your patch), or there might really only be one
(primitive) index scan with your patch. Or it might be anywhere in
between those two extremes. Users will benefit from knowing where on
this continuum a particular index scan falls. It's just useful to know
where time is spent.

Knowing this information might even allow the user to create a new
multicolumn index, with columns in an order better suited to an
affected query. It's not so much the cost of descending the index
multiple times that we need to worry about here, even though that's
what we're talking about counting here. Varying index column order
could make an index scan faster by increasing locality. Locality is
usually very important. Few index scans is a good proxy for greater
locality.

It's easiest to understand what I mean about locality with an example.
An index on (a, b) is good for queries with quals such as "where a =
42 and b in (1,2,3,4,5,6,7,8,9)" if it allows such a query to only
access one or two leaf pages, where all of the "b" values of interest
live side by side. Obviously that won't be true if it's the other way
around -- if the typical qual looks more like "where b = 7 and a in
(1,2,3,4,5,6,7,8,9)". This is the difference between 1 primitive
index scan and 9 primitive index scans -- quite a big difference. Note
that the main cost we need to worry about here *isn't* the cost of
descending the index. It's mostly the cost of reading the leaf pages.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2024-08-15 21:46:51 Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
Previous Message Nathan Bossart 2024-08-15 21:34:30 Re: Remove dependence on integer wrapping