Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
Date: 2024-08-27 18:13:54
Message-ID: CAH2-WzmOZ9EM9hHbvQu7_X6AnqaCO9+mVteA_AMHPYJ1XwGs7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 27, 2024 at 1:45 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I do like "Index Searches" better than "Primitive Index Scans."
>
> But I think Matthias had some good points about this being
> btree-specific.

It's not B-Tree specific -- not really. Any index scan that can at
least non-natively support ScalarArrayOps (i.e. SAOP scans that the
executor manages using ExecIndexEvalArrayKeys() + bitmap scans) will
show information that is exactly equivalent to what B-Tree will show,
given a similar ScalarArrayOps query.

There is at best one limited sense in which the information shown is
B-Tree specific: it tends to be more interesting in the case of B-Tree
index scans. You cannot trivially derive the number based on the
number of array keys for B-Tree scans, since nbtree is now clever
about not needlessly searching the index anew. It's quite possible
that other index AMs will in the future be enhanced in about the same
way as nbtree was in commit 5bf748b86b, at which point even this will
no longer apply. (Tom speculated about adding something like that to
GiST recently).

> I'm not sure whether he was completely correct, but
> you seemed to just dismiss his argument and say "well, that can't be
> done," which doesn't seem convincing to me at all.

To be clear, any variation that you can think of *can* be done without
much difficulty. I thought that Matthias was unclear about what he
even wanted, is all.

The problem isn't that there aren't any alternatives. The problem, if
any, is that there are a huge number of slightly different
alternatives. There are hopelessly subjective questions about what the
best trade-off between redundancy and consistency is. I'm absolutely
not set on doing things in exactly the way I've laid out.

What do you think should be done? Note that the number of loops
matters here, in addition to the number of SAOP primitive
scans/searches. If you want to suppress the information shown in the
typical "nsearches == 1" case, what does that mean for the less common
"nsearches == 0" case?

> If, for non-btree
> indexes, the number of index searches will always be the same as the
> loop count, then surely there is some way to avoid cluttering the
> output for non-btree indexes with output that can never be of any use.

Even if we assume that a given index/index AM will never use SAOPs,
it's still possible to show more than one "Index Search" per executor
node execution. For example, when an index scan node is the inner side
of a nestloop join.

I see value in making it obvious to users when and how
pg_stat_all_indexes.idx_scan advances. Being able to easily relate it
to EXPLAIN ANALYZE output is useful, independent of whether or not
SAOPs happen to be used. That's probably the single best argument in
favor of showing "Index Searches: N" unconditionally. But I'm
certainly not going to refuse to budge over that.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Koval 2024-08-27 18:24:35 Re: Add SPLIT PARTITION/MERGE PARTITIONS commands
Previous Message Ayush Vatsa 2024-08-27 17:47:00 Re: Proposal to have INCLUDE/EXCLUDE options for altering option values