Quick Links

Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)

From:	Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To:	Peter Geoghegan <pg(at)bowt(dot)ie>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
Date:	2024-08-27 23:22:20
Message-ID:	CAEze2WiaMgR2sx_GVzmzr8TkYo_Qya7pOFvtHi-9+hG=OCZ3Hw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, 27 Aug 2024 at 23:40, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Tue, Aug 27, 2024 at 5:03 PM Matthias van de Meent
> <boekewurm+postgres(at)gmail(dot)com> wrote:
> > If the counter was put into the BTScanOpaque, rather than the
> > IndexScanDesc, then this could trivially be used in an explain AM
> > callback, as IndexScanDesc and ->opaque are both still available while
> > building the explain output.
>
> Right, "trivial". Except in that it requires inventing a whole new
> general purpose infrastructure.

Which seems to be in the process of being invented already elsewhere.

> Meanwhile, Tom is arguing against even
> showing this very basic information in EXPLAIN ANALYZE.You see the
> problem?

I think Tom's main issue is additional clutter when running just plain
`explain analyze`, and he'd probably be fine with it if this was gated
behind e.g. VERBOSE or a new "get me the AM's view of this node"
-flag.

> > As a result, it wouldn't bloat the
> > IndexScanDesc for other index AMs who might not be interested in this
> > metric.
>
> Why do you persist with the idea that it isn't useful for other index
> AMs?

Because
- there are no other index AMs that would show a count that's
different from loops (Yes, I'm explicitly ignoring bitmapscan's synthetic SAOP)
- because there is already a place where this info is stored.

> I mean it literally works in exactly the same way! It's literally
> indistinguishable to users, and works in a way that's consistent with
> historical behavior/definitions.

Historically, no statistics/explain-only info is stored in the
IndexScanDesc, all data inside that struct is relevant even when
EXPLAIN was removed from the codebase. The same is true for
TableScanDesc
Now, you want to add this metadata to the struct. I'm quite hesitant
to start walking on such a surface, as it might just be a slippery
slope.

> > I don't want anything, or anything done about it, but your statement
> > that all index AMs support SAOP (potentially non-natively) is not
> > true, as the non-native SAOP support is only for bitmap index scans,
> > and index AMs aren't guaranteed to support bitmap index scans (e.g.
> > pgvector's IVFFLAT and HNSW are good examples, as they only support
> > amgettuple).
>
> Yes, there are some very minor exceptions -- index AMs where even
> non-native SAOPs won't be used. What difference does it make?

That not all index types (even: most index types) have no interesting
performance numbers indicated by the count.

> > > > And, in this case,
> > > > the use case seems quite index-specific, at least for IS/IOS nodes.
> > >
> > > I disagree. It's an existing concept, exposed in system views, and now
> > > in EXPLAIN ANALYZE. It's precisely that -- nothing more, nothing less.
> >
> > To be precise, it is not precisely that, because it's a different
> > counter that an AM must update when the pgstat data is updated if it
> > wants the explain output to reflect the stats counter accurately.
>
> Why does that matter?

Because to me it seels like one more thing an existing index AM's
author needs to needlessly add to its index.

> > When an AM forgets to update one of these metrics (or fails to realize they
> > have to both be updated) then they'd be out-of-sync. I'd prefer if an
> > AM didn't have to account for it's statistics in more than one place.
>
> I could easily change the pgstat_count_index_scan macro so that index
> AMs were forced to do both, or neither. (Not that this is a real
> problem.)

That'd be one way to reduce the chances of accidental bugs, which
seems like a good start.

> > > > Can't you pull that data, instead of inventing a new place
> > > > every AMs needs to touch for it's metrics?
> > >
> > > No. At least not in a way that's scoped to a particular index scan.
> >
> > Similar per-node counter data is pulled for the global (!) counters of
> > pgBufferUsage, so why would it be less possible to gather such metrics
> > for just one index's stats here?
>
> I told you why already, when we talked about this privately: there is
> no guarantee that it's the index indicated by the scan
> instrumentation.

For the pgstat entry in rel->pgstat_info, it is _exactly_ guaranteed
to be the index of the IndexScan node. pgBufferUsage happens to be
global, but pgstat_info is gathered at the relation level.

> For example, due to syscache lookups.

Sure, if we're executing a query on catalogs looking at index's
numscans might count multiple index scans if the index scan needs to
access that same catalog table's data through that same catalog index,
but in those cases I think it's an acceptable count difference.

> There's also
> the question of how we maintain the count for things like nestloop
> joins, where invocations of different index scan nodes may be freely
> woven together. So it just won't work.

Gathering usage counters on interleaving execution nodes has been done
for pgBufferUsage, so I don't see how it just won't work. To me, it
seems very realistically possible.

> Besides, I thought that you wanted me to use some new field in
> BTScanOpaque? But now you want me to use a global counter. Which is
> it?

If you think it's important to have this info on all indexes then I'd
prefer the pgstat approach over adding a field in IndexScanDescData.
If instead you think that this is primarily important to expose for
nbtree index scans, then I'd prefer putting it in the BTSO using e.g.
the index AM analyze hook approach, as I think that's much more
elegant than this.

> > While I do think it won't be easy to
> > find a good way to integrate this into EXPLAIN's Instrumentation, I
> > imagine other systems (e.g. table scans) may benefit from a better
> > integration and explanation of pgstat statistics in EXPLAIN, too. E.g.
> > I'd love to be able to explain how many times which function was
> > called in a plans' projections, and what the relevant time expendature
> > for those functions is in my plans.
>
> Seems completely unrelated.

I'd call "exposing function's pgstat data in explain" at least
somewhat related to "exposing indexes' pgstat data in explain".

> > Alternatively, you could update the patch so that only the field in
> > IndexScan would need to be updated by the index AM by making the
> > executor responsible to update the relation's stats at once at the end
> > of the query with the data from the IndexScanDesc.
>
> I don't understand why this is an alternative to the other thing that
> you said. Or even why it's desirable.

I think it would be preferred over requiring Index AMs to maintain 2
fields in 2 very different locations but in the same way with the same
update pattern. With the mentioned change, they'd only have to keep
the ISD's numscans updated with rescans (or, _bt_first/_bt_search's).
Your alternative approach of making pgstat_count_index_scan update
both would probably have the same desired effect of requiring the AM
author to only mind this one entry point for counting index scan
stats.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

In response to

Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans) at 2024-08-27 21:40:20 from Peter Geoghegan

Responses

Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans) at 2024-08-27 23:41:43 from Peter Geoghegan
Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans) at 2024-08-28 13:35:33 from Robert Haas

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2024-08-27 23:37:56	Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
Previous Message	Peter Geoghegan	2024-08-27 23:17:45	Re: PoC: prefetching data between executor nodes (e.g. nestloop + indexscan)