Quick Links

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Lukas Fittl <lukas(at)fittl(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>
Subject:	Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
Date:	2025-03-26 19:37:34
Message-ID:	CA+TgmoZ8qXiZmmn4P9Mk1cf2mjMMLFPOjSasCjuKSiHFcm-ncw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Mon, Mar 24, 2025 at 6:46 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> As I said, I'm not necessarily averse to showing these numbers
> somehow. But I don't think they belong in the default output,
> and I'm not even convinced that VERBOSE is the right place.
> pg_overexplain seems like it could be an ideal home for this
> sort of detail.

I think we're going to be sad if we start shunting things that
non-developers need into pg_overexplain. On the one hand, PostgreSQL
consultants will be annoyed because they'll have to get a contrib
module installed in order to be able to do their jobs, and I don't
think that should be a requirement. On the other hand, PostgreSQL
developers will also be annoyed because once the consultants start
using it they'll complain when we change things, and I think we want
to have the freedom to change things in pg_overexplain. For that
reason, I think that if we choose to display anything here, it should
either be displayed all the time or gated by some in-core option such
as VERBOSE.

I do acknowledge the argument that we don't show details of how costs
are derived in other cases. While I think that point has some
validity, the flip side is that I spend a fairly significant amount of
time attempting to reverse-engineer what the planner did from the
EXPLAIN output, and I find that pretty unenjoyable. The recent change
to show two decimal places on row-count estimation is one that comes
up a heck of a lot, and several people have thanked me for getting
that patch committed because that problem affected them, too. But it's
surely not the only example of a case where it's hard to determine
what happened in the planner from what shows up in EXPLAIN output, and
I think that trying to find ways to improve on that situation is
worthwhile.

I also don't think that we should be too concerned about bloating the
EXPLAIN output in the context of a patch that only affects Memoize.
Memoize nodes are not incredibly common in the query plans that I see,
so even if we added another line or three to the output for each one,
I don't think that would create major problems. On the other hand,
maybe there's an argument that what this patch touches is only the tip
of the iceberg, and that we're eventually going to want the same kinds
of things for Nested Loop and Hash Joins and Merge Joins, and maybe
even more detail that can be displayed in say 3 lines. In that case,
there's a double concern. On the one hand, that really would make the
output a whole lot more verbose, and on the other hand, it might
generate a fair amount of work to maintain it across future planner
changes. I can see deciding to reject changes of that sort on the
grounds that we're not prepared to maintain it, or deciding to gate it
behind a new option on the grounds that it is so verbose that even
people who say EXPLAIN VERBOSE are going to be sad if they get all
that crap by default. I'm not saying that we SHOULD make those
decisions -- I think exposing more detail here could be pretty useful
to people trying to solve query plan problems, including me, so I hope
we don't just kick that idea straight to the curb without due thought
-- but I would understand them.

The part I'm least sure about with respect to the proposed patch is
the actual stuff being displayed. I don't have the experience to know
whether it's useful for tracking down issues. If it's not, then I
agree we shouldn't display it. If it is, then I'm tentatively in favor
of showing it in standard EXPLAIN, possibly only with VERBOSE, with
the caveats from the previous paragraph: if more-common node types are
also going to have a bunch of stuff like this, then we need to think
more carefully. If Memoize is exceptional in needing additional
information displayed, then I think it's fine.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment at 2025-03-24 22:45:54 from Tom Lane

Responses

Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment at 2025-03-27 10:12:38 from Ilia Evdokimov

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Kirill Reshke	2025-03-26 19:48:54	Re: sync_standbys_defined read/write race on startup
Previous Message	Andrei Borodin	2025-03-26 19:32:36	Re: UUID v7