Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Lukas Fittl <lukas(at)fittl(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, Andrei Lepikhov <lepihov(at)gmail(dot)com>
Subject: Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
Date: 2025-04-15 00:23:11
Message-ID: CAApHDvpuM8jW-BGu7B=i6uv+jmndEo-2z_BYvj5JNmx=qV-uSQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 15 Apr 2025 at 11:14, Ilia Evdokimov
<ilya(dot)evdokimov(at)tantorlabs(dot)com> wrote:
> On 14.04.2025 23:53, David Rowley wrote:
> If we can't get consensus for everything people want to add at once
> then maybe the patch could be broken into two, with 0001 being pretty
> much the v4 patch and then have 0002 add the Estimated Hit Ratio.
> Having the physical patches and being able to try it out or view the
> regression test changes might lower the bar on people chipping in with
> their views.
>
>
> Agreed - I’ll split the patch into two: one adds Estimated Capacity and Estimated Distinct Keys, and the second adds Estimated Hit Ratio. I’ll also add regression test differences to make it easier to evaluate the usefulness of each part.

Thanks for the patches.

I'm just looking for ways to allow us to group all three of these
together for the TEXT format type. I see the BUFFERS are displayed
quite compactly, e.g. "Buffers: shared hit=17 read=4".
show_wal_usage() does something similar and so does
show_modifytable_info() for MERGE.

Maybe we could compress the text output a bit with:

"Estimates: capacity=N distinct keys=N hit ratio=N.N%"

If we get it that compact, maybe lookups could fit in there too, i.e.:

"Estimates: capacity=N distinct keys=N lookups=N hit ratio=N.N%"

I'd also like to vote that you modify explain.c and multiply the
hit_ratio by 100 and use "%" as the unit in ExplainPropertyFloat().
Just looking at the raw number of "1.00" in the expected output, it
isn't obvious if the planner expects every lookup to be a cache hit or
just 1% of them.

Also, to get something commitable, you'll also need to modify
explain_memoize() at the top of memoize.sql and add handling to mask
out the value of these new properties. These are not going to be
anywhere near stable enough across platforms to have these shown in
the expected output files.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chapman Flack 2025-04-15 00:32:28 Re: FmgrInfo allocation patterns (and PL handling as staged programming)
Previous Message Jacob Champion 2025-04-15 00:13:35 Re: [PoC] Federated Authn/z with OAUTHBEARER