Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Lukas Fittl <lukas(at)fittl(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
Date: 2025-03-23 21:16:18
Message-ID: CAApHDvoE5S5FkvEq+N3-J9LfaVUpWLOnczOYAOEvBMCY20=pdg@mail.gmail.com
Lists: pgsql-hackers

On Fri, 21 Mar 2025 at 22:02, Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> In the cases I have seen before, the main question is how effective a
> Memoize node was (or may be), i.e. how often the incoming key hits the
> cache. For that, the hit ratio fraction is more understandable to a
> broad audience. That's why, in my experience, when the cache
> reusability factor is good, users are usually okay with increasing the
> cache size to the necessary level and avoiding evictions at all costs.
> So, the predicted evict_ratio also tells us when increasing work_mem
> would enhance the chances of Memoisation.

Can you explain why "Estimated Capacity" and "Estimated Distinct
Lookup Keys" don't answer that? If there are more distinct lookup
keys than there is capacity to store them, then some will be evicted.
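
For illustration, here's a hypothetical plan fragment (the field names
follow the proposed patch; the numbers and exact formatting are
invented):

    ->  Memoize
          Cache Key: t1.x
          Estimated Capacity: 1000
          Estimated Distinct Lookup Keys: 5000

At most 1000 of the 5000 expected keys can be cached at once, so
evictions are certain; no separate evict ratio is needed to see that.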

Once again, I'm not necessarily objecting to hit and evict ratios
being shown; I just want to know that they're actually useful enough
to show and don't just bloat the EXPLAIN output needlessly. So far
your arguments aren't convincing me that they are.

> Having written the last sentence, I came back to the question of why
> work_mem is so universal and is used at each node as the criterion
> for memory allocation size. But that is a different story, I think.

We have to set the limit somehow. We could have done this with a
GUC per node type that uses memory, but it looks like something more
universal was decided on, perhaps to save on GUCs. I don't know the exact
history, but once upon a time, sort_mem existed. Perhaps that
disappeared because we grew more node types that needed to allocate
large, otherwise unbounded amounts of memory. We did more recently
grow a hash_mem_multiplier GUC, so it's not true to say that work_mem
solely controls the limits of each node's memory allocation sizes.
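
As a quick sketch of how the two GUCs interact (assuming the current
behaviour, where hash-based nodes such as Hash Join and Memoize are
capped at work_mem * hash_mem_multiplier):

    -- cap for an individual sort or similar operation
    SET work_mem = '64MB';
    -- hash-based nodes may use work_mem * hash_mem_multiplier,
    -- i.e. up to 128MB per hash node with these settings
    SET hash_mem_multiplier = 2.0;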

David
