From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Subject: | Adjust Memoize hit_ratio calculation |
Date: | 2023-03-20 20:41:36 |
Message-ID: | CAApHDvrV44LwiF4W_qf_RpbGYWSgp1kF=cZr+kTRRaALUfmXqw@mail.gmail.com |
Lists: | pgsql-hackers |
Yesterday, in 785f70957, I adjusted the Memoize costing code to
account for the size of the cache key when estimating how many cache
entries can exist at once in the cache. That effectively makes
Memoize a less likely choice, since fewer entries are now expected to
fit in work_mem.
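For illustration, here's a rough standalone sketch of that entry
estimate; the names (work_mem_bytes, key_bytes, tuple_bytes) are
hypothetical and not the actual cost_memoize_rescan() variables:

#include <math.h>
#include <stdio.h>

/*
 * Illustration only, with hypothetical names: once the cache key width
 * is charged against each entry, fewer entries fit in work_mem.
 */
static double
est_cache_entries(double work_mem_bytes, double key_bytes, double tuple_bytes)
{
	return floor(work_mem_bytes / (key_bytes + tuple_bytes));
}

int
main(void)
{
	/* 4MB work_mem, 8-byte cache key, 100-byte cached tuples */
	printf("ignoring key width: %.0f entries\n",
		   est_cache_entries(4.0 * 1024 * 1024, 0.0, 100.0));
	printf("counting key width: %.0f entries\n",
		   est_cache_entries(4.0 * 1024 * 1024, 8.0, 100.0));
	return 0;
}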
Because that's being changed in v16, I think it might also be a good
idea to fix the hit_ratio calculation problem reported by David
Johnston in [1]. In the attached, I've adjusted David's calculation
slightly so that we divide by Max(ndistinct, est_cache_entries)
instead of ndistinct. This avoids overestimating the hit ratio when
ndistinct is smaller than est_cache_entries. I'd rather fix this now
for v16 than wait until v17 and further adjust the Memoize costing.
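To make the change concrete, here's a standalone sketch of the old and
new formulas as I understand them; the variable names mirror
cost_memoize_rescan(), but this is an illustration rather than the
actual costsize.c code (and it assumes calls > 0 and ndistinct > 0):

#include <stdio.h>

#define Min(x, y)	((x) < (y) ? (x) : (y))
#define Max(x, y)	((x) > (y) ? (x) : (y))

/* Pre-patch formula (as I read it): divides by ndistinct */
static double
old_hit_ratio(double calls, double ndistinct, double est_cache_entries)
{
	double		hit_ratio;

	hit_ratio = 1.0 / ndistinct * Min(est_cache_entries, ndistinct) -
		(ndistinct / calls);

	return Max(hit_ratio, 0.0);	/* ensure we don't go negative */
}

/*
 * Adjusted formula: the fraction of calls that can possibly be hits,
 * scaled by how many of the distinct keys fit in the cache.  Dividing by
 * Max(ndistinct, est_cache_entries) avoids overestimating when ndistinct
 * is smaller than est_cache_entries.
 */
static double
new_hit_ratio(double calls, double ndistinct, double est_cache_entries)
{
	double		hit_ratio;

	hit_ratio = ((calls - ndistinct) / calls) *
		(est_cache_entries / Max(ndistinct, est_cache_entries));

	return Max(hit_ratio, 0.0);
}

int
main(void)
{
	/* e.g. 1000 calls, 500 distinct keys, room for 100 cache entries */
	printf("old: %.3f  new: %.3f\n",
		   old_hit_ratio(1000.0, 500.0, 100.0),
		   new_hit_ratio(1000.0, 500.0, 100.0));
	return 0;
}

With those example numbers (1000 calls, 500 distinct values, room for
100 entries), the old formula clamps to 0.000 while the adjusted one
gives 0.100.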
I've attached a spreadsheet showing the new and old hit_ratio
calculations. Cells C1 - C3 can be adjusted to show the hit ratio
for both the old and new methods.
Any objections?
David
[1] https://postgr.es/m/CAKFQuwZEmcNk3YQo2Xj4EDUOdY6qakad31rOD1Vc4q1_s68-Ew@mail.gmail.com
Attachment | Content-Type | Size |
---|---|---|
adjust_memoize_hit_ratio_calculation.patch | application/octet-stream | 837 bytes |
memoize_cache_hits.ods | application/vnd.oasis.opendocument.spreadsheet | 12.5 KB |