From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Peter Eisentraut <peter(at)eisentraut(dot)org>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Should heapam_estimate_rel_size consider fillfactor? |
Date: | 2023-07-03 17:54:19 |
Message-ID: | 2146d2f2-b445-3669-e231-4a2407f5bb9c@enterprisedb.com |
Lists: | pgsql-hackers |
On 7/3/23 11:40, Tomas Vondra wrote:
> ...
>
> FWIW the reason why the integer division is intentional is most likely
> that we want "floor" semantics - if there's 10.23 rows per page, that
> really means 10 rows per page.
>
> I doubt it makes a huge difference in this particular place, considering
> we're calculating the estimate from somewhat unreliable values, and then
> using it for a rough estimate of relation size.
>
> But from this POV, I think it's more correct to do it "my" way:
>
> density = (usable_bytes_per_page * fillfactor / 100) / tuple_width;
>
> because that's doing *two* separate integer divisions, with floor
> semantics. First we calculate "usable bytes" (rounded down), then
> average number of rows per page (also rounded down).
>
> Corey's formula would do just one integer division. I don't think it
> makes a huge difference, though. I mean, it's just an estimate and so we
> can't expect to be 100% accurate.
>
Pushed, using the formula with two divisions (as in the original patch).
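
For illustration only, here is a minimal standalone sketch of the two variants. The function names are hypothetical, and the single-division form is an assumption about what such a formula could look like (the exact expression is not quoted in this message); this is not the committed code.

```c
#include <stdio.h>

/*
 * Two integer divisions, both with floor semantics for non-negative
 * inputs: first the usable bytes after applying fillfactor, then the
 * number of rows that fit into those bytes.
 */
static int
density_two_divisions(int usable_bytes_per_page, int fillfactor, int tuple_width)
{
    int usable_bytes = usable_bytes_per_page * fillfactor / 100;

    return usable_bytes / tuple_width;
}

/*
 * ASSUMED single-division variant (hypothetical form): everything is
 * folded into one numerator and one denominator, so rounding happens
 * only once.
 */
static int
density_one_division(int usable_bytes_per_page, int fillfactor, int tuple_width)
{
    return (usable_bytes_per_page * fillfactor) / (tuple_width * 100);
}
```

With usable_bytes_per_page = 8168 (an 8kB page minus a 24-byte page header, used here only as an example value), fillfactor = 90 and tuple_width = 71, both functions return 103. For positive integers, flooring twice gives the same result as flooring once over the combined divisor, so under the assumed single-division form the two variants would not actually diverge, which fits the "no huge difference" point above.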
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company