From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Should heapam_estimate_rel_size consider fillfactor? |
Date: | 2023-06-11 12:41:27 |
Message-ID: | cf154ef9-6bac-d268-b735-67a3443debba@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
While testing some stuff, I noticed heapam_estimate_rel_size (or rather
table_block_relation_estimate_size it calls) ignores fillfactor, so that
for a table without statistics it ends up with reltuple estimate much
higher than reality. For example, with fillfactor=10 the estimate is
about 10x higher.
I ran into this while doing some tests with hash indexes, where I use
fillfactor to make the table look bigger (as if the tuples were wider),
and I ran into this:
drop table hash_test;
create table hash_test (a int, b text) with (fillfactor=10);
insert into hash_test select 1 + ((i - 1) / 10000), md5(i::text)
from generate_series(1, 1000000) s(i);
-- analyze hash_test;
create index hash_test_idx on hash_test using hash (a);
select pg_size_pretty(pg_relation_size('hash_test_idx'));
If you run it like this (without the analyze), the index will be 339MB.
With the analyze, it's 47MB.
This only happens if there are no relpages/reltuples statistics yet, in
which case table_block_relation_estimate_size estimates density from
tuple width etc.
So it seems the easiest "fix" is to do ANALYZE before creating the index
(and not after it, as I had in my scripts). But I wonder how many people
fail to realize this - it sure took me a while to realize the indexes
are too large and even longer what is causing it. I wouldn't be very
surprised if many people had bloated hash indexes after bulk loads.
So maybe we should make table_block_relation_estimate_size smarter to
also consider the fillfactor in the "no statistics" branch, per the
attached patch.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
relsize-fillfactor-fix.patch | text/x-patch | 1017 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Dunstan | 2023-06-11 14:58:41 | Re: Do we want a hashset type? |
Previous Message | Joel Jacobson | 2023-06-11 10:26:39 | Re: Do we want a hashset type? |