From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: vacuum -vs reltuples on insert only index |
Date: | 2020-11-02 20:06:17 |
Message-ID: | CAH2-Wzk+8kQ-ZoxoeOBXymStt5SaXZF8RncOB7jP0sZ31WJ8Aw@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Nov 2, 2020 at 10:03 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Attached is my proposed fix, which takes this approach. I will commit
> this on Wednesday or Thursday, barring any objections.
Just to be clear: I am not proposing that we set
'IndexBulkDeleteResult.estimated_count = true' here, even though
there is a certain sense in which we now accept an unreliable figure
in Postgres 13. This is not what GIN does. That approach doesn't seem
appropriate for nbtree + deduplication, which is much closer to nbtree
in Postgres 12 than to GIN. I believe that the final num_index_tuples
value (generated during cleanup-only nbtree VACUUM) is in general
sufficiently reliable to not be treated as an estimate by vacuumlazy.c
-- the pg_class entry for the index should still be updated in
update_index_statistics().
In other words, I think that the remaining posting-list related
inaccuracies are comparable to the existing inaccuracies caused by
concurrent page splits during nbtree vacuuming (I describe the problem
right next to an old comment about that issue, in fact). What we have
in both cases is an artifact of how the data is physically represented
and the difficulty it causes us during vacuuming, in certain cases.
There are known error bars. That's why we shouldn't treat
num_index_tuples as merely an estimate.
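
To make the distinction concrete, here is a minimal sketch in C of the
gating behaviour discussed above. It is not the actual vacuumlazy.c code:
the struct and function names prefixed with "Sketch"/"sketch_" are
hypothetical simplifications, and it assumes that a result flagged as an
estimate is skipped when the index's pg_class entry is updated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for IndexBulkDeleteResult. */
typedef struct SketchBulkDeleteResult
{
    double num_index_tuples;   /* tuple count reported by the index AM */
    bool   estimated_count;    /* true => count is only an estimate */
} SketchBulkDeleteResult;

/* Hypothetical stand-in for the index's pg_class row. */
typedef struct SketchPgClassEntry
{
    double reltuples;
} SketchPgClassEntry;

/*
 * Assumed behaviour: copy the count into the pg_class stand-in only when
 * the index AM returned a result and did not flag it as an estimate.
 * Returns true if the entry was updated.
 */
bool
sketch_update_index_statistics(SketchPgClassEntry *entry,
                               const SketchBulkDeleteResult *stats)
{
    assert(entry != NULL);

    if (stats == NULL || stats->estimated_count)
        return false;          /* leave the existing reltuples alone */

    entry->reltuples = stats->num_index_tuples;
    return true;
}
```

Under this sketch, leaving estimated_count unset is what lets the
cleanup-only num_index_tuples value reach reltuples; flagging it as an
estimate would leave the old value in place.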
--
Peter Geoghegan