From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
---|---|
To: | Konstantin Knizhnik <knizhnik(at)garret(dot)ru> |
Cc: | Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: New criteria for autovacuum |
Date: | 2025-04-06 19:19:34 |
Message-ID: | CAAKRu_Z100bhmwWP+WM+rutfL0KMzWuOTPCUf=Lqx8QEKgKiZw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Apr 5, 2025 at 2:02 AM Konstantin Knizhnik <knizhnik(at)garret(dot)ru> wrote:
>
> A more targeted solution to your specific problem would be to update
> the visibility map on access. Then, the first time you have to fetch
> that heap page, you could mark it all-visible (assuming the long
> running transaction has ended) and then the next index-only scan
> wouldn't have to do the same heap fetch. It doesn't add any overhead
> in the case that the long running transaction has not ended, unlike
> trying to trigger another autovacuum.
>
> I really considered this alternative when thinking about the solution of the problem. It is more consistent with hint bit approach.
> I declined it in favor of this solution because of the following reasons:
>
> 1. Index-only scan holds read-only lock on heap page. In order to update it, we need to upgrade this lock to exclusive.
> 2. We need to check visibility for all elements on the page (actually do something like `heap_page_is_all_visible`) but if there is large number elements at the page it can be quite expensive. And I afraid that it can slowdown speed of index-only scan. Yes, only in "slow case" - when it has to access heap to perform visibility check. But still it may be not acceptable. Also it is not clear how to mark page as already checked. Otherwise we will have to repeat this check for all tids referring this page.
> 3. `heap_page_is_all_visible` is local to lazyvaccum.c. So to use it in index-only scan we either have to make it global, either cut&paste it's code. Just removing "static" is not possible, because it is using local `LVRelState`, so some refactoring is needed in any case.
> 4. We need to wal-log VM page and heap pages in case of setting all-visible bit. It is quite expensive operation. Doing it inside index-only scan can significantly increase time of select. Certainly Postgres is not a real-time DBMS. But still it is better to provide some predictable query execution time. This is why I think that it is better to do such workt in background (in vaccum).
I wasn't thinking about adding a new VM setting functionality to index
only scan in particular. heapam_index_fetch_tuple() already calls
heap_page_prune_opt() which will do pruning under certain conditions.
I was thinking that we start updating the VM after pruning in the
on-access case too (not just when pruning is invoked by vacuum).
If you look at the callers of heap_page_prune_opt(), it includes
bitmap heap scan and also heap_prepare_pagescan() which is invoked as
part of sequential scans and other operations.
- Melanie
From | Date | Subject | |
---|---|---|---|
Next Message | Chapman Flack | 2025-04-06 19:39:43 | Re: FmgrInfo allocation patterns (and PL handling as staged programming) |
Previous Message | Melanie Plageman | 2025-04-06 18:51:10 | Re: Parallel heap vacuum |