Re: Confine vacuum skip logic to lazy_scan_skip

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: vignesh C <vignesh21(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Confine vacuum skip logic to lazy_scan_skip
Date: 2024-03-08 13:49:47
Message-ID: 3df2b582-dc1c-46b6-99b6-38eddd1b2784@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 08/03/2024 02:46, Melanie Plageman wrote:
> On Wed, Mar 06, 2024 at 10:00:23PM -0500, Melanie Plageman wrote:
>>> I feel heap_vac_scan_get_next_block() function could use some love. Maybe
>>> just some rewording of the comments, or maybe some other refactoring; not
>>> sure. But I'm pretty happy with the function signature and how it's called.
>
> I've cleaned up the comments on heap_vac_scan_next_block() in the first
> couple patches (not so much in the streaming read user). Let me know if
> it addresses your feelings or if I should look for other things I could
> change.

Thanks, that is better. I think I now finally understand how the
function works, and now I can see some more issues and refactoring
opportunities :-).

Looking at current lazy_scan_skip() code in 'master', one thing now
caught my eye (and it's the same with your patches):

> *next_unskippable_allvis = true;
> while (next_unskippable_block < rel_pages)
> {
> uint8 mapbits = visibilitymap_get_status(vacrel->rel,
> next_unskippable_block,
> vmbuffer);
>
> if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
> {
> Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
> *next_unskippable_allvis = false;
> break;
> }
>
> /*
> * Caller must scan the last page to determine whether it has tuples
> * (caller must have the opportunity to set vacrel->nonempty_pages).
> * This rule avoids having lazy_truncate_heap() take access-exclusive
> * lock on rel to attempt a truncation that fails anyway, just because
> * there are tuples on the last page (it is likely that there will be
> * tuples on other nearby pages as well, but those can be skipped).
> *
> * Implement this by always treating the last block as unsafe to skip.
> */
> if (next_unskippable_block == rel_pages - 1)
> break;
>
> /* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
> if (!vacrel->skipwithvm)
> {
> /* Caller shouldn't rely on all_visible_according_to_vm */
> *next_unskippable_allvis = false;
> break;
> }
>
> /*
> * Aggressive VACUUM caller can't skip pages just because they are
> * all-visible. They may still skip all-frozen pages, which can't
> * contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
> */
> if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
> {
> if (vacrel->aggressive)
> break;
>
> /*
> * All-visible block is safe to skip in non-aggressive case. But
> * remember that the final range contains such a block for later.
> */
> skipsallvis = true;
> }
>
> /* XXX: is it OK to remove this? */
> vacuum_delay_point();
> next_unskippable_block++;
> nskippable_blocks++;
> }

Firstly, it seems silly to check DISABLE_PAGE_SKIPPING within the loop.
When DISABLE_PAGE_SKIPPING is set, we always return the next block and
set *next_unskippable_allvis = false regardless of the visibility map,
so why bother checking the visibility map at all?

Except at the very last block of the relation! If you look carefully,
at the last block we do return *next_unskippable_allvis = true, if the
VM says so, even if DISABLE_PAGE_SKIPPING is set. I think that's wrong.
Surely the intention was to pretend that none of the VM bits were set if
DISABLE_PAGE_SKIPPING is used, also for the last block.

This was changed in commit 980ae17310:

> @@ -1311,7 +1327,11 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
>
> /* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
> if (!vacrel->skipwithvm)
> + {
> + /* Caller shouldn't rely on all_visible_according_to_vm */
> + *next_unskippable_allvis = false;
> break;
> + }

Before that, *next_unskippable_allvis was set correctly according to the
VM, even when DISABLE_PAGE_SKIPPING was used. It's not clear to me why
that was changed. And I think setting it to 'true' would be a more
failsafe value than 'false'. When *next_unskippable_allvis is set to
true, the caller cannot rely on it because a concurrent modification
could immediately clear the VM bit. But because VACUUM is the only
process that sets VM bits, if it's set to false, the caller can assume
that it's still not set later on.

One consequence of that is that with DISABLE_PAGE_SKIPPING,
lazy_scan_heap() dirties all pages, even if there are no changes. The
attached test script demonstrates that.

ISTM we should revert the above hunk, and backpatch it to v16. I'm a
little wary because I don't understand why that change was made in the
first place, though. I think it was just an ill-advised attempt at
tidying up the code as part of the larger commit, but I'm not sure.
Peter, do you remember?

I wonder if we should give up trying to set all_visible_according_to_vm
correctly when we decide what to skip, and always do
"all_visible_according_to_vm = visibilitymap_get_status(...)" in
lazy_scan_prune(). It would be more expensive, but maybe it doesn't
matter in practice. It would get rid of this tricky bookkeeping in
heap_vac_scan_next_block().

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment Content-Type Size
vactest.sql application/sql 844 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2024-03-08 14:00:14 Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Previous Message Daniel Gustafsson 2024-03-08 13:41:52 Re: Support a wildcard in backtrace_functions