From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> |
Subject: | Re: Confine vacuum skip logic to lazy_scan_skip |
Date: | 2024-01-04 20:23:09 |
Message-ID: | 20240104202309.77h5llrambkl5a3m@awork3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:
> Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages
> Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
> nskippable_blocks
I think these may lead to worse code - the compiler has to reload
vacrel->rel_pages/next_unskippable_block for every loop iteration, because it
can't guarantee that they're not changed within one of the external functions
called in the loop body.
> Subject: [PATCH v2 3/6] Add lazy_scan_skip unskippable state
>
> Future commits will remove all skipping logic from lazy_scan_heap() and
> confine it to lazy_scan_skip(). To make those commits more clear, first
> introduce the struct, VacSkipState, which will maintain the variables
> needed to skip ranges less than SKIP_PAGES_THRESHOLD.
Why not add this to LVRelState, possibly as a struct embedded within it?
> From 335faad5948b2bec3b83c2db809bb9161d373dcb Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
> Date: Sat, 30 Dec 2023 16:59:27 -0500
> Subject: [PATCH v2 4/6] Confine vacuum skip logic to lazy_scan_skip
>
> In preparation for vacuum to use the streaming read interface (and eventually
> AIO), refactor vacuum's logic for skipping blocks such that it is entirely
> confined to lazy_scan_skip(). This turns lazy_scan_skip() and the VacSkipState
> it uses into an iterator which yields blocks to lazy_scan_heap(). Such a
> structure is conducive to an async interface.
And it's cleaner - I find the current code extremely hard to reason about.
> By always calling lazy_scan_skip() -- instead of only when we have reached the
> next unskippable block, we no longer need the skipping_current_range variable.
> lazy_scan_heap() no longer needs to manage the skipped range -- checking if we
> reached the end in order to then call lazy_scan_skip(). And lazy_scan_skip()
> can derive the visibility status of a block from whether or not we are in a
> skippable range -- that is, whether or not the next_block is equal to the next
> unskippable block.
I wonder if it should be renamed as part of this - the name is somewhat
confusing now (and perhaps before)? lazy_scan_get_next_block() or such?
> + while (true)
> {
> Buffer buf;
> Page page;
> - bool all_visible_according_to_vm;
> LVPagePruneState prunestate;
>
> - if (blkno == vacskip.next_unskippable_block)
> - {
> - /*
> - * Can't skip this page safely. Must scan the page. But
> - * determine the next skippable range after the page first.
> - */
> - all_visible_according_to_vm = vacskip.next_unskippable_allvis;
> - lazy_scan_skip(vacrel, &vacskip, blkno + 1);
> -
> - Assert(vacskip.next_unskippable_block >= blkno + 1);
> - }
> - else
> - {
> - /* Last page always scanned (may need to set nonempty_pages) */
> - Assert(blkno < rel_pages - 1);
> -
> - if (vacskip.skipping_current_range)
> - continue;
> + blkno = lazy_scan_skip(vacrel, &vacskip, blkno + 1,
> + &all_visible_according_to_vm);
>
> - /* Current range is too small to skip -- just scan the page */
> - all_visible_according_to_vm = true;
> - }
> + if (blkno == InvalidBlockNumber)
> + break;
>
> vacrel->scanned_pages++;
>
I don't like that we still do determination about the next block outside of
lazy_scan_skip() and have duplicated exit conditions between lazy_scan_skip()
and lazy_scan_heap().
I'd probably change the interface to something like
while (lazy_scan_get_next_block(vacrel, &blkno))
{
...
}
> From b6603e35147c4bbe3337280222e6243524b0110e Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
> Date: Sun, 31 Dec 2023 09:47:18 -0500
> Subject: [PATCH v2 5/6] VacSkipState saves reference to LVRelState
>
> The streaming read interface can only give pgsr_next callbacks access to
> two pieces of private data. As such, move a reference to the LVRelState
> into the VacSkipState.
>
> This is a separate commit (as opposed to as part of the commit
> introducing VacSkipState) because it is required for using the streaming
> read interface but not a natural change on its own. VacSkipState is per
> block and the LVRelState is referenced for the whole relation vacuum.
I'd do it the other way round, i.e. either embed VacSkipState ino LVRelState
or point to it from VacSkipState.
LVRelState is already tied to the iteration state, so I don't think there's a
reason not to do so.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2024-01-04 21:02:07 | Re: PL/pgSQL: Incomplete item Allow handling of %TYPE arrays, e.g. tab.col%TYPE[] |
Previous Message | Andres Freund | 2024-01-04 20:03:31 | Re: Emit fewer vacuum records by reaping removable tuples during pruning |