From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Peter Geoghegan <pg(at)bowt(dot)ie>, Michael Paquier <michael(at)paquier(dot)xyz>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: Emit fewer vacuum records by reaping removable tuples during pruning |
Date: | 2024-01-17 20:11:55 |
Message-ID: | CAAKRu_bsHeC5i=M2ACYJAUXZQ_uCigqApDzyWQpTxGL6XrkzyA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Jan 17, 2024 at 12:17 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Tue, Jan 16, 2024 at 6:07 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> > Attached v8 patch set is rebased over this.
>
> Reviewing 0001, consider the case where a table has no indexes.
> Pre-patch, PageTruncateLinePointerArray() will happen when
> lazy_vacuum_heap_page() is called; post-patch, it will not occur.
> That's a loss. Should we care? On the plus side, visibility map
> repair, if required, can now take place. That's a gain.
I thought that this wasn't an issue because heap_page_prune_execute()
calls PageRepairFragmentation() which similarly modifies pd_lower and
sets the hint bit about free line pointers.
> I'm otherwise satisfied with this patch now, except for some extremely
> minor nitpicking:
>
> + * For now, pass mark_unused_now == false regardless of whether or
>
> Personally, i would write "pass mark_unused_now as false" here,
> because we're not testing equality. Or else "pass mark_unused_now =
> false". This is not an equality test.
>
> + * During pruning, the caller may have passed mark_unused_now == true,
>
> Again here, but also, this is referring to the name of a parameter to
> a function whose name is not given. I think this this should either
> talk fully in terms of code ("When heap_page_prune was called,
> mark_unused_now may have been passed as true, which allows would-be
> LP_DEAD items to be made LP_USED instead.") or fully in English
> ("During pruning, we may have decided to mark would-be dead items as
> unused.").
Fixed both of the above issues as suggested in attached v9
> > In 0004, I've taken the approach you seem to favor and combined the FSM
> > updates from the prune and no prune cases in lazy_scan_heap() instead
> > of pushing the FSM updates into lazy_scan_prune() and
> > lazy_scan_noprune().
>
> I do like that approach.
>
> I think do_prune can be declared one scope inward, in the per-block
> for loop. I would probably initialize it to true so I could drop the
> stubby else block:
>
> + /* If we do get a cleanup lock, we will definitely prune */
> + else
> + do_prune = true;
>
> And then I'd probably write the call as if (!lazy_scan_noprune())
> doprune = true.
>
> If I wanted to stick with not initializing do_prune, I'd put the
> comment inside as /* We got the cleanup lock, so we will definitely
> prune */ and add braces since that makes it a two-line block.
If we don't unconditionally set do_prune using the result of
lazy_scan_noprune(), then we cannot leave do_prune uninitialized. I
preferred having it uninitialized, as it didn't imply that doing
pruning was the default. Also, it made it simpler to have that comment
about always pruning when we get the cleanup lock.
However, with the changes you mentioned below (got_cleanup_lock), this
discussion is moot.
> > I did not guard against calling lazy_scan_new_or_empty() a second time
> > in the case that lazy_scan_noprune() was called. I can do this. I
> > mentioned upthread I found it confusing for lazy_scan_new_or_empty()
> > to be guarded by if (do_prune). The overhead of calling it wouldn't be
> > terribly high. I can change that based on your opinion of what is
> > better.
>
> To me, the relevant question here isn't reader confusion, because that
> can be cleared up with a comment explaining why we do or do not test
> do_prune. Indeed, I'd say it's a textbook example of when you should
> comment a test: when it might look wrong to the reader but you have a
> good reason for doing it.
>
> But that brings me to what I think the real question is here: do we,
> uh, have a good reason for doing it? At first blush the structure
> looks a bit odd here. lazy_scan_new_or_empty() is intended to handle
> PageIsNew() and PageIsEmpty() cases, lazy_scan_noprune() the cases
> where we process the page without a cleanup lock, and
> lazy_scan_prune() the regular case. So you might think that
> lazy_scan_new_or_empty() would always be applied *first*, that we
> would then conditionally apply lazy_scan_noprune(), and finally
> conditionally apply lazy_scan_prune(). Like this:
>
> bool got_cleanup_lock = ConditionalLockBufferForCleanup(buf);
> if (lazy_scan_new_or_empty())
> continue;
> if (!got_cleanup_lock && !lazy_scan_noprune())
> {
> LockBuffer(buf, BUFFER_LOCK_UNLOCK);
> LockBufferForCleanup(buf);
> got_cleanup_lock = true;
> }
> if (got_cleanup_lock)
> lazy_scan_prune();
>
> The current organization of the code seems to imply that we don't need
> to worry about the PageIsNew() and PageIsEmpty() cases before calling
> lazy_scan_noprune(), and at the moment I'm not understanding why that
> should be the case. I wonder if this is a bug, or if I'm just
> confused.
Yes, I also spent some time thinking about this. In master, we do
always call lazy_scan_new_or_empty() before calling
lazy_scan_noprune(). The code is aiming to ensure we call
lazy_scan_new_or_empty() once before calling either of
lazy_scan_noprune() or lazy_scan_prune(). I think it didn't call
lazy_scan_new_or_empty() unconditionally first because of the
different lock types expected. But, your structure has solved that.
I've used a version of your example code above in attached v9. It is
much nicer.
> > The big open question/functional change is when we consider vacuuming
> > the FSM. Previously, we only considered vacuuming the FSM in the no
> > indexes, has dead items case. After combining the FSM updates from
> > lazy_scan_prune()'s no indexes/has lpdead items case,
> > lazy_scan_prune()'s regular case, and lazy_scan_noprune(), all of them
> > consider vacuuming the FSM. I could guard against this, but I wasn't
> > sure why we would want to only vacuum the FSM in the no indexes/has
> > dead items case.
>
> I don't get it. Conceptually, I agree that we don't want to be
> inconsistent here without some good reason. One of the big advantages
> of unifying different code paths is that you avoid being accidentally
> inconsistent. If different things are different it shows up as a test
> in the code instead of just having different code paths in different
> places that may or may not match.
>
> But I thought the whole point of
> 45d395cd75ffc5b4c824467140127a5d11696d4c was to iron out the existing
> inconsistencies so that we could unify this code without having to
> change any more behavior. In particular, I thought we just made it
> consistently adhere to the principle Peter articulated, where we
> record free space when we're touching the page for presumptively the
> last time. I gather that you think it's still not consistent, but I
> don't understand what the remaining inconsistency is. Can you explain
> further?
Ah, I realize I was not clear. I am now talking about inconsistencies
in vacuuming the FSM itself. FreeSpaceMapVacuumRange(). Not updating
the freespace map during the course of vacuuming the heap relation.
- Melanie
Attachment | Content-Type | Size |
---|---|---|
v9-0002-Move-VM-update-code-to-lazy_scan_prune.patch | text/x-patch | 10.7 KB |
v9-0001-Set-would-be-dead-items-LP_UNUSED-while-pruning.patch | text/x-patch | 14.1 KB |
v9-0003-Inline-LVPagePruneState-members-in-lazy_scan_prun.patch | text/x-patch | 13.4 KB |
v9-0004-Combine-FSM-updates-for-prune-and-no-prune-cases.patch | text/x-patch | 5.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2024-01-17 20:26:10 | Re: initdb's -c option behaves wrong way? |
Previous Message | Robert Haas | 2024-01-17 20:10:13 | Re: cleanup patches for incremental backup |