Re: Eagerly scan all-visible pages to amortize aggressive vacuum

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Eagerly scan all-visible pages to amortize aggressive vacuum
Date: 2024-12-18 21:29:50
Message-ID: CA+TgmoY1tYff-1CEn8kYt5FsOrynTbtr=UZw=7mTC1Hv1HpeBQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 17, 2024 at 5:54 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> Makes sense. I've attempted to clarify as you suggest in v3.

I would just commit 0001. There's nothing to be gained by waiting around.

I don't care about 0002 much. It doesn't seem particularly better or
worse. I would suggest that if you want to do this, maybe don't split
up this:

vacrel->blkno = InvalidBlockNumber;
if (BufferIsValid(vmbuffer))
    ReleaseBuffer(vmbuffer);

You could put the moved code before or after that instead of in the
middle of it, unless there's some reason it makes more sense in the
middle.

I don't care about 0003 much, either. It looks correct to me, but
whether it's better is subjective.

I dislike 0004 as presented. Reducing the scope of blkno is a false
economy; the function doesn't do much of anything interesting after
the main loop. And why do we want to report rel_pages to the progress
reporting machinery instead of blkno? I'd rather that the code report
where it actually ended up (blkno) rather than reporting where it
thinks it must have ended up (rel_pages).
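
That is, I'd rather see the end-of-scan report look roughly like this (a
sketch from memory of lazy_scan_heap, not taken from the patch):

    /* report how far the scan actually got */
    pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);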

I agree that the assert that 0005 replaces is confusing, but replacing
8 lines of code with 37 is not an improvement in my book.

I like 0006. The phase I-II-III terminology doesn't appear anywhere in
the code (at least, not to my knowledge) but we speak of it that way
so frequently in email and in conversation that mentioning it
someplace that a future developer might find seems advisable. I think
it would be worth mentioning in the second paragraph of the comment
that we may resume phase I after completing phase III if we entered
phase II due to lack of memory. I'm not sure what the right way to
phrase this is -- in a sense, this possibility means that these aren't
really phases at all, however much we may speak of them that way. But
as I say, we talk about them that way all the time, so it's good that
this is finally adding comments to match.
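
Maybe something along these lines, purely as a rough wording suggestion:

 * If the space for dead_items TIDs fills up during phase I, we suspend the
 * heap scan, run a round of phase II and phase III to vacuum the indexes
 * and heap using the TIDs collected so far, and then resume phase I where
 * we left off.  A single VACUUM can therefore cycle through these phases
 * more than once.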

Regarding 0007:

- Why do we measure failure as an integer total but success as a
percentage? Maybe the thought here is that failures are counted per
region but successes are counted globally across the relation, but
then I wonder if that's what we want, and if so, whether we need some
comments to explain why we want that rather than doing both per
region.
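
To make the alternative concrete, counting both per region might look
roughly like this (all of these names are invented for illustration, not
taken from the patch):

    /* illustrative only: reset both caps at each region boundary */
    if (blkno == next_eager_region_start)
    {
        eager_fails_remaining = max_eager_fails_per_region;
        eager_successes_remaining = max_eager_successes_per_region;
    }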

- I do not like the nested struct definition for eager_pages. Either
define the struct separately, give it a name, and then refer to that
name here, or just define the elements individually, and maybe give
them a common prefix or something. I don't think what you have here is
a style we normally use. As you can see, pgindent is not particularly
fond of it. It seems particularly weird given that you didn't even put
all the stuff related to eagerness inside of it.
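
To illustrate (type and field names here are invented, not from the patch),
I mean either something like this:

    typedef struct LVEagerScanCounters
    {
        BlockNumber remaining_fails;
        BlockNumber remaining_successes;
    } LVEagerScanCounters;

    /* ... and then "LVEagerScanCounters eager_pages;" inside LVRelState */

or else plain members with a common prefix:

    BlockNumber eager_pages_remaining_fails;
    BlockNumber eager_pages_remaining_successes;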

- It's a little weird that you mostly treat eager vacuums as an
intermediate position between aggressive and normal, but then we
decide whether we're eager in a different place than we decide whether
we're aggressive.

- On a related note, won't most vacuums be VAC_EAGER rather than
VAC_NORMAL, thus making VAC_NORMAL a misnomer? I wonder if it's better
to merge eager and normal together, and just treat the cases where we
judge eager scanning not worthwhile as a mild special case of an
otherwise-normal vacuum. It's important that a user can tell what
happened from the log message, but it doesn't seem absolutely
necessary for the start-of-vacuum message to make that clear. It could
just be that %u all-visible scanned => 0 all-visible scanned means we
didn't end up being at all eager.

- Right after the comment "Produce distinct output for the corner case
all the same, just in case," you've changed it so that there's no
longer an if-statement. While you haven't really falsified the comment
in any deep sense, some rewording is probably in order. Also, the
first sentence of this comment overtly contradicts itself without
really explaining anything useful. That's a preexisting problem for
which you're not responsible, but maybe it makes sense to fix it while
you're adjusting this comment.

- Perhaps it's unfair of me, but I think I would have hoped for an
acknowledgement in this commit message, considering that I believe I
was the one who suggested breaking the relation into logical regions,
trying to freeze a percentage of the all-visible-but-not-all-frozen
pages, and capping both successes and failures. Starting the region at
a random offset wasn't my idea, and the specific thresholds you've
chosen were not the ones I suggested, and limiting successes globally
rather than per-region was not what I think I had in mind, and I don't
mean to take away from everything you've done to move this forward,
but unless I am misunderstanding the situation, this particular patch
(0007) is basically an implementation of an algorithm that, as far as
I know, I was the first to propose.

- Which of course also means that I tend to like the idea, but also
that I'm biased. Still, what is the reasonable alternative to this
patch? I have a hard time believing that it's "do nothing". As far as
I know, pretty much everyone agrees that the large burst of work that
tends to occur when an aggressive vacuum kicks off is extremely
problematic, particularly but not only the first time it kicks off on
a table or group of tables that may have accumulated many
all-visible-but-not-all-frozen pages. This patch might conceivably err
in moving that work too aggressively to earlier vacuums, thus making
those vacuums too expensive or wasting work if the pages end up being
modified again; or it might conceivably err in moving work
insufficiently aggressively to earlier vacuums, leaving too much
remaining work when the aggressive vacuum finally happens. In fact, I
would be surprised if it doesn't have those problems in some
situations. But it could have moderately severe cases of those
problems and still be quite a bit better than what we have now
overall.

- So, I think we should avoid fine-tuning this and try to understand if
there's anything wrong with the big picture. Can we imagine a user who
is systematically unhappy with this change? Like, not a user who got
hosed once because of some bad luck, but someone who is constantly and
predictably getting hosed? They'd need to be accumulating lots of
all-visible-not-all-frozen pages in relatively large tables on a
regular basis, but then I guess they'd need to either drop the tables
before the aggressive vacuum happened, or they'd need to render the
pages not-all-visible again before the aggressive vacuum would have
happened. I'm not entirely sure how possible that is. My best guess is
that it's possible if the timing of your autovacuum runs is
particularly poor -- you just need to load some data, vacuum it early
enough that the XIDs are still young so it doesn't get frozen, then
have the eager vacuum hit it, and then update it. That doesn't seem
impossible, but I'm not sure if it's possible to make it happen often
enough and severely enough to really cause a problem. And I'm not sure
we're going to find that out before this is committed, so that brings
me to:

- If this patch is going to go into the source tree in PostgreSQL 18,
it should do that soon. This is a bad patch to be committing in late
March. I propose that it should be shipped by end of January or wait a
year. Even if it goes into the tree by the end of January, there is a
chance we'll find problems that we can't fix quickly and have to
revert the whole thing. But that becomes much more likely if it is
committed close to feature freeze. There is a certain kind of patch -
and I think it includes this patch - where you just can't ever really
know what the problems people are going to hit IRL are until it's
committed and people try things and hit problems, perhaps not even as
part of an intent to test this patch, but just testing in general. If
we don't find those problems with enough time remaining to react to
them in a thoughtful way, that's where full reverts start to happen.

--
Robert Haas
EDB: http://www.enterprisedb.com
