Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Date: 2022-01-19 14:56:32
Message-ID: CA+TgmoY40ycB53kcroL-k=VgZu+5fG5uNH4MzQZDWNeG0vvTWw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 18, 2022 at 1:48 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> That's what I was reacting to -- it had nothing to do with any
> concerns you may have had. I wasn't thinking about long-idle cursors
> at all. I was defending myself, because I was put in a position where
> I had to defend myself.

I don't think I've said anything on this thread that is an attack on
you. I am getting pretty frustrated with the tenor of the discussion,
though. I feel like you're the one attacking me, and I don't like it.

> I still don't understand why you think that my idea (not yet
> implemented) of making FreezeLimit into a backstop (making it
> autovacuum_freeze_max_age/2 or something) and relying on the new
> "early freezing" criteria for almost everything is going to make the
> situation worse in this scenario with long idle cursors. It's intended
> to make it better.

I just don't understand how I haven't been able to convey my concern
here by now. I've already written multiple emails about it. If none of
them were clear enough for you to understand, I'm not sure how saying
the same thing over again can help. When I say I've already written
about this, I'm referring specifically to the following:

- https://postgr.es/m/CA+TgmobKJm9BsZR3ETeb6MJdLKWxKK5ZXx0XhLf-W9kUgvOcNA@mail.gmail.com
in the second-to-last paragraph, beginning with "I don't really see"
- https://www.postgresql.org/message-id/CA%2BTgmoaGoZ2wX6T4sj0eL5YAOQKW3tS8ViMuN%2BtcqWJqFPKFaA%40mail.gmail.com
in the second paragraph beginning with "Because waiting on a lock"
- https://www.postgresql.org/message-id/CA%2BTgmoZYri_LUp4od_aea%3DA8RtjC%2B-Z1YmTc7ABzTf%2BtRD2Opw%40mail.gmail.com
in the paragraph beginning with "That's the part I'm not sure I
believe."

For all of that, I'm not even convinced that you're wrong. I just
think you might be wrong. I don't really know. It seems to me however
that you're understating the value of waiting, which I've tried to
explain in the above places. Waiting does have the very real
disadvantage of starving the rest of the system of the work that
autovacuum worker would have been doing, and that's why I think you
might be right. However, there are cases where waiting, and only
waiting, gets the job done. If you're not willing to admit that those
cases exist, or you think they don't matter, then we disagree. If you
admit that they exist and think they matter but believe that there's
some reason why increasing FreezeLimit can't cause any damage, then
either (a) you have a good reason for that belief which I have thus
far been unable to understand or (b) you're more optimistic about the
proposed change than can be entirely justified.

> My sense is that there are very few apps that are hopelessly incapable
> of advancing relfrozenxid from day one. I find it much easier to
> believe that users that had this experience got away with it for a
> very long time, until their luck ran out, somehow. I would like to
> minimize the chance of that ever happening, to the extent that that's
> possible within the confines of the basic heapam/vacuumlazy.c
> invariants.

I agree with the idea that most people are OK at the beginning and
then at some point their luck runs out and catastrophe strikes. I
think there are a couple of different kinds of catastrophe that can
happen. For instance, somebody could park a cursor in the middle of a
table someplace and leave it there until the snow melts. Or, somebody
could take a table lock and sit on it forever. Or, there could be a
corrupted page in the table that causes VACUUM to error out every time
it's reached. In the second and third situations, it doesn't matter a
bit what we do with FreezeLimit, but in the first one it might. If the
user is going to leave that cursor sitting there literally forever,
the best solution is to raise FreezeLimit as high as we possibly can.
The system is bound to shut down due to wraparound at some point, but
we at least might as well vacuum other stuff while we're waiting for
that to happen. On the other hand if that user is going to close that
cursor after 10 minutes and open a new one in the same place 10
seconds later, the best thing to do is to keep FreezeLimit as low as
possible, because the first time we wait for the pin to be released
we're guaranteed to advance relfrozenxid within 10 minutes, whereas if
we don't do that we may keep missing the brief windows in which no
cursor is held for a very long time. But we have absolutely no way of
knowing which of those things is going to happen on any particular
system, or of estimating which one is more common in general.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-01-19 14:57:39 Re: New developer papercut - Makefile references INSTALL
Previous Message Tomas Vondra 2022-01-19 14:47:41 Re: [ERROR] Copy from CSV fails due to memory error.