From: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: New GUC autovacuum_max_threshold ?
Date: 2024-04-25 20:57:08
Message-ID: a7403191-a4c2-4629-aebe-0106f8f0fc04@dalibo.com
Lists: pgsql-hackers
On 25/04/2024 at 21:21, Nathan Bossart wrote:
> On Thu, Apr 25, 2024 at 02:33:05PM -0400, Robert Haas wrote:
>> What does surprise me is that Frédéric suggests a default value of
>> 500,000. If half a million tuples (proposed default) is 20% of your
>> table (default value of autovacuum_vacuum_scale_factor) then your
>> table has 2.5 million tuples. Unless those tuples are very wide, that
>> table isn't even 1GB in size. I'm not aware that there's any problem
>> at all with the current formula on a table of that size, or even ten
>> times that size. I think you need to have tables that are hundreds of
>> gigabytes in size at least before this starts to become a serious
>> problem. Looking at this from another angle, in existing releases, the
>> maximum usable amount of autovacuum_work_mem is 1GB, which means we
>> can store one-sixth of a billion dead TIDs, or roughly 166 million.
>> And that limit has been a source of occasional complaints for years.
>> So we have those complaints on the one hand, suggesting that 166
>> million is not enough, and then we have this proposal, saying that
>> more than half a million is too much. That's really strange; my
>> initial hunch is that the value should be 100-500x higher than what
>> Frédéric proposed.
>
> Agreed, the default should probably be on the order of 100-200M minimum.
>
I'm not sure... 500000 comes from the table given in a previous message.
It may not be large enough. But vacuum also updates the visibility map,
and a few hundred thousand heap fetches can already hurt the performance
of an index-only scan, even if most of the blocks are read from cache.
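To make the trade-off concrete, here is a rough sketch of how the cap
would change the vacuum trigger point per table, assuming the default
values of autovacuum_vacuum_threshold (50) and
autovacuum_vacuum_scale_factor (0.2); the LEAST() cap is only my
reading of the proposal, nothing is committed:

-- Hypothetical sketch: vacuum trigger threshold per table under the
-- current default formula (50 + 0.2 * reltuples) versus the same
-- formula capped at the proposed default of 500000.
SELECT c.relname,
       c.reltuples::bigint                           AS estimated_rows,
       (50 + 0.2 * c.reltuples)::bigint              AS current_threshold,
       LEAST(50 + 0.2 * c.reltuples, 500000)::bigint AS capped_threshold
FROM pg_class c
WHERE c.relkind = 'r'
ORDER BY c.reltuples DESC;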
> The original proposal also seems to introduce one parameter that would
> affect all three of autovacuum_vacuum_threshold,
> autovacuum_vacuum_insert_threshold, and autovacuum_analyze_threshold. Is
> that okay? Or do we need to introduce a "limit" GUC for each? I guess the
> question is whether we anticipate any need to have different values for
> these limits, which might be unlikely.
>
I agree with you; it seems unlikely. This is also an answer to Melanie's
question about the name of the GUC: I deliberately left out the second
"vacuum" because I thought we only needed one parameter for these three
thresholds.
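For what it's worth, the configuration I have in mind would look
something like this (a sketch only; the GUC name and default value come
from this thread and are not in core):

# Hypothetical postgresql.conf excerpt: a single cap shared by the
# vacuum, insert-vacuum, and analyze scale-factor thresholds.
autovacuum_max_threshold = 500000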
Now I have just read Robert's new message, and I understand his point.
But is there a real problem with triggering an analyze after every
500000 (or more) modifications to the table anyway?
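As a quick sanity check of that question, this is the kind of query I
would use to see which tables a 500000-modification analyze cap would
actually affect, assuming the default autovacuum_analyze_threshold (50)
and autovacuum_analyze_scale_factor (0.1); the cap itself is
hypothetical:

-- Hypothetical sketch: tables whose current analyze trigger point
-- (50 + 0.1 * reltuples) already exceeds the proposed 500000 cap,
-- i.e. the only tables where the cap would make analyze run more often.
SELECT s.relname,
       s.n_mod_since_analyze,
       (50 + 0.1 * c.reltuples)::bigint AS current_analyze_threshold
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE 50 + 0.1 * c.reltuples > 500000
ORDER BY current_analyze_threshold DESC;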