Re: Diagnosing outdated auto analyze dates...

From: Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
To: Keith Fiske <keith(dot)fiske(at)crunchydata(dot)com>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Diagnosing outdated auto analyze dates...
Date: 2021-04-21 21:37:54
Message-ID: CAOC+FBXo_KfHh+AzNqOj9DxkAUubW8q4FrNmn2qZEA_w4MtSpg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Thanks, really useful detail. One last question: if I go the route of
setting auto-vac scale factor to 0 and auto-vac threshold to 10,000 to
ensure vacuuming every 10,000 dead tuples, is there still value in
adjusting the auto-analyze threshold/scale separately? Does the analyze
process do anything other than determine whether it should be vacuumed?

On Wed, Apr 21, 2021 at 2:33 PM Keith Fiske <keith(dot)fiske(at)crunchydata(dot)com>
wrote:

>
>
> On Wed, Apr 21, 2021 at 5:05 PM Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
> wrote:
>
>> Thank you... so to make it explicit, this table has 15169372 rows,
>> autovacuum_analyze_scale_factor is .05, and autovacuum_analyze_threshold is
>> 10, so: 15169372 * .05 + 10, which is 758478.60 -- so that means
>> auto-analyze will only fire when that number of rows has been deleted or
>> updated?
>>
>>
>
> Correct. One thing that can be troublesome with larger tables and the
> scale factor is that, as time goes on, analyze and vacuum run less and less
> often unless your write rate also increases along with the size. So that's
> why larger tables can sometimes benefit from individual tuning and setting
> scale factor to zero.
>
> To figure out what to set the threshold to, I typically make a simple
> script to do a csv copy output with psql for the pg_stat_user_tables entry
> for the table(s) involved on like an hourly interval for 2-3 weeks. Then
> pull that into a spreadsheet or import it back into a table. Then figure
> out the hourly & daily differences for n_tup_upd + n_tup_del. Then try and
> set the thresholds to have autovac run at least once or twice a day and
> analyze slightly more often.
>
> Note that prior to PG 13, insert only tables would never cause autovac to
> kick in. 13 now has autovacuum_vacuum_insert_scale_factor
> & autovacuum_vacuum_insert_threshold.
>
>
>
>> On Wed, Apr 21, 2021 at 1:48 PM Keith Fiske <keith(dot)fiske(at)crunchydata(dot)com>
>> wrote:
>>
>>>
>>>
>>> On Wed, Apr 21, 2021 at 4:06 PM Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
>>> wrote:
>>>
>>>> A last_autoanalyze date in pg_stat_all_tables for a table we update
>>>> every day is 3/26, another is 3/13 -- I feel strongly somehow these tables
>>>> should be auto-analyzed and auto-vacuumed more often, and am wondering
>>>> where to look first.
>>>>
>>>> My autovacuum_analyze_threshold is 10, my autovacuum_vacuum_threshold
>>>> is 25.
>>>>
>>>> How can I better understand why these tables are not more frequently
>>>> being auto-analyzed and vacuumed?
>>>>
>>>> Thank you.
>>>>
>>>> --
>>>> Wells Oliver
>>>> wells(dot)oliver(at)gmail(dot)com <wellsoliver(at)gmail(dot)com>
>>>>
>>>
>>> The threshold settings are only part of the calculation that determines
>>> when autovacuum kicks in. There's also autovacuum_vacuum_scale_factor
>>> and autovacuum_analyze_scale_factor. The scale factors say that when this
>>> fraction of the table has been changed (updates/deletes) then run the
>>> relevant operation (vacuum or analyze).
>>>
>>> The threshold values are added on top of the scale factor values. So,
>>> what determines whether your analyze kicks in is actually
>>>
>>> (autovacuum_analyze_scale_factor * total row
>>> count) + autovacuum_analyze_threshold
>>>
>>> The reason it's like this is to avoid autovacuum kicking in excessively
>>> on very small tables when a small % changes. Ex. 10 rows on a 100 row table
>>> constantly kicking in autovacuum when scale factor is set to .10. So add on
>>> 100-500 for the threshold so autovacuum kicks in when a good number of rows
>>> actually changes.
>>>
>>> If you want a specific number of row changes to cause autovacuum to kick
>>> in, you can set the scale factor to zero for that specific table. I only
>>> really find that necessary myself on rather large tables though (10s of
>>> millions of rows).
>>>
>>> --
>>> Keith Fiske
>>> Senior Database Engineer
>>> Crunchy Data - http://crunchydata.com
>>>
>>
>>
>> --
>> Wells Oliver
>> wells(dot)oliver(at)gmail(dot)com <wellsoliver(at)gmail(dot)com>
>>
>
>
> --
> Keith Fiske
> Senior Database Engineer
> Crunchy Data - http://crunchydata.com
>

--
Wells Oliver
wells(dot)oliver(at)gmail(dot)com <wellsoliver(at)gmail(dot)com>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Keith Fiske 2021-04-21 21:51:01 Re: Diagnosing outdated auto analyze dates...
Previous Message Keith Fiske 2021-04-21 21:32:48 Re: Diagnosing outdated auto analyze dates...