Re: Diagnosing outdated auto analyze dates...

From: Keith Fiske <keith(dot)fiske(at)crunchydata(dot)com>
To: Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Diagnosing outdated auto analyze dates...
Date: 2021-04-21 21:51:01
Message-ID: CAODZiv7sdwbSbgoEt923+MkYG_wsorW1G2oN3Q9nTqLLL0XuoQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Wed, Apr 21, 2021 at 5:38 PM Wells Oliver <wells(dot)oliver(at)gmail(dot)com> wrote:

> Thanks, really useful detail. One last question: if I go the route of
> setting auto-vac scale factor to 0 and auto-vac threshold to 10,000 to
> ensure vacuuming every 10,000 dead tuples, is there still value in
> adjusting the auto-analyze threshold/scale separately? Does the analyze
> process do anything other than determine whether it should be vacuumed?
>

Just to clarify, I only recommend setting scale factor to zero on an
individual table basis and actually figuring out appropriate values for the
threshold to cause those individual tables to be scanned/vacuumed on a
regular basis to keep up with your write rate. The cluster-wide values for
scale factor and threshold that are set in postgresql.conf are generally ok
for most databases, but I personally tune scale factor down a little bit
from the default and threshold up a little bit from the default. But you
definitely want a scale-factor set for the cluster-wide value to ensure any
write tables get analyze/vacuumed at some point based on a fraction of them
changing, not just a specific row number change.

Vacuum is not just for dead-tuple cleanup. It keeps many other important
statistics updated (freespace map, visibility map, etc) and, most
importantly, prevents transaction id exhaustion.

https://www.postgresql.org/docs/13/routine-vacuuming.html

> On Wed, Apr 21, 2021 at 2:33 PM Keith Fiske <keith(dot)fiske(at)crunchydata(dot)com>
> wrote:
>
>>
>>
>> On Wed, Apr 21, 2021 at 5:05 PM Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
>> wrote:
>>
>>> Thank you... so to make it explicit, this table has 15169372 rows,
>>> autovacuum_analyze_scale_factor is .05, and autovacuum_analyze_threshold is
>>> 10, so: 15169372 * .05 + 10, which is 758478.60 -- so that means
>>> auto-analyze will only fire when that number of rows has been deleted or
>>> updated?
>>>
>>>
>>
>> Correct. One thing that can be troublesome with larger tables and the
>> scale factor is that, as time goes on, analyze and vacuum run less and less
>> often unless your write rate also increases along with the size. So that's
>> why larger tables can sometimes benefit from individual tuning and setting
>> scale factor to zero.
>>
>> To figure out what to set the threshold to, I typically make a simple
>> script to do a csv copy output with psql for the pg_stat_user_tables entry
>> for the table(s) involved on like an hourly interval for 2-3 weeks. Then
>> pull that into a spreadsheet or import it back into a table. Then figure
>> out the hourly & daily differences for n_tup_upd + n_tup_del. Then try and
>> set the thresholds to have autovac run at least once or twice a day and
>> analyze slightly more often.
>>
>> Note that prior to PG 13, insert only tables would never cause autovac to
>> kick in. 13 now has autovacuum_vacuum_insert_scale_factor
>> & autovacuum_vacuum_insert_threshold.
>>
>>
>>
>>> On Wed, Apr 21, 2021 at 1:48 PM Keith Fiske <keith(dot)fiske(at)crunchydata(dot)com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Apr 21, 2021 at 4:06 PM Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
>>>> wrote:
>>>>
>>>>> A last_autoanalyze date in pg_stat_all_tables for a table we update
>>>>> every day is 3/26, another is 3/13 -- I feel strongly somehow these tables
>>>>> should be auto-analyzed and auto-vacuumed more often, and am wondering
>>>>> where to look first.
>>>>>
>>>>> My autovacuum_analyze_threshold is 10, my autovacuum_vacuum_threshold
>>>>> is 25.
>>>>>
>>>>> How can I better understand why these tables are not more frequently
>>>>> being auto-analyzed and vacuumed?
>>>>>
>>>>> Thank you.
>>>>>
>>>>> --
>>>>> Wells Oliver
>>>>> wells(dot)oliver(at)gmail(dot)com <wellsoliver(at)gmail(dot)com>
>>>>>
>>>>
>>>> The threshold settings are only part of the calculation that determines
>>>> when autovacuum kicks in. There's also autovacuum_vacuum_scale_factor
>>>> and autovacuum_analyze_scale_factor. The scale factors say that when this
>>>> fraction of the table has been changed (updates/deletes) then run the
>>>> relevant operation (vacuum or analyze).
>>>>
>>>> The threshold values are added on top of the scale factor values. So,
>>>> what determines whether your analyze kicks in is actually
>>>>
>>>> (autovacuum_analyze_scale_factor * total row
>>>> count) + autovacuum_analyze_threshold
>>>>
>>>> The reason it's like this is to avoid autovacuum kicking in excessively
>>>> on very small tables when a small % changes. Ex. 10 rows on a 100 row table
>>>> constantly kicking in autovacuum when scale factor is set to .10. So add on
>>>> 100-500 for the threshold so autovacuum kicks in when a good number of rows
>>>> actually changes.
>>>>
>>>> If you want a specific number of row changes to cause autovacuum to
>>>> kick in, you can set the scale factor to zero for that specific table. I
>>>> only really find that necessary myself on rather large tables though (10s
>>>> of millions of rows).
>>>>
>>>> --
>>>> Keith Fiske
>>>> Senior Database Engineer
>>>> Crunchy Data - http://crunchydata.com
>>>>
>>>
>>>
>>> --
>>> Wells Oliver
>>> wells(dot)oliver(at)gmail(dot)com <wellsoliver(at)gmail(dot)com>
>>>
>>
>>
>> --
>> Keith Fiske
>> Senior Database Engineer
>> Crunchy Data - http://crunchydata.com
>>
>
>
> --
> Wells Oliver
> wells(dot)oliver(at)gmail(dot)com <wellsoliver(at)gmail(dot)com>
>

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Keith Fiske 2021-04-21 21:57:43 Re: Diagnosing outdated auto analyze dates...
Previous Message Wells Oliver 2021-04-21 21:37:54 Re: Diagnosing outdated auto analyze dates...