Re: Unfortunate Nested Loop + Missing Autovacuum

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Lincoln Swaine-Moore <lswainemoore(at)gmail(dot)com>
Cc: Andrei Lepikhov <lepihov(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Unfortunate Nested Loop + Missing Autovacuum
Date: 2025-02-24 02:08:47
Message-ID: CAMkU=1womOrr1Z5obLdot4wQBhzWu_ZvfEiqen0jY93O1F6jPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sun, Feb 23, 2025 at 5:49 PM Lincoln Swaine-Moore <lswainemoore(at)gmail(dot)com>
wrote:

> Thanks for the reply! I tried the analysis on our much shorter staging
> table and it did change the plan. I haven’t tried it on the production ones
> because my understanding is that the autovacuum process is gentler with
> resource consumption and I didn’t want to gum things up in the meantime.
> But that may be false or avoidable.
>

The default setting of vacuum_cost_delay is zero, while the default setting
of autovacuum_vacuum_cost_delay is 2ms (or 20ms, depending on the software
version). So you can give a manual run the same resource consumption as an
autorun just by changing vacuum_cost_delay in that session so that it has
the same value as autovacuum_vacuum_cost_delay. Also, if you just do
ANALYZE (rather than VACUUM ANALYZE) the resource usage should be
rather modest anyway.

Another difference is that autovac will cancel itself if it detects it is
blocking something else, while a manual vac/analyze operation will not
do that. Normal operations (DML) don't block against vacuum anyway, only
things like index creation or partition maintenance do that. But if those
types of operation are frequent, then doing a manual VACUUM or ANALYZE
could indeed gum things up. Also, if those operations are frequent, it
could explain the missing autovac. If every auto attempt gets cancelled
before it completes, then it will never complete. pg_stat_user_tables
doesn't reflect cancelled vacuum or analyze so those will go missing. (You
should see mentions of cancelled autovac operations in the log file though.)

Database restarts will also interrupt vacuums. So if your database is
shutdown and restarted regularly (for cold back-ups, or just out of some
misguided belief that restarting occasionally is a good practise) and the
period between restarts is shorter than how long it would take autovac to
run, this could also explain the lack of completed autovacs. Also, if a
table qualifies for both auto vacuum and auto analyze, the vacuum is done
first. So even if auto analyze would be fast by itself, it still won't
complete if auto vacuum is slow and never gets to finish.

It is possible to override your vacuum settings on a per-table basis. So
another possible explanation for the missing autovacs is that those two
tables have been specifically configured to disable autovacuum on them and
only them. If you use psql, \dt+ (but not regular \dt) will show such
customizations. I'm sure other tools also have ways to detect this, but I
don't know what those ways are off the top of my head.

Cheers,

Jeff

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Andrei Lepikhov 2025-02-24 14:01:16 Re: Unfortunate Nested Loop + Missing Autovacuum
Previous Message Lincoln Swaine-Moore 2025-02-23 22:49:06 Re: Unfortunate Nested Loop + Missing Autovacuum