Re: autovacuum hung on simple tables

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: senor <frio_cervesa(at)hotmail(dot)com>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: autovacuum hung on simple tables
Date: 2022-11-28 07:53:15
Message-ID: 0d8aa1b7534fd01845b7e15651b42f9d949a2dff.camel@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, 2022-11-28 at 04:05 +0000, senor wrote:
> I'm a little late getting back to this but still have no solution.
>
> I had inadvertently executed a vacuum freeze on a table that pg_stat_activity
> indicated was currently being autovacuumed. The manual vacuum succeeded while
> leaving the previous autovacuum still showing active in pg_stat_activity.

No, that cannot be. VACUUM processes are mutually exclusive.
It could have been another autovacuum worker (different "pid").

> Manual vacuum always completes, never stalls, but also often does not appear
> in pg_stat_progress_vacuum unless it's a longer process.

It *does* appear in "pg_stat_progress_vacuum", but perhaps you are too slow
to catch it.

> It appears the process completes the vacuum but does not register that fact.
> relfrozenxid of the main table is what would be expected but an associated
> toast table was still very old. Cancelling all pending vacuums of the table
> and manually running vacuum completes in a few seconds and both the main
> relation and toast are updated as expected with last vacuum time updated
> in pg_stat_all_tables.

Obviously autovacuum devided to process the main table, but not the TOAST
table. That is normal. Manual VACUUM processes both, unless you specify
the option PROCESS_TOAST OFF.

> Autoacuum and autovacuum analyze both get hung.

No, they don't; not unless they are anti-wraparound autovacuum runs and you
are holding high locks on the table in a long running transaction.
They may be slow to complete, because in v11 autovacuum by default is very
slow indeed, as "autovacuum_vacuum_cost_delay" is 20ms.

> I often see the same table listed multiple times in pg_stat_activity with
> different pids and:
> state: active
> backend_type: autovacuum worker
> wait_event_type: blank
> state_change: 10-15 seconds after backend_start - about how long it takes
> to manually run vacuum on most tables.

That can be parallel workers that are used to scan indexes.

> What exactly does autovacuum rely on in the stats file? I ran strace on a
> hung autovacuum process and saw a repeated read of
> /run/postgresql/db_16384.stat (tempfs). The file is 740MB which is about
> the same as other similar installations I've reviewed. I'm lacking in overall
> experience in this though.

I don't know either, and I am too lazy to read the code on that, but I suspect
that it has no connection to your problem.

> One final oddity:
> I ran a query for oldest relfrozenxid and redirected to file. The query took
> around a minute. A few seconds after it finished, I queried for 'active' in
> pg_stat_activity and the oldest relfrozenxid query was still listed.
> A few seconds later it had cleared.

That sounds weird and is hard to believe.
Are the disk or the CPU under extreme stress?

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Srinivasa T N 2022-11-28 10:17:05 Replicating an existing (huge) database
Previous Message Thomas Kellerer 2022-11-28 06:04:18 Re: Get table catalog from pg_indexes