
Need help debugging why autovacuum seems "stuck" -- until I use superuser to vacuum freeze pg_database

From:	"McCoy, Shawn" <shamccoy(at)amazon(dot)com>
To:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Need help debugging why autovacuum seems "stuck" -- until I use superuser to vacuum freeze pg_database
Date:	2016-05-02 02:39:02
Message-ID:	A9D40BB7-CFD6-46AF-A0A1-249F04878A2A@amazon.com
Lists:	pgsql-hackers

I have been debugging a problem on a 9.3.10 Postgres database cluster with over 1200 databases. It runs with 10 workers, an increased maintenance_work_mem, and autovacuum settings tuned to run more frequently than the defaults. What I notice is that autovacuum runs for a week or so and traverses the databases as expected. During that time, age(datfrozenxid) for all 1200 databases stays close to autovacuum_freeze_max_age, as desired.

Then, suddenly, it gets "stuck": the autovacuum launcher stops launching worker processes even though databases start to age past autovacuum_freeze_max_age. If I list the databases sorted by age(datfrozenxid), connect to the oldest one, and execute a simple "vacuum freeze pg_database;", autovacuum springs back into action.
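
For readers who want to reproduce the check described above, it could be sketched roughly as follows. This is an illustrative sketch rather than the exact statements used on the affected cluster: the column aliases are made up for readability, and the pg_stat_activity filter assumes that autovacuum workers report a query text beginning with "autovacuum:".

```sql
-- List databases, oldest datfrozenxid first, next to the configured
-- threshold (aliases xid_age and freeze_max_age are illustrative).
SELECT datname,
       age(datfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age')::bigint AS freeze_max_age
FROM pg_database
ORDER BY age(datfrozenxid) DESC;

-- See whether any autovacuum workers are currently running
-- (assumes their query text starts with 'autovacuum:').
SELECT pid, datname, xact_start, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';

-- After connecting to the oldest database as a superuser,
-- the statement that clears the jam:
VACUUM FREEZE pg_database;
```

Comparing xid_age with freeze_max_age in the first result shows which databases are already past the point where the launcher would be expected to start a worker.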

It's never the same database where autovacuum seems to get "stuck". I'm attempting to gather more debugging information, but I also can't understand why simply running "vacuum freeze pg_database" breaks up the jam.

Any thoughts?

Shawn

Responses

Re: Need help debugging why autovacuum seems "stuck" -- until I use superuser to vacuum freeze pg_database at 2016-05-03 14:51:04 from Robert Haas
