Autovacuum failure with Postgres 8.3.14? (max_fsm_pages issue?)

From: Subhasis Bhattacharya <subhasis(dot)bhattacharya(at)gmail(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Autovacuum failure with Postgres 8.3.14? (max_fsm_pages issue?)
Date: 2012-11-23 11:32:37
Message-ID: CANOU=99aSuPuHw_LK7fLyaRnh_3bpGQnTDcPB6SEFCo23pg_hA@mail.gmail.com
Lists: pgsql-admin

Hi All,

Recently a Postgres database embedded within our product ran into a series
of problems at a customer site, as follows:

For about 24 hours the Postgres log file logged errors like:

2012-09-24 00:00:12 GMTLOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:13 GMTLOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:15 GMTLOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:16 GMTLOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:20 GMTLOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:21 GMTLOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:22 GMTLOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:22 GMTLOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:23 GMTLOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error
2012-09-24 00:00:24 GMTLOG: could not rename temporary statistics file
"global/pgstat.tmp" to "global/pgstat.stat": Input/output error

....
It's an extremely volatile database, with rows deleted very often.

After about 24 hours we saw this:
2012-10-01 00:19:21 GMTWARNING: relation "pg_toast.pg_toast_16509"
contains more than "max_fsm_pages" pages with useful free space
2012-10-01 00:19:21 GMTHINT: Consider using VACUUM FULL on this relation
or increasing the configuration parameter "max_fsm_pages".
2012-10-01 00:57:12 GMTWARNING: relation "pbs.job_attr" contains more than
"max_fsm_pages" pages with useful free space
2012-10-01 00:57:12 GMTHINT: Consider using VACUUM FULL on this relation
or increasing the configuration parameter "max_fsm_pages".
2012-10-01 01:09:30 GMTWARNING: relation "pg_toast.pg_toast_16509"
contains more than "max_fsm_pages" pages with useful free space
2012-10-01 01:09:30 GMTHINT: Consider using VACUUM FULL on this relation
or increasing the configuration parameter "max_fsm_pages".
2012-10-01 01:18:45 GMTERROR: canceling autovacuum task
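
To see how often each relation is flagged, the warnings above can be tallied
with a short script. This is only a sketch, assuming the log format shown in
the excerpt; the names count_fsm_warnings and SAMPLE_LOG are made up for
illustration and are not part of Postgres.

```python
import re
from collections import Counter

# Matches the warning text seen in the excerpt above; the relation name is
# the first quoted string. The pattern is an assumption based on that excerpt.
FSM_WARNING = re.compile(
    r'WARNING:\s+relation "([^"]+)"\s+contains more than\s+"max_fsm_pages"'
)


def count_fsm_warnings(log_text):
    """Return a Counter mapping relation name to number of max_fsm_pages warnings."""
    # Long log lines may be wrapped (as in this mail), so collapse all
    # whitespace into single spaces before matching.
    flattened = " ".join(log_text.split())
    return Counter(FSM_WARNING.findall(flattened))


# Hypothetical input: the three warnings copied from the excerpt above.
SAMPLE_LOG = '''
2012-10-01 00:19:21 GMTWARNING: relation "pg_toast.pg_toast_16509"
contains more than "max_fsm_pages" pages with useful free space
2012-10-01 00:57:12 GMTWARNING: relation "pbs.job_attr" contains more than
"max_fsm_pages" pages with useful free space
2012-10-01 01:09:30 GMTWARNING: relation "pg_toast.pg_toast_16509"
contains more than "max_fsm_pages" pages with useful free space
'''

if __name__ == "__main__":
    # Print relations from most to least frequently flagged.
    for relation, hits in count_fsm_warnings(SAMPLE_LOG).most_common():
        print(relation, hits)
```

Run against the full log file, the same function would show which relations
trigger the warning most often.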

So it looks like autovacuum ran into a situation it could not handle and
was asking for a manual VACUUM FULL. From then on queries ran very slowly,
since autovacuum kept starting up and cancelling itself.

My questions to the Postgres gurus and developers:

- I read that the autovacuum daemon depends on the statistics collector to
do its job properly. Could the collector's failure to update the pgstat
file for 24 hours or so have kept autovacuum from doing its job, leading
to a situation that autovacuum could no longer handle?

- In a normal situation, should we assume that autovacuum is able to do
its job without ever needing a manual VACUUM FULL?

Thanks and Regards,
Subhasis Bhattacharya
