Re: Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)

From: Craig Ringer <craig(at)2ndQuadrant(dot)com>
To: Lists <lists(at)benjamindsmith(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)gmail(dot)com>, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql <pgsql-general(at)postgresql(dot)org>
Subject: Re: Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)
Date: 2012-11-13 02:29:08
Message-ID: 50A1B074.6010206@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 11/13/2012 04:04 AM, Lists wrote:
>
> There's a wealth of how to tune PG instruction that's old and (based
> on this thread alone) often stale enough to be classified as
> disinformative. For example, nearest I can tell, the entirety of this
> page is just wrong and/or irrelevant for 9.x and up:
> http://wiki.postgresql.org/wiki/VACUUM_FULL.
Agreed, that needs fixing. I'll sort that out. That information was
important before the VACUUM FULL rewrite, but is now severely outdated.
It needs to be split into pre-9.1 and 9.1+ sections.

> In constrast to the advice to "avoid Vacuum Full" (
> http://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT)
> comes the insight that vacuum full is necessary to clean up stale data
> that is not at the "end" of the table. (See Jeff Janes 11/10/2012 email)
That part isn't quite right AFAIK.

If you keep table bloat under control, ordinary VACCUM is perfectly
sufficient. You only need/want to VACUUM FULL if you wish to truncate a
table, reducing its size by compacting rows. In an actively used
database that's usually pretty pointless, since new rows will then be
added to the end, growing the table. You might as well just keep on
re-using the space, rather than going through those compact-and-expand
cycles. You'll suffer from less file fragmentation that way and won't be
paying the costs of file system allocations.

If you have a table that you've dramatically reduced in size (say, by
deleting the vast majority of it) and you won't be adding more rows to
replace the old ones, that's when VACUUM FULL makes sense.

It's a bit like those utilities that claim to "free" or "clean" or
"de-fragment" memory. They seem good, but they're actually grossly
counter-productive, because the system then has to re-read cached data
and otherwise fight to get back to its old equilibrium. It's typically
the same for Pg: you want to aim for equilibrium, not free space that'll
just promptly get re-allocated.

If you do have a bad bloat problem, I'd set a non-default FILLFACTOR
before doing a VACUUM FULL, so you still have some free space within the
table after vacuum completes. That way you won't be immediately paying
the cost of allocating space for new rows as soon as any UPDATEs or
INSERTs come in.

>
>> non-full vacuum can
>> only free space from the "end" of the table.
> This would imply that a full analyze is a good idea, at least
> periodically (weekly/monthly/quarterly) in a database that combines
> the use of temp tables and periodic changes to persistent objects.
> Does autovacuum ever do a "full" analyze"? What about autovacuum and
> the reindex question at the end of this email?
I'm not sure there's any such thing as a full analyze. "VACUUM FULL
ANALYZE" is "Do a VACUUM FULL and an ANALYZE", not "Do a full analyze".

Autovacuum should be taking care of analyze and table statistics. If it
isn't, adjust autovacuum parameters so that it does.
>
>> These two can be killers. Long running transactions can cause
>> autovacuum processes to stall out or be autocancelled.
> "Long running transactions" - is now long? In our system it's rare to
> have a transaction (even a prepared transaction) last much longer than
> a few minutes. Is that enough time to cause problems with AutoVacuum?

Not generally, no.

PostgreSQL can't clean up rows that are still visible to a transaction.
So if your transactions are three minutes long, that's a three minute
delay before VACUUM can clean up DELETEd rows or dead rows left by
UPDATEs. Not a biggie even on a pretty high load DB.

You should generally be concerned only when transactions are open over
"user think time" or are abandoned by buggy applications - cases where
the transaction length is many minutes or hours, potentially unbounded.
Uncommitted prepared transactions are also a problem for similar reasons.

> Lastly, there's the question of reindexing before full vacuum. I've
> observed that not doing a manual reindex prior to vacuum full did not,
> in fact, free up the space, even though I've been told that reindex is
> implicit in the vacuum process. (?!)

VACUUM FULL, post-9.1, should take care of index bloat.

Pre-9.1 VACUUM FULL could make index bloat worse.

Ordinary VACUUM will not truncate indexes AFAIK, only mark free space
within them so it can be re-used. Same deal as with the table its self:
this is usually what you want.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Craig Ringer 2012-11-13 02:32:32 Re: Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)
Previous Message Carlos Henrique Reimer 2012-11-13 01:36:49 Re: Running out of memory while making a join