Re: Vacuum wait time problem

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Roger Ging <rging(at)musicreports(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Vacuum wait time problem
Date: 2009-02-13 22:13:28
Message-ID: dcc563d10902131413x3de3936fkdf4348a3ec22dbe6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Fri, Feb 13, 2009 at 2:56 PM, Roger Ging <rging(at)musicreports(dot)com> wrote:
> Scott,
>
> I can only answer a couple of the questions at the moment. I had to kill
> the vacuum full and do a regular vacuum, so I can't get the iostat and
> vmstat outputs right now. This message is the reason I was trying to run
> vacuum full:
>
> INFO: "license": found 257 removable, 20265895 nonremovable row versions in
> 1088061 pages
> DETAIL: 0 dead row versions cannot be removed yet.
> There were 18434951 unused item pointers.
> 687274 pages contain useful free space.
> 0 pages are entirely empty.
> CPU 38.15s/37.02u sec elapsed 621.19 sec.
> WARNING: relation "licensing.license" contains more than "max_fsm_pages"
> pages with useful free space
> HINT: Consider using VACUUM FULL on this relation or increasing the
> configuration parameter "max_fsm_pages".

Yeah, that's pretty bad. ~2 Million live rows and ~18 Million dead
ones is a pretty badly bloated table.

Vacuum full is one way to reclaim that lost space. You can also dump
and restore that one table, inside a drop / create restraints in a
transaction during maintenance if you can. A Vacuum full is quite
intrusive, so avoid it during normal working hours. Dumping and
reloading the whole db may be faster than either a vacuum full or a
cluster. A common trick is to do something like:

begin;
select * into ordermydata from bigbloatedtable order by some_field;
delete * from bigbloatedtable;
insert into bigbloatedtable select * from ordermydata;
commit;

This will both put your table in some order which might help, and
remove the bloat.

> A clean restore of the database to another server create a size on disk of
> about 244GB. This server was at over 400GB yesterday, and now, after
> aggressive vacuuming by hand, is down to 350GB. It had gotten so bad that
> the backup was not finished when I got in yesterday, almost 8 hours after it
> started.

Sounds like either autovacuum isn't running right, or it was turned
off and no decent vacuum schedule was set up by the dba. Or the FSM
is too small, but judging by how bloated this table is, I'd guess
vacuuming got turned off.

> The machine has been under heavy load 24/7 for a couple of months, so I have
> not been able to upgrade versions. I am taking it offline this weekend and
> will install the latest. I'll try to re-create the scenario I had going on
> yesterday over the weekend and get some io statistics.

Definitely look into the check_postgresql.pl script for nagios or
something similar to keep track of bloated tables. Every friday I log
into my production servers and fire off a few of the queries from that
script to check for bloat and make sure everything's running fine.
Another useful trick is to email yourself a copy of the last 20 or 30
lines of vacuum verbose for the whole db every so often (once a week)
and pay attention to your fsm usage.

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Scott Marlowe 2009-02-13 22:15:02 Re: Vacuum wait time problem
Previous Message Tino Schwarze 2009-02-13 22:05:48 Re: Vacuum wait time problem