Re: DB files, sizes and cleanup

From: Bill Moran <wmoran(at)potentialtech(dot)com>
To: "Gauthier, Dave" <dave(dot)gauthier(at)intel(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: DB files, sizes and cleanup
Date: 2010-12-17 17:17:07
Message-ID: 20101217121707.85a719f4.wmoran@potentialtech.com
Lists: pgsql-general

In response to "Gauthier, Dave" <dave(dot)gauthier(at)intel(dot)com>:

> Hi:
>
> I'm trying to justify disk space for a new Linux server they're going to give me for my Postgres instance. When I do a "du" of the place I installed the older instance on the system that is to be replaced, I see that the vast, vast majority of the space goes to the contents of the "base" dir. In there are a bunch of directories with integers for names (OIDs?). And some of those have millions of files inside.
>
> Is this normal? Should there be millions of files in some of these "base" directories?
> Is this indicative of some sort of problem or lack of cleanup that I should have been doing?
>
> The "du" shows that I'm using 196G (again, mostly in "base") but pg_database_size shows something like 1/4 that amount, around 50G. I'd like to know if there's something I'm supposed to be doing to cleanup old (possibly deleted) data.
>
> Also, I was running pg_size_pretty(pg_database_size('mydb')) on all the dbs. It runs very fast for most, but just hangs for two of the databases. Is this indicative of some sort of problem? (BTW, the 2 it hangs on are very much like others that it doesn't hang on, so I used those numbers to estimate the 50G)

1) Do you have autovacuum running, or do you have a regular vacuum
scheduled? Because this seems indicative of no vacuuming, or errors
in vacuuming, or significantly insufficient vacuuming.
2) Unless your databases contain close to 100G of actual data, that size
seems unreasonable.
3) pg_database_size() is probably not "hanging", it's probably just taking
a very long time to stat() millions of files.
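
To see which directories under "base" hold the millions of files, a rough
sketch like the following walks each subdirectory and stat()s every file,
which is the same kind of work that makes the size query slow. The function
name is made up for illustration, and it sums apparent file sizes, so the
totals may differ somewhat from what du reports.

```python
import os

def summarize_base(base_dir):
    """Return {subdir_name: (file_count, total_bytes)} for each
    directory directly under base_dir (one per database)."""
    summary = {}
    with os.scandir(base_dir) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue
            count = 0
            total = 0
            for root, _dirs, files in os.walk(entry.path):
                for name in files:
                    try:
                        st = os.lstat(os.path.join(root, name))
                    except OSError:
                        # File disappeared while we were walking; skip it.
                        continue
                    count += 1
                    total += st.st_size
            summary[entry.name] = (count, total)
    return summary
```

Calling summarize_base() on the "base" directory of the data directory (the
exact path depends on the installation) and sorting the result by file count
should point at the databases worth looking at first.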

Overall, I'm guessing you're not vacuuming your databases on a proper
schedule and that most of that 196G is bloat that doesn't need to be
there. When bloat gets really bad, you're generally better off dumping
the databases and restoring them, as a vacuum full might take a very,
very long time.

If you can demonstrate that the cause of this is table bloat, then I
would go through all your databases and do a vacuum full/reindex or
do a dump/restore if the problem is very bad. Once you have done that,
your du output should be more realistic and more helpful.
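
One way to look for evidence of table bloat is to compare live and dead
tuple counts per table. The sketch below assumes a PostgreSQL version whose
pg_stat_user_tables view has the n_live_tup and n_dead_tup columns (8.3 or
later); the helper names and the 20% cutoff are arbitrary choices for
illustration. The counts are estimates and only cover dead tuples that have
not been vacuumed yet, so a low number does not prove a table is compact.

```python
# Query to run in each database (for example with psql); it reads the
# pg_stat_user_tables statistics view.
DEAD_TUPLE_QUERY = """
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
"""

def dead_fraction(n_live_tup, n_dead_tup):
    """Fraction of a table's tuples that are dead (0.0 for an empty table)."""
    total = n_live_tup + n_dead_tup
    return n_dead_tup / total if total else 0.0

def worst_tables(rows, min_fraction=0.2):
    """rows: iterable of (relname, n_live_tup, n_dead_tup) tuples.
    Returns (relname, dead_fraction) pairs at or above min_fraction,
    worst first."""
    flagged = [(name, dead_fraction(live, dead)) for name, live, dead in rows]
    flagged = [item for item in flagged if item[1] >= min_fraction]
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Tables that come out near the top, and whose last_vacuum and last_autovacuum
are empty or very old, are the likely candidates for cleanup.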

Then, take some time to set up appropriate autovacuum settings so the
problem doesn't come back.
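
As a starting point for tuning, autovacuum vacuums a table once its dead
tuples exceed autovacuum_vacuum_threshold plus autovacuum_vacuum_scale_factor
times the table's row count. The sketch below just evaluates that formula;
the function name is made up, and the defaults of 50 and 0.2 are the
documented defaults for recent releases, so check them against the version
actually in use.

```python
def autovacuum_trigger(reltuples, base_threshold=50, scale_factor=0.2):
    """Number of dead tuples a table must accumulate before autovacuum
    vacuums it: base_threshold + scale_factor * reltuples."""
    return base_threshold + scale_factor * reltuples

# With the assumed defaults, a table of one million rows is only vacuumed
# after roughly 200,000 rows have been updated or deleted.
million_row_trigger = autovacuum_trigger(1_000_000)
```

For large, heavily updated tables a smaller scale factor makes vacuum run
more often, which keeps the amount of dead space between runs smaller.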

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
