Re: DB files, sizes and cleanup

From: "Gauthier, Dave" <dave(dot)gauthier(at)intel(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Bill Moran <wmoran(at)potentialtech(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: DB files, sizes and cleanup
Date: 2010-12-17 22:22:24
Message-ID: 482E80323A35A54498B8B70FF2B87980047E266B79@azsmsx504.amr.corp.intel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

max_fsm_pages = 200000
max_fsm_relations = 12000

There are 12 DBs with roughly 30 tables+indexes each.

There are apparently 2 "bad" DBs. Both identical in terms of data models (clones with different data). I've pg_dummped one of them to a file, dropped the DB (took a long time as millions of files were deleted) and recreated it. It now has 186 files.

ls -1 | wc took a while for the other bad one but eventually came up with exactly 7,949,911 files, so yes, millions. The other one had millions too before I dropped it. Something is clearly wrong. But, since the DB recreate worked for the other one, I'll do the same thing to fix this one too.

What I will need to know then is how to prevent this in the future. It's very odd because the worst of the 2 bad DBs was a sister DB to one that's no problem at all. Here's the picture...

I have a DB, call it "foo", that gets loaded with a ton of data at night. The users query the thing readonly all day. At midnight, an empty DB called "foo_standby", which is identical to "foo" in terms of data model is reloaded from scratch. It takes hours. But when it's done, I do a few rename databases to swap "foo" with "foo_standby" (really just a name swap). "foo_standby" serves as a live backup of yesterday's data. Come the next midnight, I truncate all the tables and start the process all over again.

I say all this because "foo" is the DB with 8 million files in it but "foo_standby" has 186 files. Looks like one of these things is getting vacuumed fine while the other is carrying baggage.

I can't remember, but perhaps one of these 2 is a carry-over from an earlier version of PG (8.1 maybe, or maybe even 7.something). Maybe it had, and still has the millions of files and the vacuum isn't getting to them?

Anyway, your advise on what to set in postgres.conf to make sure this is working would be greatly appreciated.

Thanks for the interest and advise !

-----Original Message-----
From: Merlin Moncure [mailto:mmoncure(at)gmail(dot)com]
Sent: Friday, December 17, 2010 4:19 PM
To: Gauthier, Dave
Cc: Bill Moran; pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] DB files, sizes and cleanup

On Fri, Dec 17, 2010 at 12:31 PM, Gauthier, Dave
<dave(dot)gauthier(at)intel(dot)com> wrote:
> When I restart the DB, it reports... "LOG: autovacuum launcher started".
>
> "ps aux | grep postgres" yields this...
>
> dfg_suse> ps aux | grep postgres
> pgdbadm 22656  0.0  0.0  21296  2616 pts/7    S+   Dec16   0:00
> /usr/intel/pkgs/postgresql/8.3.4/bin/psql -h fcadsql3.fc.intel.com hsxreuse
> pgdbadm  9135  0.0  0.0  50000  5924 pts/10   S    12:22   0:00
> /nfs/hd/itools/em64t_linux26/pkgs/postgresql/8.3.4/bin/postgres -D
> /app/PG/v83
> pgdbadm  9146  0.0  0.0  50000  1360 ?        Ss   12:22   0:00 postgres:
> writer
> process
> pgdbadm  9147  0.0  0.0  50000  1156 ?        Ss   12:22   0:00 postgres:
> wal writer
> process
> pgdbadm  9148  0.0  0.0  50000  1316 ?        Ss   12:22   0:00 postgres:
> autovacuum launcher
> process
> pgdbadm  9149  0.0  0.0  18904  1308 ?        Ss   12:22   0:00 postgres:
> stats collector
> process
> pgdbadm  9354  0.0  0.0   2896   760 pts/9    S+   12:27   0:00 grep
> postgres
>
>
> TSo I assu,e it's running?
>
> This is PG v 8.3.4 on linux.
>
>
>
> -----Original Message-----
> From: Bill Moran [mailto:wmoran(at)potentialtech(dot)com]
> Sent: Friday, December 17, 2010 12:17 PM
> To: Gauthier, Dave
> Cc: pgsql-general(at)postgresql(dot)org
> Subject: Re: [GENERAL] DB files, sizes and cleanup
>
> In response to "Gauthier, Dave" <dave(dot)gauthier(at)intel(dot)com>:
>
>> Hi:
>>
>> I'm trying to justify disk space for a new linux server they're going to
>> give me for my Postgres instance.  When I do a "du" of the place I installed
>> the older instance on the system that is to be replaced, I see that the
>> vast, vast majorityof the space goes to the contents of the "base" dir.  In
>> there are a bunch of files with integers for names (iod's ?).  And some of
>> those have millions of files inside.
>>
>> Is this normal?  Should there be millions of files in some of these "base"
>> directories?
>> Is this indicative of some sort of problem or lack of cleanup that I
>> should have been doing?
>>
>> The "du" shows that I'm using 196G (again, mostly in "base") but
>> pg_database_size shows something like 1/4 that amount, around 50G.  I'd like
>> to know if there's something I'm supposed to be doing to cleanup old
>> (possibly deleted) data.
>>
>> Also, I was running pg_size_pretty(pg_database_size('mydb')) on all the
>> dbs.  It runs very fast for most, but just hangs for two of the databases.
>> Is this indicative of some sort of problem?  (BTW, the 2 it hangs on are
>> very much like others that it doesn't hang on, so I used those numbers to
>> estimate the 50G)
>
> 1) Do you have autovacuum running, or do you have a regular vacuum
>    scheduled?  Because this seems indicative of no vacuuming, or errors
>    in vacuuming, or significantly insufficient vacuuming.
> 2) Unless your databases contain close to 100G of actual data, that size
>    seems unreasonable.
> 3) pg_database_size() is probably not "hanging", it's probably just taking
>    a very long time to stat() millions of files.
>
> Overall, I'm guessing you're not vacuuming your databases on a proper
> schedule and that most of that 196G is bloat that doesn't need to be
> there.  When bloat gets really bad, you're generally better off dumping
> the datbases and restoring them, as a vacuum full might take a very,
> very long time.
>
> If you can demonstrate that the cause of this is table bloat, then I
> would go through all your databases and do a vacuum full/reindex or
> do a dump/restore if the problem is very bad.  Once you have done that,
> your du output should be more realistic and more helpful.
>
> Then, take some time to set up appropriate autovacuum settings so the
> problem doesn't come back.

Check your logs for warnings about the free space map. what are
max_fsm_pages and max_fsm_relations set to? how many tables and
indexes do you have approximately? do you truly have 'millions' of
files?

go into base folder and do:
find | wc -l

merlin

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2010-12-17 22:43:21 Re: DB files, sizes and cleanup
Previous Message Merlin Moncure 2010-12-17 21:18:35 Re: DB files, sizes and cleanup