From: | "Gauthier, Dave" <dave(dot)gauthier(at)intel(dot)com> |
---|---|
To: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
Cc: | Bill Moran <wmoran(at)potentialtech(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: DB files, sizes and cleanup |
Date: | 2010-12-17 22:22:24 |
Message-ID: | 482E80323A35A54498B8B70FF2B87980047E266B79@azsmsx504.amr.corp.intel.com |
Lists: | pgsql-general |
max_fsm_pages = 200000
max_fsm_relations = 12000
There are 12 DBs with roughly 30 tables+indexes each.
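As a rough sanity check of those numbers, here is a back-of-the-envelope sketch. It assumes the default 8 kB PostgreSQL block size (an assumption, not something confirmed for this install) and uses the approximate table and index counts given above.

```python
# Settings quoted above from the server configuration.
max_fsm_pages = 200_000
max_fsm_relations = 12_000

# Rough inventory from this message.
databases = 12
relations_per_db = 30  # tables + indexes, approximate

BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size in bytes

# Relations the free space map would need to track if the inventory is accurate.
expected_relations = databases * relations_per_db

# Upper bound on the amount of table/index pages the map can remember free space for.
trackable_bytes = max_fsm_pages * BLOCK_SIZE

print(f"expected relations: {expected_relations} (max_fsm_relations = {max_fsm_relations})")
print(f"pages tracked cover about {trackable_bytes / 1024**3:.2f} GiB")
```

If the real relation count is anywhere near the number of files on disk rather than the expected count, the configured limits would be far too small, which is why the log warnings about the free space map are worth checking.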
There are apparently 2 "bad" DBs, both identical in terms of data model (clones with different data). I've pg_dumped one of them to a file, dropped the DB (which took a long time, as millions of files were deleted) and recreated it. It now has 186 files.
ls -1 | wc took a while for the other bad one but eventually came up with exactly 7,949,911 files, so yes, millions. The other one had millions too before I dropped it. Something is clearly wrong. But, since the DB recreate worked for the other one, I'll do the same thing to fix this one too.
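For repeating this count across every database directory at once, a minimal sketch follows. It assumes the data directory is /app/PG/v83 (taken from the -D argument in the ps output quoted below) and that each subdirectory of base is named after a database OID; the function name is made up for illustration.

```python
import os

def count_files_per_dir(base_dir):
    """Return {subdirectory name: number of regular files directly inside it}."""
    counts = {}
    with os.scandir(base_dir) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue
            # scandir streams entries, so millions of files are never held in memory.
            with os.scandir(entry.path) as files:
                counts[entry.name] = sum(
                    1 for f in files if f.is_file(follow_symlinks=False)
                )
    return counts

# Assumption: data directory from the postmaster's -D flag, plus "base".
BASE_DIR = "/app/PG/v83/base"

if os.path.isdir(BASE_DIR):
    by_size = sorted(count_files_per_dir(BASE_DIR).items(), key=lambda kv: kv[1], reverse=True)
    for name, n in by_size:
        print(f"{n:>12}  {name}")
```

The directory names can be matched to database names with "SELECT oid, datname FROM pg_database;".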
What I will need to know then is how to prevent this in the future. It's very odd because the worst of the 2 bad DBs was a sister DB to one that's no problem at all. Here's the picture...
I have a DB, call it "foo", that gets loaded with a ton of data at night. The users query the thing read-only all day. At midnight, an empty DB called "foo_standby", which is identical to "foo" in terms of data model, is reloaded from scratch. It takes hours. But when it's done, I do a few database renames to swap "foo" with "foo_standby" (really just a name swap). "foo_standby" then serves as a live backup of yesterday's data. Come the next midnight, I truncate all its tables and start the process all over again.
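The name swap can be sketched as three renames through a temporary name. This is only an illustration of the idea, not the actual script in use here: the helper and the temporary name foo_swap_tmp are hypothetical. ALTER DATABASE ... RENAME TO has to be issued from a session connected to a different database, and it fails while other sessions are connected to the database being renamed.

```python
import re

# Conservative check so names can be emitted without quoting.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")

def swap_statements(live, standby, tmp):
    """Build the SQL that swaps the names of two databases via a temporary name."""
    for name in (live, standby, tmp):
        if not _IDENT.match(name):
            raise ValueError(f"unsupported database name: {name!r}")
    return [
        f"ALTER DATABASE {live} RENAME TO {tmp};",
        f"ALTER DATABASE {standby} RENAME TO {live};",
        f"ALTER DATABASE {tmp} RENAME TO {standby};",
    ]

# "foo_swap_tmp" is a hypothetical placeholder name.
for stmt in swap_statements("foo", "foo_standby", "foo_swap_tmp"):
    print(stmt)
```

After such a swap the on-disk directories keep their OIDs, so the directory that held millions of files follows whichever name currently points at it.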
I say all this because "foo" is the DB with 8 million files in it but "foo_standby" has 186 files. Looks like one of these things is getting vacuumed fine while the other is carrying baggage.
I can't remember, but perhaps one of these 2 is a carry-over from an earlier version of PG (8.1 maybe, or maybe even 7.something). Maybe it had, and still has, the millions of files, and vacuum isn't getting to them?
Anyway, your advice on what to set in postgresql.conf to make sure this is working would be greatly appreciated.
Thanks for the interest and advice!
-----Original Message-----
From: Merlin Moncure [mailto:mmoncure(at)gmail(dot)com]
Sent: Friday, December 17, 2010 4:19 PM
To: Gauthier, Dave
Cc: Bill Moran; pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] DB files, sizes and cleanup
On Fri, Dec 17, 2010 at 12:31 PM, Gauthier, Dave
<dave(dot)gauthier(at)intel(dot)com> wrote:
> When I restart the DB, it reports... "LOG: autovacuum launcher started".
>
> "ps aux | grep postgres" yields this...
>
> dfg_suse> ps aux | grep postgres
> pgdbadm 22656 0.0 0.0 21296 2616 pts/7 S+ Dec16 0:00 /usr/intel/pkgs/postgresql/8.3.4/bin/psql -h fcadsql3.fc.intel.com hsxreuse
> pgdbadm 9135 0.0 0.0 50000 5924 pts/10 S 12:22 0:00 /nfs/hd/itools/em64t_linux26/pkgs/postgresql/8.3.4/bin/postgres -D /app/PG/v83
> pgdbadm 9146 0.0 0.0 50000 1360 ? Ss 12:22 0:00 postgres: writer process
> pgdbadm 9147 0.0 0.0 50000 1156 ? Ss 12:22 0:00 postgres: wal writer process
> pgdbadm 9148 0.0 0.0 50000 1316 ? Ss 12:22 0:00 postgres: autovacuum launcher process
> pgdbadm 9149 0.0 0.0 18904 1308 ? Ss 12:22 0:00 postgres: stats collector process
> pgdbadm 9354 0.0 0.0 2896 760 pts/9 S+ 12:27 0:00 grep postgres
>
>
> So I assume it's running?
>
> This is PG v 8.3.4 on linux.
>
>
>
> -----Original Message-----
> From: Bill Moran [mailto:wmoran(at)potentialtech(dot)com]
> Sent: Friday, December 17, 2010 12:17 PM
> To: Gauthier, Dave
> Cc: pgsql-general(at)postgresql(dot)org
> Subject: Re: [GENERAL] DB files, sizes and cleanup
>
> In response to "Gauthier, Dave" <dave(dot)gauthier(at)intel(dot)com>:
>
>> Hi:
>>
>> I'm trying to justify disk space for a new linux server they're going to
>> give me for my Postgres instance. When I do a "du" of the place I installed
>> the older instance on the system that is to be replaced, I see that the
>> vast, vast majority of the space goes to the contents of the "base" dir. In
>> there are a bunch of directories with integers for names (OIDs?). And some of
>> those have millions of files inside.
>>
>> Is this normal? Should there be millions of files in some of these "base"
>> directories?
>> Is this indicative of some sort of problem or lack of cleanup that I
>> should have been doing?
>>
>> The "du" shows that I'm using 196G (again, mostly in "base") but
>> pg_database_size shows something like 1/4 that amount, around 50G. I'd like
>> to know if there's something I'm supposed to be doing to cleanup old
>> (possibly deleted) data.
>>
>> Also, I was running pg_size_pretty(pg_database_size('mydb')) on all the
>> dbs. It runs very fast for most, but just hangs for two of the databases.
>> Is this indicative of some sort of problem? (BTW, the 2 it hangs on are
>> very much like others that it doesn't hang on, so I used those numbers to
>> estimate the 50G)
>
> 1) Do you have autovacuum running, or do you have a regular vacuum
> scheduled? Because this seems indicative of no vacuuming, or errors
> in vacuuming, or significantly insufficient vacuuming.
> 2) Unless your databases contain close to 100G of actual data, that size
> seems unreasonable.
> 3) pg_database_size() is probably not "hanging", it's probably just taking
> a very long time to stat() millions of files.
>
> Overall, I'm guessing you're not vacuuming your databases on a proper
> schedule and that most of that 196G is bloat that doesn't need to be
> there. When bloat gets really bad, you're generally better off dumping
> the databases and restoring them, as a vacuum full might take a very,
> very long time.
>
> If you can demonstrate that the cause of this is table bloat, then I
> would go through all your databases and do a vacuum full/reindex or
> do a dump/restore if the problem is very bad. Once you have done that,
> your du output should be more realistic and more helpful.
>
> Then, take some time to set up appropriate autovacuum settings so the
> problem doesn't come back.
Check your logs for warnings about the free space map. what are
max_fsm_pages and max_fsm_relations set to? how many tables and
indexes do you have approximately? do you truly have 'millions' of
files?
go into base folder and do:
find | wc -l
merlin