From: | Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> |
---|---|
To: | karsten vennemann <karsten(at)terragis(dot)net> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: dump of 700 GB database |
Date: | 2010-02-17 22:59:31 |
Message-ID: | dcc563d11002171459jc337bb0tcfc7168935a8a89e@mail.gmail.com |
Lists: | pgsql-general |
On Wed, Feb 17, 2010 at 3:44 PM, karsten vennemann <karsten(at)terragis(dot)net> wrote:
>
> >>> vacuum should clean out the dead tuples, then cluster on any large tables that are bloated will sort them out without needing too much temporary space.
>
> Yes, OK, I am running a vacuum full on a large table (150GB) and will then cluster the spatial data by zip code. I understand that should get rid of any dead records and reclaim hard disk space. The system I'm running it on is an Ubuntu Jaunty machine with 1.7 GB RAM, PostgreSQL 8.3.8.
If you're going to cluster anyway, a vacuum full beforehand is wasted
effort, unless you're running out of disk space. Cluster needs as
much space for the destination table as the original table occupies,
minus dead rows.
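
A minimal sketch of a free-space check before running cluster, assuming a POSIX shell and `df`. The function name `has_room`, the database name, the table name, and the data directory path in the usage comment are placeholders, not values taken from this thread. Using the table's current total size as the requirement is a conservative upper bound, since the rewritten copy omits dead rows.

```shell
#!/bin/sh
# has_room NEEDED_BYTES DIR  (hypothetical helper)
# Succeeds (exit 0) when the filesystem holding DIR reports at least
# NEEDED_BYTES available, according to POSIX `df -P` in 1024-byte blocks.
has_room() {
    needed_bytes=$1
    dir=$2
    # Line 2 of `df -Pk` output, column 4, is the available space in KiB.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    # Treat a failed or unparsable df as "not enough room".
    [ -n "$avail_kb" ] || return 1
    [ "$((avail_kb * 1024))" -ge "$needed_bytes" ]
}

# Hypothetical usage (database, table, and path are placeholders):
#   size=$(psql -U postgres -d test -At \
#          -c "SELECT pg_total_relation_size('tablename')")
#   has_room "$size" /path/to/data/directory && echo "enough room for cluster"
```
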
> I was hesitant to do any of this (vacuum, cluster, or dump and restore) because it might run days or weeks (hopefully not). Here are some of my PostgreSQL.conf settings in case this is not optimal and someone has a hint...
Note that cluster on a randomly ordered large table can be
prohibitively slow, and it might be better to schedule a short
downtime to do the following (pseudo code)
alter table tablename rename to old_tablename;
create table tablename (like old_tablename);
insert into tablename select * from old_tablename
  order by clustered_col1, clustered_col2;
(creating and moving over FK references as needed.)
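
The rename-and-reload steps above can be sketched as a small shell helper that only prints the SQL, so it can be reviewed before anything is run. The function name `rewrite_sorted_sql`, the transaction wrapper, and the `INCLUDING DEFAULTS INCLUDING CONSTRAINTS` options are assumptions added for illustration; indexes and foreign keys are deliberately left as manual steps.

```shell
#!/bin/sh
# rewrite_sorted_sql TABLE "COL1, COL2"  (hypothetical helper)
# Prints SQL that rebuilds TABLE in sorted order. Nothing is executed here.
rewrite_sorted_sql() {
    table=$1
    order_cols=$2
    cat <<EOF
BEGIN;
ALTER TABLE ${table} RENAME TO old_${table};
CREATE TABLE ${table} (LIKE old_${table} INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
INSERT INTO ${table} SELECT * FROM old_${table} ORDER BY ${order_cols};
COMMIT;
-- Still to do by hand: recreate indexes and foreign keys, ANALYZE ${table},
-- and drop old_${table} once the new copy has been verified.
EOF
}

# Hypothetical usage (database name is a placeholder):
#   rewrite_sorted_sql tablename "clustered_col1, clustered_col2" \
#       | psql -U postgres -d test
```

Printing the script rather than executing it keeps the downtime window under your control: the output can be saved, edited to add the index and FK statements, and run when the application is stopped.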
> shared_buffers=160MB, effective_cache_size=1GB, maintenance_work_mem=500MB, wal_buffers=16MB, checkpoint_segments=100
What's work_mem set to?
> Also, I just set up a new server (a mirror of the other one I need to clean out) specifically for the purpose of running a database dump, with enough storage space (2TB), so that is no issue right now.
> I really need to find out what is wrong with my procedure for dumping the whole database, as I have never yet succeeded in dumping and restoring such a big db...
> That will not be the last time I will have to do something similar.
> Here is what I tried ("test" database is 350GB in size)
>
> 1. pg_dump -U postgres -Fc test > /ebsclean/testdb.dump
> This gave me a dump of about 4GB in size (too small, even if it's compressed?) after running 5 hours (not bad, I thought). But when I tried to restore it using pg_restore to another database (in a different tablespace), I got an error like "not a valid archive file" or something like that.
> So I was wondering if 4GB is a problem in Ubuntu OS ?
Which Ubuntu, 64 or 32 bit? Have you got either a file system or a
set of pg tools limited to a 4GB file size? 64-bit Ubuntu can
definitely handle files larger than 4GB. I'm pretty sure files over
4GB have been OK for quite some time now.
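
A rough way to test the file system side of that question, assuming a POSIX shell with `dd` and `ls`: write a single byte just below the 5 GiB mark of a probe file and check the resulting size. The function name `large_file_ok` and the probe file name are made up for illustration. The file is sparse on most file systems, so it should take almost no real disk space.

```shell
#!/bin/sh
# large_file_ok DIR  (hypothetical helper)
# Succeeds (exit 0) when a file larger than 4 GiB can be created in DIR.
large_file_ok() {
    dir=$1
    probe="$dir/largefile_probe.$$"
    # Seek to offset 5 GiB - 1 and write one byte, giving a 5 GiB file.
    if dd if=/dev/zero of="$probe" bs=1 count=1 seek=5368709119 2>/dev/null; then
        # Column 5 of `ls -ln` is the file size in bytes.
        size=$(ls -ln "$probe" | awk '{ print $5 }')
    else
        size=0
    fi
    rm -f "$probe"
    [ "$size" = "5368709120" ]
}

# Hypothetical usage, with the dump directory from the quoted message:
#   large_file_ok /ebsclean && echo "file system handles files over 4GB"
#   getconf LONG_BIT                          # 32 or 64 on Linux
#   pg_restore -l /ebsclean/testdb.dump | head   # list the archive contents
```

If `pg_restore -l` cannot read the table of contents, the archive itself is damaged or truncated, which points at the dump step rather than the restore.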