From: | Soeren Gerlach <soeren(at)all-about-shift(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Slow dump with pg_dump/pg_restore ? How to improve ? |
Date: | 2004-06-30 22:08:46 |
Message-ID: | 200407010008.46744.soeren@all-about-shift.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
> > So...the dump in the above format needs some 14 minutes, the restore 10
> > minutes. This seems to be very slow as it means something like 100K/sec
> > for dumping and restoring. The drive is cappable of 40 Meg/seconds, so
> > thats not the bottleneck ,-) Anyhow postmaster and pg_dump seem to max
> > out the CPU cycles as it's running at nearly 100% while dumping and
> > restoring.
>
> What datatypes have you got in the large tables? Also, what character
> set encoding are you using?
datestyle = 'ISO,European' ?
> The only reason I can think of for dump to be that slow is if conversion
> of the data to text is a big time sink. This would involve the
> datatype's own output routine plus possibly a character set conversion.
> You should at least make sure that no character set conversion needs to
> happen (offhand I think this would only be an issue if pg_dump is
> invoked with PGCLIENTENCODING set in its environment).
This are the schemas of the two big tables mentioned:
-------------------------------------------------------------------------
CREATE TABLE public.agent_action_history
(
aah_tag_id bigserial NOT NULL,
aah_action_type int2 NOT NULL DEFAULT 0,
aah_quantity float4 NOT NULL DEFAULT 0,
aah_price float4 NOT NULL DEFAULT 0,
aah_sim_flag bool NOT NULL DEFAULT false,
aah_timestamp timestamp NOT NULL,
aah_action_pl float4 NOT NULL DEFAULT 0
) WITHOUT OIDS;
CREATE TABLE public.tick_history
(
tkh_id serial NOT NULL,
tkh_cdt_id int8 NOT NULL,
tkh_price float4 NOT NULL,
tkh_price_type int2 NOT NULL,
tkh_volume float4 NOT NULL,
tkh_system_time timestamp(6) NOT NULL,
tkh_exchange_time timestamp(6) NOT NULL,
CONSTRAINT "PK_tick_history" PRIMARY KEY (tkh_id)
) WITHOUT OIDS;
-------------------------------------------------------------------------
> Also I trust you are using dump with the default COPY-style output,
> not dump-as-INSERTs?
Yes. I'm dumping using "-c -Fc -v" as options. I just rerun the whole:
* Total rows: 904,000
* Dumping plain format: 21min == 1260 sec == 717 rows/sec
* Dumping "-c -Fc" format: 26min == 1560 sec == 580 rows/sec
* pg_dump takes 2/3 of the [single] CPU, postmaster the other 1/3 for both
dumps
* the resulting "-Fc" dump is 1/10 of the size of the plain dump
It would be interesting to know from people with DBs > 1 Gig how fast they
can dump in terms of rows/second.
Another info: I just noticed that I've 7.4.1 running not 7.4.3. So I'll
retest this tomorrow again and provide you with numbers for 7.4.3 too.
I'm likely to install Sybase as a reference to Postgres next week. But I can
remember to got there something like 5-10,000 rows/sec some one year ago
when I tested a 11.5 version for this issue too.
Thanks,
Soeren Gerlach
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2004-06-30 22:10:26 | Re: Internationalization |
Previous Message | Chris Browne | 2004-06-30 21:43:02 | Re: what's best: opteron or xeon for pg 7.4.3? |