From: | "Spiegelberg, Greg" <gspiegelberg(at)cranel(dot)com> |
---|---|
To: | "Luke Lonergan" <llonergan(at)greenplum(dot)com>, "Worky Workerson" <worky(dot)workerson(at)gmail(dot)com>, "Merlin Moncure" <mmoncure(at)gmail(dot)com> |
Cc: | <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Best COPY Performance |
Date: | 2006-10-30 14:09:32 |
Message-ID: | 82E74D266CB9B44390D3CCE44A781ED90177807C@POSTOFFICE.cranel.local |
Lists: | pgsql-performance |
> -----Original Message-----
> From: pgsql-performance-owner(at)postgresql(dot)org
> [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of
> Luke Lonergan
> Sent: Saturday, October 28, 2006 12:07 AM
> To: Worky Workerson; Merlin Moncure
> Cc: pgsql-performance(at)postgresql(dot)org
> Subject: Re: [PERFORM] Best COPY Performance
>
> Worky,
>
> On 10/27/06 8:47 PM, "Worky Workerson"
> <worky(dot)workerson(at)gmail(dot)com> wrote:
>
> > Are you saying that I should be able to issue multiple COPY commands
> > because my I/O wait is low? I was under the impression that I am I/O
> > bound, so multiple simultaneous loads would have a detrimental
> > effect ...
>
> ...
> I agree with Merlin that you can speed things up by breaking
> the file up.
> Alternately you can use the OSS Bizgres java loader, which
> lets you specify the number of I/O threads with the "-n"
> option on a single file.
As a result of this thread, and b/c I've tried this in the past but
never had much success at speeding the process up, I attempted just that
here, except via 2 psql sessions with access to the local file. 1.1M rows
of data, varying in width from 40 to 200 characters, COPY'd into a table
with only one text column and no keys, indexes, &c. took about 15 seconds
to load: ~73K rows/second.
I then broke that file into 2 files of 550K rows each and performed 2
simultaneous COPYs after dropping the table, recreating it, issuing a
sync on the system to be sure, &c. Nearly every time, both COPYs finished
in 12 seconds. That is about 20% less load time, at ~91K rows/second
(roughly 25% more throughput).
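
The split-and-load procedure described above could be scripted roughly as
follows. This is only a sketch under stated assumptions, not the exact
commands used in the test: the helper names, the ".partN" file naming, and
the use of psql's client-side \copy are all illustrative choices, and the
table and database names are placeholders.

```python
import subprocess


def split_file(path, parts):
    """Split a file into up to `parts` pieces of near-equal line counts.

    Hypothetical helper: pieces are written next to the input as
    <path>.part0, <path>.part1, ... and their paths are returned.
    """
    with open(path, "rb") as src:
        lines = src.readlines()
    chunk = -(-len(lines) // parts)  # ceiling division
    out_paths = []
    for i in range(parts):
        piece = lines[i * chunk:(i + 1) * chunk]
        if not piece:
            break
        out_path = f"{path}.part{i}"
        with open(out_path, "wb") as dst:
            dst.writelines(piece)
        out_paths.append(out_path)
    return out_paths


def build_copy_commands(paths, table, dbname):
    """Build one psql invocation per piece, each running a single \\copy.

    Assumption: client-side \\copy is used; a server-side COPY ... FROM
    would work too if the server can read the files.
    """
    return [
        ["psql", "-d", dbname, "-c", f"\\copy {table} from '{p}'"]
        for p in paths
    ]


def run_parallel(commands):
    """Start every command at once and wait for all of them to exit."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]
```

A run would then look like
run_parallel(build_copy_commands(split_file("data.txt", 2), "t", "mydb")),
where "data.txt", "t", and "mydb" are placeholders. Splitting on line
boundaries assumes one row per line, which holds for COPY's default text
format.
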
Admittedly, this was a pretty rough test but a 20% savings, if it can be
put into production, is worth exploring for us.
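
For reference, the rates and percentages can be reproduced from the row
count and timings reported above. The short sketch below (variable names
are illustrative) separates the two ways of stating the gain: the drop in
elapsed time versus the rise in rows per second.

```python
def rows_per_second(rows, seconds):
    """Average load rate over the whole run."""
    return rows / seconds


rows = 1_100_000          # total rows loaded in both tests
single_secs = 15          # one COPY of the whole file
dual_secs = 12            # two simultaneous COPYs of 550K rows each

single_rate = rows_per_second(rows, single_secs)
dual_rate = rows_per_second(rows, dual_secs)

# Fraction of wall-clock time saved by the two-way split.
time_saved = (single_secs - dual_secs) / single_secs

# Relative increase in throughput for the same comparison.
throughput_gain = dual_rate / single_rate - 1

print(f"single COPY: {single_rate:,.0f} rows/s")
print(f"two COPYs:   {dual_rate:,.0f} rows/s")
print(f"time saved:  {time_saved:.0%}")
print(f"throughput:  +{throughput_gain:.0%}")
```
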
B/c I'll be asked, I did this on an idle, dual 3.06GHz Xeon with 6GB of
memory, U320 SCSI internal drives and PostgreSQL 8.1.4.
Greg