From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Stefan Kaltenbrunner" <stefan(at)kaltenbrunner(dot)cc>
Cc: "Spiegelberg, Greg" <gspiegelberg(at)cranel(dot)com>, "Worky Workerson" <worky(dot)workerson(at)gmail(dot)com>, "Merlin Moncure" <mmoncure(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Best COPY Performance
Date: 2006-10-30 15:03:41
Message-ID: C16B625D.5708%llonergan@greenplum.com
Lists: pgsql-performance
Stefan,
On 10/30/06 8:57 AM, "Stefan Kaltenbrunner" <stefan(at)kaltenbrunner(dot)cc> wrote:
>> We've found that there is an ultimate bottleneck at about 12-14MB/s despite
>> having sequential write to disk speeds of 100s of MB/s. I forget what the
>> latest bottleneck was.
>
> I have personally managed to load a bit less than 400k rows/s (5 int
> columns, no indexes) - on very fast disk hardware - at that point
> postgresql is completely CPU bottlenecked (2.6 GHz Opteron).
400,000 rows/s x 4 bytes/column x 5 columns/row = 8MB/s
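The arithmetic above can be sketched as a small helper (a minimal illustration; the 4-bytes-per-column figure assumes plain int columns, as in Stefan's test):

```python
def copy_rate_mb_per_s(rows_per_s, bytes_per_column, columns_per_row):
    """Approximate COPY throughput in MB/s from the per-row data size."""
    return rows_per_s * bytes_per_column * columns_per_row / 1e6

# 400k rows/s with five 4-byte int columns:
rate = copy_rate_mb_per_s(400_000, 4, 5)  # 8.0 MB/s
```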
> Using multiple processes to load the data will help to scale up to about
> 900k/s (4 processes on 4 cores).
18MB/s? Have you done this? I've not seen that much improvement before
from using multiple COPY processes on the same table.
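One way to set up such a parallel load is to split the input into contiguous row ranges, one per worker, with each worker running its own COPY over its slice. A minimal sketch of the splitting step (the function name and the worker setup around it are hypothetical, not from the thread):

```python
def chunk_ranges(total_rows, workers):
    """Split total_rows into contiguous (start, end) ranges, one per COPY worker."""
    base, extra = divmod(total_rows, workers)
    ranges, start = [], 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        ranges.append((start, start + size))
        start += size
    return ranges

# 4 processes on 4 cores, as in Stefan's 900k rows/s result:
parts = chunk_ranges(900_000, 4)  # four ranges of 225k rows each
```

Each process would then open its own connection and COPY only its slice of the input file.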
Another question: how should MB/s be measured - based on the size of the
input text file, or on the resulting DBMS storage size? We usually use the
input text file size when calculating COPY rate.
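Measured that way, the rate is just input-file bytes over elapsed wall time. A minimal sketch (the `run_copy` callback standing in for the actual COPY invocation is hypothetical):

```python
import os
import time

def measure_copy_rate(path, run_copy):
    """Time a COPY and report MB/s relative to the input text file size."""
    size_mb = os.path.getsize(path) / 1e6
    start = time.time()
    run_copy(path)  # e.g. shell out to psql \copy, or use a driver's COPY API
    return size_mb / (time.time() - start)
```

Note that the on-disk table size can differ substantially from the text-file size (tuple headers, padding, binary vs. text representation), which is why the two measures diverge.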
- Luke