Re: COPY Performance

From:	"Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To:	"Hans Zaunere" <lists(at)zaunere(dot)com>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: COPY Performance
Date:	2008-05-05 00:11:15
Message-ID:	dcc563d10805041711q32f2a56cx395e3f2cffbc99ca@mail.gmail.com
Lists:	pgsql-general

On Sun, May 4, 2008 at 5:11 PM, Hans Zaunere <lists(at)zaunere(dot)com> wrote:
> Hello,
>
> We're using a statement like this to dump between 500K and >5 million rows.
>
> COPY(SELECT SomeID FROM SomeTable WHERE SomeColumn > '0')
> TO '/dev/shm/SomeFile.csv'
>
> Upon first run, this operation can take several minutes. Upon second run,
> it will be complete in generally well under a minute.
>

Almost certainly a caching effect. The first time through, the table
has to be read from disk into memory WHILE the machine is also doing
other things, file system wise. The second time it's already in memory
(kernel cache) and zips right by.
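
To see the cold versus warm effect in isolation, here is a rough sketch
that has nothing to do with PostgreSQL itself. It writes a scratch file,
asks the kernel to drop it from the page cache, and then times two
sequential reads. The function names and the 64 MB default are made up
for illustration, and it assumes Linux (os.posix_fadvise is advisory and
not available everywhere), so treat the timings as indicative only.

```python
import os
import tempfile
import time


def time_read(path, chunk_size=1 << 20):
    """Read the whole file sequentially and return the elapsed seconds."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(chunk_size):
            pass
    return time.perf_counter() - start


def cold_and_warm_read(size_bytes=64 * 1024 * 1024):
    """Write a scratch file, try to evict it from cache, then read it twice.

    Hypothetical helper for illustration; the size is an arbitrary choice.
    """
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(size_bytes))
        tmp.flush()
        os.fsync(tmp.fileno())  # dirty pages cannot be dropped, so flush first
        # Advisory request to drop this file's cached pages (Linux/POSIX only).
        os.posix_fadvise(tmp.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
        path = tmp.name
    try:
        cold = time_read(path)  # may have to go to the disk
        warm = time_read(path)  # normally served from the kernel cache
    finally:
        os.unlink(path)
    return cold, warm


if __name__ == "__main__":
    cold, warm = cold_and_warm_read()
    print(f"cold read: {cold:.3f}s, warm read: {warm:.3f}s")
```

On a disk-backed file system the first number is usually the larger one.
On tmpfs or a very fast device the two may be nearly identical.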

What can you do? First you need to see what's really happening, which
means learning how to drive vmstat, iostat, top, etc. to watch your
machine while the COPY runs. For starters, you'll likely want to look
into things that reduce I/O contention on the partition set holding the
database: tablespaces, big RAID arrays (big meaning a lot of spindles),
and a battery-backed RAID controller.
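
As a starting point for reading vmstat, the sketch below parses captured
output (for example from "vmstat 1" running during the COPY) and averages
the bi (blocks read in) and wa (percent of CPU time waiting on I/O)
columns. High values for both on the first run and low values on the
second would fit the caching explanation. The helper names and the
sample numbers are invented for illustration; only the column names come
from vmstat itself.

```python
# Made-up sample in the layout printed by Linux vmstat; not real measurements.
SAMPLE_VMSTAT = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  2      0 102400  20480 819200    0    0  9000   100  500  900  5  3 42 50  0
 0  0      0 101000  20480 820600    0    0    10    50  200  300 10  2 88  0  0
"""


def parse_vmstat(text):
    """Turn vmstat text into a list of {column_name: int} dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":  # the column-name line starts with "r"
            header = fields
            continue
        # Keep only all-numeric lines that match the header width.
        if header and len(fields) == len(header) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


def io_summary(rows):
    """Average blocks read in (bi) and I/O wait percentage (wa)."""
    if not rows:
        raise ValueError("no vmstat samples found")
    n = len(rows)
    return {
        "avg_bi": sum(r["bi"] for r in rows) / n,
        "avg_wa": sum(r["wa"] for r in rows) / n,
    }


if __name__ == "__main__":
    print(io_summary(parse_vmstat(SAMPLE_VMSTAT)))
```

Capturing one run of output for the slow first COPY and one for the fast
second COPY, then comparing the two summaries, is a cheap way to check
whether disk reads are really where the time goes.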

In response to

COPY Performance at 2008-05-04 23:11:35 from Hans Zaunere

Responses

Re: COPY Performance at 2008-05-05 12:11:44 from Hans Zaunere
