Re: COPY Performance

From:	"Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To:	"Hans Zaunere" <lists(at)zaunere(dot)com>
Cc:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)postgresql(dot)org
Subject:	Re: COPY Performance
Date:	2008-05-05 15:01:21
Message-ID:	dcc563d10805050801q11f1c5d3taf3204af3daad957@mail.gmail.com
Lists:	pgsql-general

On Mon, May 5, 2008 at 6:18 AM, Hans Zaunere <lists(at)zaunere(dot)com> wrote:
> > > We're using a statement like this to dump between 500K and >5 million
> > > rows.
> >
> > > COPY(SELECT SomeID FROM SomeTable WHERE SomeColumn > '0')
> > > TO '/dev/shm/SomeFile.csv'
> >
> > > Upon first run, this operation can take several minutes. Upon second
> > > run, it will be complete in generally well under a minute.
> >
> > Hmmm ... define "first" versus "second". What do you do to return it
> > to the slow state?
>
> Interesting that you ask. I haven't found a very reliable way to reproduce
> this.
>
> Typically, just waiting a while to run the same query the second time will
> reproduce this behavior. I restarted postgresql and it was reproduced as
> well. However, I can't find a way to flush buffers/etc, to reproduce the

What happens if you do something like:

select count(*) from (select ...) as s;

i.e. run the same query but don't make the .csv file each time. How's the
performance without making the csv versus making it?
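
A minimal sketch of that comparison in psql, assuming the query from the original report stands in for the elided "..." and that the session has permission to write server-side files. The alias "s" is arbitrary; \timing is a psql meta-command that prints the elapsed time of each statement.

```sql
-- Print elapsed time after every statement (psql only).
\timing on

-- Scan only: runs the same query but writes no file.
SELECT count(*)
FROM (SELECT SomeID FROM SomeTable WHERE SomeColumn > '0') AS s;

-- Scan plus file output, as in the original report.
COPY (SELECT SomeID FROM SomeTable WHERE SomeColumn > '0')
TO '/dev/shm/SomeFile.csv';
```

If both statements are slow on the first run and fast on the second, the time is probably going into reading the table rather than into writing the file.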

In response to

Re: COPY Performance at 2008-05-05 12:18:07 from Hans Zaunere

Responses

Re: COPY Performance at 2008-05-05 20:14:08 from Hans Zaunere
