From: | "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com> |
---|---|
To: | "Hans Zaunere" <lists(at)zaunere(dot)com> |
Cc: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: COPY Performance |
Date: | 2008-05-05 15:01:21 |
Message-ID: | dcc563d10805050801q11f1c5d3taf3204af3daad957@mail.gmail.com |
Lists: | pgsql-general |
On Mon, May 5, 2008 at 6:18 AM, Hans Zaunere <lists(at)zaunere(dot)com> wrote:
> > > We're using a statement like this to dump between 500K and >5 million
> > > rows.
> >
> > > COPY(SELECT SomeID FROM SomeTable WHERE SomeColumn > '0')
> > > TO '/dev/shm/SomeFile.csv'
> >
> > > Upon first run, this operation can take several minutes. Upon second
> > > run, it will be complete in generally well under a minute.
> >
> > Hmmm ... define "first" versus "second". What do you do to return it
> > to the slow state?
>
> Interesting that you ask. I haven't found a very reliable way to reproduce
> this.
>
> Typically, just waiting a while to run the same query the second time will
> reproduce this behavior. I restarted postgresql and it was reproduced as
> well. However, I can't find a way to flush buffers/etc, to reproduce the
What happens if you do something like:
select count(*) from (select ...) as s;
i.e. don't make the .csv file each time? How's the performance
without making the csv versus making it?
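
A minimal sketch of that comparison, assuming it is run from psql and reusing the table, column, and file names from the quoted COPY statement. The alias "s" is arbitrary, and \timing is psql's built-in per-statement timer:

```sql
-- Sketch only: run in psql; names are taken from the quoted COPY statement.
\timing on

-- 1. Read the rows but write no file (the alias "s" is arbitrary).
SELECT count(*)
FROM (SELECT SomeID FROM SomeTable WHERE SomeColumn > '0') AS s;

-- 2. Read the same rows and also write the file.
COPY (SELECT SomeID FROM SomeTable WHERE SomeColumn > '0')
TO '/dev/shm/SomeFile.csv';
```

If both statements take about the same time from the slow state, that would suggest the cost is in reading the table rather than in writing the file. Whichever statement runs first may warm the cache for the other, so it is probably worth repeating the test in the opposite order as well.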