Re: copy vs. C function

From:	Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To:	Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: copy vs. C function
Date:	2011-12-11 02:32:50
Message-ID:	4EE41652.8090306@ringerc.id.au
Lists:	pgsql-performance

On 12/11/2011 09:27 AM, Jon Nelson wrote:
> The first method involved writing a C program to parse a file, parse
> the lines and output newly-formatted lines in a format that
> postgresql's COPY function can use.
> End-to-end, this takes 15 seconds for about 250MB (read 250MB, parse,
> output new data to new file -- 4 seconds, COPY new file -- 10
> seconds).
Why not `COPY tablename FROM '/path/to/myfifo'`?

Just connect your import program up to a named pipe (fifo) created with
`mknod myfifo p` either by redirecting stdout or by open()ing the fifo
for write. Then have Pg read from the fifo. You'll save a round of disk
writes and reads.
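
A rough sketch of the writer side, in case it helps. This is not taken from your program: the helper names `ensure_fifo` and `write_copy_row` are made up for illustration, and it assumes COPY's default text format (tab-separated fields, one row per line, backslash escapes). It does not handle NULLs, which the text format spells `\N`.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Create the named pipe if it is not already there (same effect as
 * `mknod path p`). Returns 0 if a FIFO exists at path afterwards,
 * -1 otherwise.
 */
static int ensure_fifo(const char *path)
{
    struct stat st;

    if (mkfifo(path, 0600) == 0)
        return 0;
    if (errno != EEXIST)
        return -1;
    /* Something already exists at path; accept it only if it is a FIFO. */
    if (stat(path, &st) != 0 || !S_ISFIFO(st.st_mode))
        return -1;
    return 0;
}

/*
 * Write one row in COPY text format: fields separated by tabs, row ended
 * by a newline. Backslash, tab, newline and carriage return inside a
 * field are backslash-escaped. Returns 0 on success, -1 on write error.
 */
static int write_copy_row(FILE *out, const char *const *fields, int nfields)
{
    for (int i = 0; i < nfields; i++) {
        if (i > 0 && fputc('\t', out) == EOF)
            return -1;
        for (const char *p = fields[i]; *p != '\0'; p++) {
            const char *esc = NULL;

            switch (*p) {
            case '\\': esc = "\\\\"; break;
            case '\t': esc = "\\t";  break;
            case '\n': esc = "\\n";  break;
            case '\r': esc = "\\r";  break;
            default:   break;
            }
            if (esc != NULL) {
                if (fputs(esc, out) == EOF)
                    return -1;
            } else if (fputc(*p, out) == EOF) {
                return -1;
            }
        }
    }
    return fputc('\n', out) == EOF ? -1 : 0;
}
```

The import program would call `ensure_fifo("/path/to/myfifo")`, `fopen()` the fifo for writing, and call `write_copy_row()` once per parsed line. Note that opening a fifo for write blocks until a reader has it open, so start the COPY in another session. The path also has to be readable by the server process, since a plain `COPY ... FROM 'file'` is read by the backend, not by the client.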
> The next approach I took was to write a C function in postgresql to
> parse a single TEXT datum into an array of C strings, and then use
> BuildTupleFromCStrings. There are 8 columns involved.
> Eliding the time it takes to COPY the (raw) file into a temporary
> table, this method took 120 seconds, give or take.
>
> The difference was /quite/ a surprise to me. What is the probability
> that I am doing something very, very wrong?
Have a look at how COPY does it within the Pg sources, see if that's any
help. I don't know enough about Pg's innards to answer this one beyond
that suggestion, sorry.

--
Craig Ringer
