From: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
---|---|
To: | Dimitri Fontaine <dfontaine(at)hi-media(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org, "Florian G(dot) Pflug" <fgp(at)phlo(dot)org> |
Subject: | Re: An idea for parallelizing COPY within one backend |
Date: | 2008-02-27 10:47:29 |
Message-ID: | 1204109249.4252.477.camel@ebony.site |
Lists: | pgsql-hackers |
On Wed, 2008-02-27 at 09:09 +0100, Dimitri Fontaine wrote:
> Hi,
>
> On Wednesday 27 February 2008, Florian G. Pflug wrote:
> > Upon reception of a COPY INTO command, a backend would
> > .) Fork off a "dealer" and N "worker" processes that take over the
> > client connection. The "dealer" distributes lines received from the
> > client to the N workers, while the original backend receives them
> > as tuples back from the workers.
>
> This looks so much like what pgloader does now (version 2.3.0~dev2, release
> candidate) at the client side, when configured for it, that I can't help
> answering the mail :)
> http://pgloader.projects.postgresql.org/dev/pgloader.1.html#_parallel_loading
> section_threads = N
> split_file_reading = False
>
> Of course, the backends still have to parse the input given by pgloader, which
> only pre-processes data. I'm not sure having the client prepare the data some
> more (binary format or whatever) is a wise idea, as you mentioned and wrt
> Tom's follow-up. But maybe I'm all wrong, so I'm all ears!
ISTM the external parallelization approach is more likely to help us
avoid bottlenecks, so I support Dimitri's approach.

We also need error handling, which pgloader already has.

Writing error handling and parallelization into COPY isn't going to be
easy, and it isn't very justifiable either if we already have both in
pgloader.

There might be a reason to rewrite it in C one day, but that will be
a fairly easy task if we ever need to do it.
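
For illustration only, the client-side shape discussed above (a "dealer" handing lines to N workers, with bad lines rejected rather than aborting the load) might be sketched roughly as follows. This is not pgloader's code: `parse_line`, `parallel_parse`, and the tab-separated input format are assumptions made up for the example, and no database connection is involved. Python threads also do not give real CPU parallelism for parsing, so treat this as a sketch of the structure rather than of the performance.

```python
import queue
import threading


def parse_line(line, n_columns):
    """Hypothetical parser: split a tab-separated line, reject wrong widths."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != n_columns:
        raise ValueError(f"expected {n_columns} columns, got {len(fields)}")
    return tuple(fields)


def parallel_parse(lines, n_workers, n_columns):
    """Deal lines round-robin to n_workers threads; return (rows, rejects)."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    rows, rejects = [], []
    lock = threading.Lock()

    def worker(inbox):
        while True:
            item = inbox.get()
            if item is None:  # sentinel: the dealer has no more input
                return
            lineno, line = item
            try:
                row = parse_line(line, n_columns)
            except ValueError as exc:
                # Error handling: record the bad line and keep going.
                with lock:
                    rejects.append((lineno, line, str(exc)))
            else:
                with lock:
                    rows.append((lineno, row))

    threads = [threading.Thread(target=worker, args=(q,)) for q in inboxes]
    for t in threads:
        t.start()

    # The "dealer": hand each input line to the next worker in turn.
    for lineno, line in enumerate(lines, start=1):
        inboxes[(lineno - 1) % n_workers].put((lineno, line))
    for q in inboxes:
        q.put(None)
    for t in threads:
        t.join()

    # Restore input order, since workers finish in arbitrary order.
    rows.sort()
    rejects.sort()
    return [row for _, row in rows], rejects
```

In a real loader each worker would presumably feed its parsed rows to its own COPY connection instead of a shared list, and the rejects would go to a log or reject file; those parts are left out here.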
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com