From: | David Fetter <david(at)fetter(dot)org> |
---|---|
To: | Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Tee for COPY |
Date: | 2015-12-13 13:43:24 |
Message-ID: | 20151213134324.GB28490@fetter.org |
Lists: | pgsql-hackers |
On Sun, Dec 13, 2015 at 11:29:23AM +0300, Konstantin Knizhnik wrote:
> Hi,
>
> I am trying to create a version of the COPY command which can scatter/replicate data to different nodes based on some distribution method.
> There is a master process, holding information about the data distribution, to which all clients are connected.
> This master process should receive the copied data from the client and scatter tuples to the nodes.
> Maybe somebody can recommend the best way of implementing such a COPY agent?
>
> The obvious plan is the following:
>
> 1. Register utility callback
> 2. Handle T_CopyStmt in this callback
> 3. Use BeginCopyFrom/NextCopyFrom to receive tuples from client
> 4. Calculate distribution function for the received tuple
> 5. Establish a connection with the corresponding node (if not yet established) and start the same COPY command on this node (if not started yet).
> 6. Send data to this node using PQputCopyData.
>
> The problem is with step 6: I do not see any way to copy received data to the destination node.
> NextCopyFrom returns an array of values (Datums) for the tuple's columns. But there are no public methods to send a tuple to the copy stream.
> All this logic is implemented in src/backend/commands/copy.c and is not available outside this module.
>
> It is more or less clear how to do it using text or CSV mode: I can use NextCopyFromRawFields and then construct a line with a comma-separated list of values.
> But how to handle binary mode? Also, I suspect that copy in text mode is significantly slower than in binary mode, isn't it?
>
> The dirty solution is just to cut&paste the copy.c code. But maybe there is some more elegant way?
A slightly cleaner solution is to make public methods to send tuples
to the copy stream and have COPY call those.
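For the text-mode route mentioned in the quoted message, the line-building
step might look roughly like the sketch below. It assumes the default COPY
text format (tab delimiter, \N for NULL, backslash escapes), not CSV, which
has different quoting rules. The names append_copy_field and
build_copy_text_line are hypothetical; they are not part of copy.c or libpq.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: append one field to buf in COPY text format,
 * escaping backslash, tab, newline and carriage return.
 * A NULL field is written as \N.
 * Returns the new length, or (size_t) -1 if buf is too small.
 */
static size_t
append_copy_field(char *buf, size_t len, size_t cap, const char *field)
{
    assert(buf != NULL);

    if (field == NULL)
    {
        if (len + 2 > cap)
            return (size_t) -1;
        buf[len++] = '\\';
        buf[len++] = 'N';
        return len;
    }

    for (const char *p = field; *p != '\0'; p++)
    {
        char esc = 0;

        switch (*p)
        {
            case '\\': esc = '\\'; break;
            case '\t': esc = 't';  break;
            case '\n': esc = 'n';  break;
            case '\r': esc = 'r';  break;
        }
        if (len + (esc ? 2 : 1) > cap)
            return (size_t) -1;
        if (esc)
        {
            buf[len++] = '\\';
            buf[len++] = esc;
        }
        else
            buf[len++] = *p;
    }
    return len;
}

/*
 * Hypothetical helper: build one newline-terminated COPY text line from
 * nfields raw fields (assumes the default tab delimiter).
 * Returns the line length excluding the terminating NUL,
 * or (size_t) -1 if the line does not fit in buf.
 */
size_t
build_copy_text_line(char *buf, size_t cap,
                     const char *const *fields, int nfields)
{
    size_t len = 0;

    for (int i = 0; i < nfields; i++)
    {
        if (i > 0)
        {
            if (len + 1 > cap)
                return (size_t) -1;
            buf[len++] = '\t';
        }
        len = append_copy_field(buf, len, cap, fields[i]);
        if (len == (size_t) -1)
            return len;
    }
    if (len + 2 > cap)          /* room for '\n' and the NUL */
        return (size_t) -1;
    buf[len++] = '\n';
    buf[len] = '\0';
    return len;
}
```

The resulting buffer and length could then be handed to PQputCopyData on
the connection to the target node. This covers text mode only and does
nothing for the binary-mode case raised above.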
Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate