From: | "Etsuro Fujita" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "'Craig Ringer'" <ringerc(at)ringerc(dot)id(dot)au>, <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY |
Date: | 2012-11-14 11:30:49 |
Message-ID: | 001d01cdc25b$7eee6ea0$7ccb4be0$@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I wrote:
> > From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
>
> > I wrote:
> > > "Etsuro Fujita" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> writes:
> > >> I have a question. I think it would be also better to extend the syntax
> > >> for the SQL COPY command in the same way, ie,
> > >> COPY foo from '/home/pgsql/decompress.sh /home/pgsql/foo.csv.gz |' with
> > >> format 'csv'
> >
> > > Yeah, sure --- that case is already superuser-only, so why not give it
> > > the option of being a popen instead of just fopen.
> >
> > BTW, one thought that comes to mind is that such an operation is
> > extremely likely to fail under environments such as SELinux. That's
> > not necessarily a reason not to do it, but we should be wary of
> > promising that it will work everywhere. Probably a documentation note
> > about this would be enough.
>
> OK I'll revise the patch.
I've revised the patch. In this version a user can specify hooks for pre- and
post-processor executables for COPY and \copy in the follwoing way:
$ echo '/bin/gunzip -c $1' > decompress.sh
$ chmod +x decompress.sh
In the case of the COPY command,
postgres=# COPY foo FROM '/home/pgsql/decompress.sh /home/pgsql/foo.csv.gz'
WITH (format 'csv');
Also, in the case of the \copy instruction,
postgres=# \copy foo from '/home/pgsql/decompress.sh /home/pgsql/foo.csv.gz'
with (format 'csv')
As shown in the example above, I've assumed that the syntax for this option for
e.g., the COPY command is:
COPY table_name FROM 'progname filename' WITH ...
COPY table_name TO 'progname filename' WITH ...
Here, progname for COPY IN is the user-supplied program that takes filename as
its argument and that writes on standard output. Also, prgoname for COPY OUT is
the user-supplied program that reads standard input and writes to filename taken
as its argument. This makes simple the identification and verification of
progname and filename.
Todo:
* Documentation including documentation note about the limitation for
environments such as SELinux mentioned by Tom.
* More test
Any comments and suggestions are welcomed.
Thanks,
Best regards,
Etsuro Fujita
Attachment | Content-Type | Size |
---|---|---|
copy-popen-20121114.patch | application/octet-stream | 14.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2012-11-14 11:41:20 | Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY |
Previous Message | Amit kapila | 2012-11-14 11:26:26 | Re: Performance Improvement by reducing WAL for Update Operation |