From: | Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
---|---|
To: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Ringer <craig(dot)ringer(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: COPY as a set returning function |
Date: | 2016-10-31 20:45:40 |
Message-ID: | CADkLM=ck_M2q=hYEDtau6DVc-kPGp+-8fddJ4oUwoU_ujTGrrg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Attached is a patch that implements copy_srf().
The function signature reflects cstate more than it reflects the COPY
options (filename+is_program instead of FILENAME or PROGRAM, etc)
CREATE OR REPLACE FUNCTION copy_srf(
filename text DEFAULT null,
is_program boolean DEFAULT false,
format text DEFAULT 'text', /* accepts text, csv, binary */
delimiter text DEFAULT null,
null_string text DEFAULT E'\\N',
header boolean DEFAULT false,
quote text DEFAULT null,
escape text DEFAULT null,
encoding text DEFAULT null)
RETURNS SETOF RECORD
The major utility for this (at least for me) will be in ETLs that currently
make a lot of use of temp tables, and wish to either reduce I/O or avoid
pg_attribute bloat.
I have not yet implemented STDIN mode, but it's a start.
$ psql test
psql (10devel)
Type "help" for help.
# select * from copy_srf('echo 1,2; echo 4,5',true,'csv') as t(x text, y
text);
x | y
---+---
1 | 2
4 | 5
(2 rows)
Time: 1.456 ms
# select * from copy_srf('/tmp/small_file.txt'::text,false,'text') as
t(x text, y text);
x | y
-----+-----
1 | 2
a | b
cde | fgh
(3 rows)
On Mon, Oct 17, 2016 at 2:33 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Fri, Sep 30, 2016 at 9:56 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Craig Ringer <craig(dot)ringer(at)2ndquadrant(dot)com> writes:
> >> On 1 Oct. 2016 05:20, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>> I think the last of those suggestions has come up before. It has the
> >>> large advantage that you don't have to remember a different syntax for
> >>> copy-as-a-function.
> >
> >> That sounds fantastic. It'd help this copy variant retain festure parity
> >> with normal copy. And it'd bring us closer to being able to FETCH in non
> >> queries.
> >
> > On second thought, though, this couldn't exactly duplicate the existing
> > COPY syntax, because COPY relies heavily on the rowtype of the named
> > target table to tell it what it's copying. You'd need some new syntax
> > to provide the list of column names and types, which puts a bit of
> > a hole in the "syntax we already know" argument. A SRF-returning-record
> > would have a leg up on that, because we do have existing syntax for
> > defining the concrete rowtype that any particular call returns.
>
> One big disadvantage of SRF-returning-record syntax is that functions
> are basically unwrappable with generic wrappers sans major gymnastics
> such as dynamically generating the query and executing it. This is a
> major disadvantage relative to the null::type hack we use in the
> populate_record style functions and perhaps ought to make this
> (SRF-returning-record syntax) style of use discouraged for useful
> library functions. If there were a way to handle wrapping I'd
> withdraw this minor objection -- this has come up in dblink
> discussions a few times).
>
> merlin
>
Attachment | Content-Type | Size |
---|---|---|
copy_srf_function.diff | text/plain | 13.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Corey Huinker | 2016-10-31 20:54:18 | Re: sequential scan result order vs performance |
Previous Message | Jim Nasby | 2016-10-31 20:24:13 | Increase pltcl test coverage |