Re: Add limit option to copy function

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: daiho1(dot)kim(at)samsung(dot)com
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, 손우성 <woosung(dot)sohn(at)samsung(dot)com>
Subject: Re: Add limit option to copy function
Date: 2020-01-20 20:29:48
Message-ID: 31094.1579552188@sss.pgh.pa.us
Lists: pgsql-hackers

김대호 <daiho1(dot)kim(at)samsung(dot)com> writes:
> I suggest adding a limit option to the copy function that limits count of input/output.
> I think this will be useful for testing with sample data.

I'm quite skeptical of the value of this. On the output side, you
can already do it with

COPY (SELECT ... LIMIT n) TO wherever;

Moreover, that approach allows you to include an ORDER BY, which is
generally good practice in any query that includes LIMIT, in case
you'd like deterministic results.
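
For illustration, a minimal sketch of that form. The table "measurements", its column "id", and the output path are hypothetical names chosen for the example, and a server-side COPY TO a file needs the usual file-access privileges:

```sql
-- Hypothetical table, column, and path, for illustration only.
-- ORDER BY fixes which 100 rows are selected, so repeated runs
-- produce the same sample.
COPY (SELECT * FROM measurements ORDER BY id LIMIT 100)
  TO '/tmp/measurements_sample.csv' WITH (FORMAT csv);
```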

On the input side, it's true that you'd have to resort to some
outside features (perhaps applying "head" to the input file, or
some such), or else copy the data into a temp table and post-process.
But that's true for most ways that you might want to adjust or
filter the input data; why should this one be different?
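
As a rough sketch of those two workarounds, assuming a hypothetical table "measurements" with an "id" column and a headerless CSV file at an illustrative path (COPY ... FROM PROGRAM runs the command on the server and needs the corresponding privileges):

```sql
-- Hypothetical table, column, and path, for illustration only.

-- Option 1: let "head" truncate the input before COPY reads it.
-- Note that head counts lines, not rows, so this assumes no quoted
-- field contains an embedded newline.
COPY measurements FROM PROGRAM 'head -n 100 /tmp/measurements.csv'
  WITH (FORMAT csv);

-- Option 2: load everything into a temp table, then post-process.
CREATE TEMP TABLE measurements_stage (LIKE measurements);
COPY measurements_stage FROM '/tmp/measurements.csv' WITH (FORMAT csv);
INSERT INTO measurements
  SELECT * FROM measurements_stage ORDER BY id LIMIT 100;
```

The two options are alternatives; running both would load the sample twice.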

We don't consider that COPY is a general-purpose ETL engine, and
have resisted addition of features to it in the past because
they'd slow down the primary use-case. That objection applies
here too. Yeah, it's (probably) not a big slowdown ... but it's
hard to justify any cost at all for a feature that is outside
the design scope of COPY.

regards, tom lane
