From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Surafel Temesgen <surafel3000(at)gmail(dot)com> |
Cc: | berlin(dot)ab(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: COPY FROM WHEN condition |
Date: | 2018-11-24 02:09:25 |
Message-ID: | 8c9edfdf-6fad-7c68-8969-e1e315baa7a4@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 11/23/18 12:14 PM, Surafel Temesgen wrote:
>
>
> On Sun, Nov 11, 2018 at 11:59 PM Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com <mailto:tomas(dot)vondra(at)2ndquadrant(dot)com>> wrote:
>
>
> So, what about using FILTER here? We already use it for aggregates when
> filtering rows to process.
>
> i think its good idea and describe its purpose more. Attache is a
> patch that use FILTER instead
Thanks, looks good to me. A couple of minor points:
1) While comparing this to the FILTER clause we already have for
aggregates, I've noticed the aggregate version is
FILTER '(' WHERE a_expr ')'
while here we have
FILTER '(' a_expr ')'
For a while I was thinking that maybe we should use the same syntax
here, but I don't think so. The WHERE bit seems rather unnecessary and
we probably implemented it only because it's required by SQL standard,
which does not apply to COPY. So I think this is fine.
2) The various parser checks emit errors like this:
case EXPR_KIND_COPY_FILTER:
err = _("cannot use subquery in copy from FILTER condition");
break;
I think the "copy from" should be capitalized, to make it clear that it
refers to a COPY FROM command and not a copy of something.
3) I think there should be regression tests for these prohibited things,
i.e. for a set-returning function, for a non-existent column, etc.
4) What might be somewhat confusing for users is that the filter uses a
single snapshot to evaluate the conditions for all rows. That is, one
might do this
create or replace function f() returns int as $$
select count(*)::int from t;
$$ language sql;
and hope that
copy t from '/...' filter (f() <= 100);
only ever imports the first 100 rows - but that's not true, of course,
because f() uses the snapshot acquired at the very beginning. For
example INSERT SELECT does behave differently:
test=# copy t from '/home/user/t.data' filter (f() < 100);
COPY 81
test=# insert into t select * from t where f() < 100;
INSERT 0 19
Obviously, this is not an issue when the filter clause references only
the input row (which I assume will be the primary use case).
Not sure if this is expected / appropriate behavior, and if the patch
needs to do something else here.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Anne Marie Harm | 2018-11-24 02:31:23 | could not connect to server, in order to operate pgAdmin/PostgreSQL |
Previous Message | Michael Paquier | 2018-11-24 01:35:03 | Re: pgsql: Add WL_EXIT_ON_PM_DEATH pseudo-event. |