Re: Conflict handling for COPY FROM

From: Surafel Temesgen <surafel3000(at)gmail(dot)com>
To: Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Conflict handling for COPY FROM
Date: 2019-11-18 06:42:14
Message-ID: CALAY4q_Yjeh6=TuroMsxTb6rm2iTxycCQmEqkD6aohX7y2_bTw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 15, 2019 at 6:24 PM Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>
wrote:

> On 11.11.2019 16:00, Surafel Temesgen wrote:
> >
> >
> > Next, you use DestRemoteSimple for returning conflicting tuples back:
> >
> > + dest = CreateDestReceiver(DestRemoteSimple);
> > + dest->rStartup(dest, (int) CMD_SELECT, tupDesc);
> >
> > However, printsimple supports very limited subset of built-in
> > types, so
> >
> > CREATE TABLE large_test (id integer primary key, num1 bigint, num2
> > double precision);
> > COPY large_test FROM '/path/to/copy-test.tsv';
> > COPY large_test FROM '/path/to/copy-test.tsv' ERROR 3;
> >
> > fails with following error 'ERROR: unsupported type OID: 701', which
> > seems to be very confusing from the end user perspective. I've
> > tried to
> > switch to DestRemote, but couldn't figure it out quickly.
> >
> >
> > fixed
>
> Thanks, now it works with my tests.
>
> 1) Maybe it is fine, but now I do not like this part:
>
> + portal = GetPortalByName("");
> + dest = CreateDestReceiver(DestRemote);
> + SetRemoteDestReceiverParams(dest, portal);
> + dest->rStartup(dest, (int) CMD_SELECT, tupDesc);
>
> Here you implicitly use the fact that portal with a blank name is always
> created in exec_simple_query before we get to this point. Next, you
> create new DestReceiver and set it to this portal, but it is also
> already created and set in the exec_simple_query.
>
> Would it be better if you just explicitly pass ready DestReceiver to
> DoCopy (similarly to how it is done for T_ExecuteStmt / ExecuteQuery),
>
>
Good idea .Thank you

>
> 2) My second concern is that you use three internal flags to track
> errors limit:
>
> + int error_limit; /* total number of error to ignore */
> + bool ignore_error; /* is ignore error specified? */
> + bool ignore_all_error; /* is error_limit -1 (ignore all
> error)
> + * specified? */
>
> Though it seems that we can just leave error_limit as a user-defined
> constant and track errors with something like errors_count. In that case
> you do not need auxiliary ignore_all_error flag. But probably it is a
> matter of personal choice.
>
>
using bool flags will save as from using integer type as a boolean and hold
the fact
error limit was specified even if it became zero and it seems to me it is
straightforward
to treat ignore_all_error separately.
Attache is the patch that use already created DestReceiver

regards
Surafel

Attachment Content-Type Size
conflict-handling-onCopy-from-v10.patch text/x-patch 14.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jinbao Chen 2019-11-18 06:48:17 Planner chose a much slower plan in hashjoin, using a large table as the inner table.
Previous Message Masahiko Sawada 2019-11-18 06:40:59 Re: cost based vacuum (parallel)