From: | Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru> |
---|---|
To: | Surafel Temesgen <surafel3000(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Conflict handling for COPY FROM |
Date: | 2019-09-20 13:16:32 |
Message-ID: | 03431db3-7b69-b7a4-43be-f31efb6b44f3@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Surafel,
On 16.07.2019 10:08, Surafel Temesgen wrote:
> i also add an option to ignore all errors in ERROR set to -1
Great!
The patch still applies cleanly (tested on e1c8743e6c), but I've got
some problems using more elaborated tests.
First of all, there is definitely a problem with grammar. In docs ERROR
is defined as option and
COPY test FROM '/path/to/copy-test-simple.csv' ERROR -1;
works just fine, but if modern 'WITH (...)' syntax is used:
COPY test FROM '/path/to/copy-test-simple.csv' WITH (ERROR -1);
ERROR: option "error" not recognized
while 'WITH (error_limit -1)' it works again.
It happens, since COPY supports modern and very-very old syntax:
* In the preferred syntax the options are comma-separated
* and use generic identifiers instead of keywords. The pre-9.0
* syntax had a hard-wired, space-separated set of options.
So I see several options here:
1) Everything is left as is, but then docs should be updated and
reflect, that error_limit is required for modern syntax.
2) However, why do we have to support old syntax here? I guess it exists
for backward compatibility only, but this is a completely new feature.
So maybe just 'WITH (error_limit 42)' will be enough?
3) You also may simply change internal option name from 'error_limit' to
'error' or SQL keyword from 'ERROR' tot 'ERROR_LIMIT'.
I would prefer the second option.
Next, you use DestRemoteSimple for returning conflicting tuples back:
+ dest = CreateDestReceiver(DestRemoteSimple);
+ dest->rStartup(dest, (int) CMD_SELECT, tupDesc);
However, printsimple supports very limited subset of built-in types, so
CREATE TABLE large_test (id integer primary key, num1 bigint, num2
double precision);
COPY large_test FROM '/path/to/copy-test.tsv';
COPY large_test FROM '/path/to/copy-test.tsv' ERROR 3;
fails with following error 'ERROR: unsupported type OID: 701', which
seems to be very confusing from the end user perspective. I've tried to
switch to DestRemote, but couldn't figure it out quickly.
Finally, I simply cannot get into this validation:
+ else if (strcmp(defel->defname, "error_limit") == 0)
+ {
+ if (cstate->ignore_error)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options"),
+ parser_errposition(pstate, defel->location)));
+ cstate->error_limit = defGetInt64(defel);
+ cstate->ignore_error = true;
+ if (cstate->error_limit == -1)
+ cstate->ignore_all_error = true;
+ }
If cstate->ignore_error is defined, then we have already processed
options list, since this is the only one place, where it's set. So we
should never get into this ereport, doesn't it?
Regards
--
Alexey Kondratov
Postgres Professional https://www.postgrespro.com
Russian Postgres Company
Attachment | Content-Type | Size |
---|---|---|
copy-test.tsv | text/tab-separated-values | 24 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2019-09-20 13:46:11 | Re: backup manifests |
Previous Message | Robert Haas | 2019-09-20 12:58:40 | Re: backup manifests |