Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
Date: 2024-11-16 09:55:10
Message-ID: CALdSSPh13qiKmhqwj=bR_3seZJVkP9E6BDbLfHXQ_DbaxQL4FA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 16 Nov 2024 at 13:27, jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>
> On Sat, Nov 9, 2024 at 8:55 PM Kirill Reshke <reshkekirill(at)gmail(dot)com> wrote:
> >
>
> > > > But while I was trying to implement that, I realized that I don't
> > > > understand v4 of this patch. My misunderstanding is about
> > > > `t_on_error_null` tests. We are allowed to insert a NULL value for the
> > > > first column of t_on_error_null using COPY ON_ERROR SET_TO_NULL. Why
> > > > do we do that? My thought is we should try to execute
> > > > InputFunctionCallSafe with NULL value (i mean, here [1]) for the
> > > > column after we failed to insert the input value. And, if this second
> > > > call is successful, we do replacement, otherwise we count the row as
> > > > erroneous.
> > >
> > > Your concern is valid. Allowing NULL to be stored in a column with a NOT NULL
> > > constraint via COPY ON_ERROR=SET_TO_NULL does seem unexpected. As you suggested,
> > > NULL values set by SET_TO_NULL should probably be re-evaluated.
> >
> > Thank you. I updated the patch with a NULL re-evaluation.
> >
>
>
> take me sometime to understand your change with InputFunctionCallSafe.
> it actually works fine with domain,
> i think mainly because domain_in proisstrict is false and all other
> type input function proisstrict is true!
>
>
> --case1
> create table t1(a dnn);
> copy t1 from stdin(on_error set_to_null);
> A
> \.
>
> --case2
> create table t2(a int not null);
> copy t2 from stdin(on_error set_to_null);
> A
> \.
>
> I think it should be to either let domains with not-null behave the
> same as column level not-null
> or just insert NULL to a column with domain not-null constraint.
>
>
> in doc[1], we already mentioned that a column with a not-null domain
> is possible to have null value .
> https://www.postgresql.org/docs/current/sql-createdomain.html
>
> attached v8, based on your v7, main change it NextCopyFrom,
> InputFunctionCallSafe.
> The idea is when InputFunctionCallSafe fails, on_error set_to_null
> needs to check if this is a type as not-null domain.
> pass NULL string to InputFunctionCallSafe again to check if this type
> allows null or not.
> If not allow null then error out (ereport(ERROR)).
> i think this will align with column level not-null constraint, what do
> you guys think?
>
>
> i am mainly change copyfromparse.c's for now.
> other places no change, same as v7.

Hello. I received your email just as I was ready to send my version
eight of this thread.

Your patch does not apply due to 9a70f67.

```
reshke(at)ygp-jammy:~/pg$ git apply v8-0001-COPY-option-on_error-set_to_null.patch
error: patch failed: src/backend/commands/copyfrom.c:1018
error: src/backend/commands/copyfrom.c: patch does not apply
```

1) Your v8 does not fix tab-complete issue mentioned by Atsushi
Torikoshi in [1].

2) Your version does not address discussion about SET_TO_NULL vs
REJECT_LIMIT (see [2] & [3]). I am leaning towards option 1 from [3],
and my v8 implements that.

```
reshke=# create domain dd as int not null;
CREATE DOMAIN
reshke=# create table tt(i dd);
CREATE TABLE
reshke=# copy tt from stdin with (on_error set_to_null, log_verbosity
verbose, reject_limit 2);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> s
>> 1
>> \.
ERROR: domain dd does not allow null values
CONTEXT: COPY tt, line 1, column i: "s"
reshke=#
```

I expect no error here, as reject_limit is specified.

Regression test that checks for this in v7 were changed, in my
opinion, incorrectly.

I am attaching my v8 for reference.

[1] https://www.postgresql.org/message-id/501dd655ddb04693c15baeb6485bc601%40oss.nttdata.com
[2] https://www.postgresql.org/message-id/07587c36-18b3-4ccb-b5fb-579bcb04ed37%40oss.nttdata.com
[3] https://www.postgresql.org/message-id/1462d79784b2475f1c714c65a6f25652%40oss.nttdata.com
--
Best regards,
Kirill Reshke

Attachment Content-Type Size
v8-0001-Introduce-COPY-option-to-replace-columns-containi.patch application/octet-stream 24.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2024-11-16 09:57:23 Re: logical replication: restart_lsn can go backwards (and more), seems broken since 9.4
Previous Message by Yang 2024-11-16 08:37:07 memory leak in pgoutput