Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Kirill Reshke <reshkekirill(at)gmail(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
Date: 2024-11-07 18:00:54
Message-ID: 07587c36-18b3-4ccb-b5fb-579bcb04ed37@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2024/10/26 6:03, Kirill Reshke wrote:
> when the REJECT LIMIT is set to some non-zero number and the number of
> row NULL replacements exceeds the limit, is it OK to fail. Because
> there WAS errors, and we should not tolerate more than $limit errors .
> I do find this behavior to be consistent.

+1

> But what if we don't set a REJECT LIMIT, it is sane to do all
> replacements, as if REJECT LIMIT is inf.

+1

> But our REJECT LIMIT is zero
> (not set).
> So, we ignore zero REJECT LIMIT if set_to_null is set.

REJECT_LIMIT currently has to be greater than zero, so it won’t ever be zero.

> But while I was trying to implement that, I realized that I don't
> understand v4 of this patch. My misunderstanding is about
> `t_on_error_null` tests. We are allowed to insert a NULL value for the
> first column of t_on_error_null using COPY ON_ERROR SET_TO_NULL. Why
> do we do that? My thought is we should try to execute
> InputFunctionCallSafe with NULL value (i mean, here [1]) for the
> column after we failed to insert the input value. And, if this second
> call is successful, we do replacement, otherwise we count the row as
> erroneous.

Your concern is valid. Allowing NULL to be stored in a column with a NOT NULL
constraint via COPY ON_ERROR=SET_TO_NULL does seem unexpected. As you suggested,
NULL values set by SET_TO_NULL should probably be re-evaluated.

> Hm, good catch. Applied almost as you suggested. I did tweak this
> "replace columns with invalid input values with " into "replace
> columns containing erroneous input values with". Is that OK?

Yes, sounds good.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2024-11-07 18:12:50 Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible
Previous Message Peter Geoghegan 2024-11-07 17:55:14 Re: index prefetching