From: | jian he <jian(dot)universality(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | vignesh C <vignesh21(at)gmail(dot)com>, Jim Jones <jim(dot)jones(at)uni-muenster(dot)de>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>, torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row |
Date: | 2025-04-05 08:31:14 |
Message-ID: | CACJufxF6_YwAboiCaVYLRtNpO4kGbXqkXzH_7W=pUvrNXK8WuQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Apr 5, 2025 at 5:33 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Fri, Apr 4, 2025 at 4:55 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
> >
> > On Tue, Mar 25, 2025 at 2:31 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> > > 2) Here in error we say column c1 violates not-null constraint and in
> > > the context we show column c2, should the context also display c2
> > > column:
> > > postgres=# create table t3(c1 int not null, c2 int, check (c1 > 10));
> > > CREATE TABLE
> > > postgres=# COPY t3 FROM STDIN WITH (on_error set_to_null);
> > > Enter data to be copied followed by a newline.
> > > End with a backslash and a period on a line by itself, or an EOF signal.
> > > >> a b
> > > >> \.
> > > ERROR: null value in column "c1" of relation "t3" violates not-null constraint
> > > DETAIL: Failing row contains (null, null).
> > > CONTEXT: COPY t3, line 1, column c2: "b"
> > >
> >
> > It took me a while to figure out why.
> > with the attached, now the error message becomes:
> >
> > ERROR: null value in column "c1" of relation "t3" violates not-null constraint
> > DETAIL: Failing row contains (null, null).
> > CONTEXT: COPY t3, line 1: "a,b"
> >
> > while at it,
> > (on_error set_to_null, log_verbosity verbose)
> > error message CONTEXT will only emit out relation name,
> > this aligns with (on_error ignore, log_verbosity verbose).
> >
> > one of the message out example:
> > +NOTICE: column "b" was set to null due to data type incompatibility at line 2
> > +CONTEXT: COPY t_on_error_null
> >
> >
> >
> > > 3) typo becomen should be become:
> > > null will becomen reserved to non-reserved
> > fixed.
> >
> > > 4) There is a whitespace error while applying patch
> > > Applying: COPY (on_error set_to_null)
> > > .git/rebase-apply/patch:39: trailing whitespace.
> > > a <literal>NOTICE</literal> message indicating the number of rows
> > > warning: 1 line adds whitespace errors.
> > fixed.
>
> I've reviewed the v15 patch and here are some comments:
>
> How about renaming the new option value to 'set_null"? The 'to' in the
> value name seems redundant to me.
>
> ---
> + COPY_ON_ERROR_NULL, /* set error field to null */
>
> I think it's better to rename COPY_ON_ERROR_SET_TO_NULL (or
> COPY_ON_ERROR_SET_NULL if we change the option value name) for
> consistency with the value name.
>
> ---
> + else if (cstate->opts.on_error == COPY_ON_ERROR_NULL)
> + ereport(NOTICE,
> + errmsg_plural("invalid values
> in %" PRIu64 " row was replaced with null",
> +
> "invalid values in %" PRIu64 " rows were replaced with null",
> +
> cstate->num_errors,
> +
> cstate->num_errors));
>
> How about adding "due to data type incompatibility" at the end of the message?
>
> ---
> + ereport(NOTICE,
> + errmsg("column
> \"%s\" was set to null due to data type incompatibility at line %"
> PRIu64 "",
> +
> cstate->cur_attname,
> +
> cstate->cur_lineno));
>
> Similar to the IGNORE case, we can show the data in question in the message.
>
> ---
> + else
> + ereport(ERROR,
> +
> errcode(ERRCODE_NOT_NULL_VIOLATION),
> + errmsg("domain %s does
> not allow null values", format_type_be(typioparams[m])),
> + errdatatype(typioparams[m]));
>
> If domain data type is the sole case where not to accept NULL, can we
> check it beforehand to avoid calling the second
> InputFunctionCallSafe() for non-domain data types? Also, if we want to
> end up with an error when setting NULL to a domain type with NOT NULL,
> I think we don't need to try to handle a soft error by passing
> econtext to InputFunctionCallSafe().
>
please check attached, hope i have addressed all the points you've mentioned.
> If domain data type is the sole case where not to accept NULL, can we
> check it beforehand to avoid calling the second
> InputFunctionCallSafe() for non-domain data types?
I doubt it.
we have
InputFunctionCallSafe(FmgrInfo *flinfo, char *str,
Oid typioparam, int32 typmod,
fmNodePtr escontext,
Datum *result)
{
LOCAL_FCINFO(fcinfo, 3);
if (str == NULL && flinfo->fn_strict)
{
*result = (Datum) 0; /* just return null result */
return true;
}
}
Most of the non-domain type input functions are strict.
see query result:
select proname, pt.typname, proisstrict,pt.typtype
from pg_type pt
join pg_proc pp on pp.oid = pt.typinput
where pt.typtype <> 'd'
and pt.typtype <> 'p'
and proisstrict is false;
so the second InputFunctionCallSafe will be faster for non-domain types.
before CopyFromTextLikeOneRow we don't know if this type is
domain_with_constraint or not.
Beforehand, we can conditionally call DomainHasConstraints to find out.
but DomainHasConstraints is expensive, which may carry extra
performance issues for non-domain types.
but the second InputFunctionCallSafe call will not be a big issue for
domain_with_constraint,
because the first time domain_in call already cached related structs.
Attachment | Content-Type | Size |
---|---|---|
v16-0001-COPY-on_error-set_null.patch | text/x-patch | 20.2 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Bertrand Drouvot | 2025-04-05 09:37:51 | Re: Draft for basic NUMA observability |
Previous Message | jian he | 2025-04-05 07:29:55 | Re: Support NOT VALID / VALIDATE constraint options for named NOT NULL constraints |