Re: Small fix on COPY ON_ERROR document

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Small fix on COPY ON_ERROR document
Date: 2024-02-02 02:29:41
Message-ID: ddc70a50e2cae563ad3a2a90de3c0922@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2024-02-01 15:16, Yugo NAGATA wrote:
> On Mon, 29 Jan 2024 15:47:25 +0900
> Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> wrote:
>
>> On Sun, 28 Jan 2024 19:14:58 -0700
>> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>>
>> > > Also, I think "invalid input syntax" is a bit ambiguous. For example,
>> > > COPY FROM raises an error when the number of input column does not match
>> > > to the table schema, but this error is not ignored by ON_ERROR while
>> > > this seems to fall into the category of "invalid input syntax".
>> >
>> >
>> >
>> > It is literally the error text that appears if one were not to ignore it.
>> > It isn’t a category of errors. But I’m open to ideas here. But being
>> > explicit with what on actually sees in the system seemed preferable to
>> > inventing new classification terms not otherwise used.
>>
>> Thank you for explanation! I understood the words was from the error
>> messages
>> that users actually see. However, as Torikoshi-san said in [1], errors
>> other
>> than valid input syntax (e.g. range error) can be also ignored,
>> therefore it
>> would be better to describe to be ignored errors more specifically.
>>
>> [1]
>> https://www.postgresql.org/message-id/7f1457497fa3bf9dfe486f162d1c8ec6%40oss.nttdata.com
>>
>> >
>> > >
>> > > So, keeping consistency with the existing description, we can say:
>> > >
>> > > "Specifies which how to behave when encountering an error due to
>> > > column values unacceptable to the input function of each attribute's
>> > > data type."
>> >
>> >
>> > Yeah, I was considering something along those lines as an option as well.
>> > But I’d rather add that wording to the glossary.
>>
>> Although I am still be not convinced if we have to introduce the words
>> "soft error" to the documentation, I don't care it if there are no
>> other
>> opposite opinions.
>
> Attached is a updated patch v3, which is a version that uses the above
> wording instead of "soft error".
>
>> >
>> > > Currently, ON_ERROR doesn't support other soft errors, so it can explain
>> > > it more simply without introducing the new concept, "soft error" to users.
>> > >
>> > >
>> > Good point. Seems we should define what user-facing errors are ignored
>> > anywhere in the system and if we aren’t consistently leveraging these in
>> > all areas/commands make the necessary qualifications in those specific
>> > places.
>> >
>>
>> > > I think "left in a deleted state" is also unclear for users because this
>> > > explains the internal state but not how looks from user's view.How about
>> > > leaving the explanation "These rows will not be visible or accessible" in
>> > > the existing statement?
>> > >
>> >
>> > Just visible then, I don’t like an “or” there and as tuples at least they
>> > are accessible to the system, in vacuum especially. But I expected the
>> > user to understand “as if you deleted it” as their operational concept more
>> > readily than visible. I think this will be read by people who haven’t read
>> > MVCC to fully understand what visible means but know enough to run vacuum
>> > to clean up updated and deleted data as a rule.
>>
>> Ok, I agree we can omit "or accessible". How do you like the
>> followings?
>> Still redundant?
>>
>> "If the command fails, these rows are left in a deleted state;
>> these rows will not be visible, but they still occupy disk space. "
>
> Also, the above statement is used in the patch.

Thanks for updating the patch!

I like your description which doesn't use the word soft error.

Here are minor comments:

> + <literal>ignore</literal> means discard the input row and
> continue with the next one.
> + The default is <literal>stop</literal>

Is "." required at the end of the line?

> An <literal>NOTICE</literal> level context message containing the
> ignored row count is

Should 'An' be 'A'?

Also, I wasn't sure the necessity of 'context'.
It might be possible to just say "A NOTICE message containing the
ignored row count.."
considering below existing descriptions:

doc/src/sgml/pltcl.sgml: a <literal>NOTICE</literal> message each
time a supported command is
doc/src/sgml/pltcl.sgml- executed:

doc/src/sgml/plpgsql.sgml: This example trigger simply raises a
<literal>NOTICE</literal> message
doc/src/sgml/plpgsql.sgml- each time a supported command is
executed.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-02-02 03:26:18 Re: An improvement on parallel DISTINCT
Previous Message Richard Guo 2024-02-02 02:10:10 Re: Oversight in reparameterize_path_by_child leading to executor crash