Re: Add new COPY option REJECT_LIMIT

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>
Cc: zhjwpku(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add new COPY option REJECT_LIMIT
Date: 2024-10-03 14:43:25
Message-ID: 8830518a-28ac-43a2-8a11-1676d9a3cdf8@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2024/09/30 21:08, torikoshia wrote:
> Since defGetInt64() checks not only whether the input value exceeds the range of bigint but also input value is specified, attached v6 patch checks them by directly calling defGetInt64().

Thanks for updating the patch! But the patch no longer applied cleanly
to the master branch. Could you rebase it?

+ if (opts_out->reject_limit && !opts_out->on_error)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ /*- translator: first and second %s are the names of COPY
+ * option, e.g. ON_ERROR, third is the value of the COPY option,
+ * e.g. IGNORE */
+ errmsg("COPY %s requires %s to be set to %s",
+ "REJECT_LIMIT", "ON_ERROR", "IGNORE")));

This check might be better moved to other place, for example at
the bottom of ProcessCopyOptions(). I noticed a comment in the code:
"Check for incompatible options (must do these two before inserting defaults)."
You added your condition after this comment and before inserting the defaults,
but it's not related to that process. So, moving this check to other place
seems more appropriate.

While reviewing, I also noticed that the check for
"opts_out->binary && opts_out->on_error != COPY_ON_ERROR_STOP"
is similarly placed before setting the defaults, which might not
be correct. This check should probably be moved as well.
Additionally, the comment mentioning "must do these two" should be
updated to "must do these three." These changes should be handled
in a separate patch.

+ if (cstate->opts.reject_limit &&

Wouldn't it be clearer to use cstate->opts.reject_limit > 0 for better readability?

+ cstate->num_errors > cstate->opts.reject_limit)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("skipped more than REJECT_LIMIT rows: \"%lld\",",
+ (long long) cstate->opts.reject_limit)));

To make the error message clearer, similar to the NOTICE about
skipped rows, how about rewording it to:
"skipped more than REJECT_LIMIT (%lld) rows due to data type incompatibility"?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-10-03 14:46:46 Re: make \d table Collation field showing domains collation if that attribute is type of domain.
Previous Message torikoshia 2024-10-03 14:17:40 Re: Add on_error and log_verbosity options to file_fdw