Re: Add reject_limit option to file_fdw

From: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add reject_limit option to file_fdw
Date: 2024-11-13 14:17:06
Message-ID: 20241113231706.09e5b5ea9640289312835be8@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 13 Nov 2024 21:48:10 +0900
torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:

> On 2024-11-12 14:51, Yugo Nagata wrote:
>
> Thanks for your review!
>
> > On Tue, 12 Nov 2024 10:16:50 +0900
> > torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> >
> >> On 2024-11-12 01:49, Fujii Masao wrote:
> >> > On 2024/11/11 21:45, torikoshia wrote:
> >> >>> Thanks for adding the comment. It clearly states that REJECT_LIMIT
> >> >>> can be
> >> >>> a single-quoted string. However, it might also be helpful to mention
> >> >>> that
> >> >>> it can be provided as an int64 in the COPY command option. How about
> >> >>> updating it like this?
> >> >>>
> >> >>> ------------------------------------
> >> >>> REJECT_LIMIT can be specified in two ways: as an int64 for the COPY
> >> >>> command
> >> >>> option or as a single-quoted string for the foreign table option
> >> >>> using
> >> >>> file_fdw. Therefore this function needs to handle both formats.
> >> >>> ------------------------------------
> >> >>
> >> >> Thanks! it seems better.
> >> >>
> >> >>
> >> >> Attached v3 patch.
> >> >
> >> > Thanks for updating the patch! It looks like you forgot to attach it,
> >> > though.
> >>
> >> Oops, thanks for pointing it out.
> >> Here it is.
> >
> > I have one minor comment.
> >
> > + ereport(ERROR,
> > + (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
> > + errmsg("skipped more than REJECT_LIMIT (%lld) rows due to data
> > type incompatibility",
> > + (long long) cstate->opts.reject_limit)));
> >
> > Do we have to keep this message consistent with ones of COPY command?
> > With foreign data wrappers, it seems common that the option name is
> > passed in
> > lowercase, so is it better to use "skipped more than reject_limit ..."
> > rather
> > than using uppercase?
>
> Agreed.
> Attached v4 patch.

Thank you for updating the patch.

However, I noticed that file_fdw could output the following error using uppercase
when an invalid value is set to the reject_limit option,

ALTER FOREIGN TABLE agg_bad OPTIONS (ADD reject_limit '-1');
ERROR: REJECT_LIMIT (-1) must be greater than zero

as well as if reject_limit is set when on_error != 'ignore'.

ALTER FOREIGN TABLE agg_bad OPTIONS (ADD reject_limit '1'); -- ERROR
ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE

Also, in the regression tests, I found similar behaviours on existing options,
for example;

CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'xml'); -- ERROR
ERROR: COPY format "xml" not recognized
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', quote ':'); -- ERROR
ERROR: COPY QUOTE requires CSV mode
CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'text', escape ':'); -- ERROR
...

These messages may be unexpected for some users because the documentation of
fild_fdw does not explicitly describe that file_fdw uses COPY internally.
(I can find several wordings like "as COPY", though.)
However, since the current file_fdw already has such messages, I came to think
making the messages on "reject_limit" consistent with COPY is reasonable.
I mean, the previous version of the message seems fine.

As another comment, do we need to add validator test for on_error and
reject_limit as similar to other options?

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2024-11-13 14:24:02 Re: proposal: schema variables
Previous Message Jakub Wartak 2024-11-13 13:56:59 Re: pg_combinebackup --incremental