Re: Add on_error and log_verbosity options to file_fdw

From: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, sawada(dot)mshk(at)gmail(dot)com
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add on_error and log_verbosity options to file_fdw
Date: 2024-08-08 07:36:02
Message-ID: 79a336ccebc70401625743686c41b81a@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2024-07-24 19:43, torikoshia wrote:
> On 2024-07-23 08:57, Michael Paquier wrote:
>> On Mon, Jul 22, 2024 at 03:07:46PM -0700, Masahiko Sawada wrote:
>>> I'm slightly concerned that users might not want to see the NOTICE
>>> message for every scan. Unlike COPY FROM, scanning a file via
>>> file_fdw
>>> could be frequent.
>
> Agreed.
>
>> Yeah, I also have some concerns regarding the noise that this could
>> produce if called on a foreign table on a regular basis. The verbose
>> mode is disabled by default so I don't see why we should not allow it
>> if the relation owner wants to show it.
>>
>> Perhaps we should first do a silence mode for log_verbosity to skip
>> the NOTICE produced at the end of the COPY FROM summarizing the whole?
>
> I like this idea.
> If there are no objections, I'm going to make a patch for this.

Attached patches.
0001 adds new option 'silent' to log_verbosity and 0002 adds on_error
and log_verbosity options to file_fdw.

> I'm going to continue developing the patch(e.g. add doc, measure
performance degradation) when people also think this feature is worth
adding.

Here is a quick performance test result.

I loaded 1,000,000 rows using pgbench_accounts to a file and counted the
number of rows using file_fdw on different conditions and compared the
execution times on my laptop.

The changed conditions are below:
- source code: HEAD/patch applied
- data: no error data/all row occur data conversion error at the 1st
column
- file_fdw options: on_error=stop/on_error=ignore

There seems no significant difference in performance between HEAD and
the patch applied with on_error option specified as either ignore/stop
when data has no error.
OTOH when all rows occur data conversion error, it is significantly
faster than other cases:

# HAED(e950fe58bd0)
## data:no error

=# create foreign table t1 (a int, b int, c int, t text) server f_fdw
options (filename 'pgb_ac', format 'csv');
=# select count(*) from t1;

1567.569 ms
1675.112 ms
1555.782 ms
1547.676 ms
1660.221 ms

# patch applied
## data:no error, on_error:stop

=# create foreign table t1 (a int, b int, c int, t text) server f_fdw
options (filename 'pgb_ac', format 'csv', on_error 'stop');
=# select count(*) from t1;

1580.656 ms
1623.784 ms
1596.947 ms
1652.307 ms
1613.607 ms

## data:no error, on_error:ignore

=# create foreign table t1 (a int, b int, c int, t text) server f_fdw
options (filename 'pgb_ac', format 'csv', on_error 'ignore');
=# select count(*) from t1;

1575.718 ms
1597.464 ms
1596.540 ms
1665.818 ms
1595.453 ms

#### data:all rows contain error, on_error:ignore

=# create foreign table t1 (a int, b int, c int, t text) server f_fdw
options (filename 'pgb_ac', format 'csv', on_error 'ignore');
=# select count(*) from t1;

914.537 ms
907.506 ms
912.768 ms
913.769 ms
914.327 ms

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachment Content-Type Size
v3-0001-Add-log_verbosity-to-silent.patch text/x-diff 5.4 KB
v3-0002-Add-on_error-and-log_verbosity-options-to-file_fd.patch text/x-diff 6.9 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message cca5507 2024-08-08 07:53:29 Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state
Previous Message Nazir Bilal Yavuz 2024-08-08 07:32:16 Using read stream in autoprewarm