From: | torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz>, sawada(dot)mshk(at)gmail(dot)com |
Cc: | Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Add on_error and log_verbosity options to file_fdw |
Date: | 2024-08-08 07:36:02 |
Message-ID: | 79a336ccebc70401625743686c41b81a@oss.nttdata.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2024-07-24 19:43, torikoshia wrote:
> On 2024-07-23 08:57, Michael Paquier wrote:
>> On Mon, Jul 22, 2024 at 03:07:46PM -0700, Masahiko Sawada wrote:
>>> I'm slightly concerned that users might not want to see the NOTICE
>>> message for every scan. Unlike COPY FROM, scanning a file via
>>> file_fdw
>>> could be frequent.
>
> Agreed.
>
>> Yeah, I also have some concerns regarding the noise that this could
>> produce if called on a foreign table on a regular basis. The verbose
>> mode is disabled by default so I don't see why we should not allow it
>> if the relation owner wants to show it.
>>
>> Perhaps we should first do a silence mode for log_verbosity to skip
>> the NOTICE produced at the end of the COPY FROM summarizing the whole?
>
> I like this idea.
> If there are no objections, I'm going to make a patch for this.
Attached patches.
0001 adds new option 'silent' to log_verbosity and 0002 adds on_error
and log_verbosity options to file_fdw.
> I'm going to continue developing the patch(e.g. add doc, measure
performance degradation) when people also think this feature is worth
adding.
Here is a quick performance test result.
I loaded 1,000,000 rows using pgbench_accounts to a file and counted the
number of rows using file_fdw on different conditions and compared the
execution times on my laptop.
The changed conditions are below:
- source code: HEAD/patch applied
- data: no error data/all row occur data conversion error at the 1st
column
- file_fdw options: on_error=stop/on_error=ignore
There seems no significant difference in performance between HEAD and
the patch applied with on_error option specified as either ignore/stop
when data has no error.
OTOH when all rows occur data conversion error, it is significantly
faster than other cases:
# HAED(e950fe58bd0)
## data:no error
=# create foreign table t1 (a int, b int, c int, t text) server f_fdw
options (filename 'pgb_ac', format 'csv');
=# select count(*) from t1;
1567.569 ms
1675.112 ms
1555.782 ms
1547.676 ms
1660.221 ms
# patch applied
## data:no error, on_error:stop
=# create foreign table t1 (a int, b int, c int, t text) server f_fdw
options (filename 'pgb_ac', format 'csv', on_error 'stop');
=# select count(*) from t1;
1580.656 ms
1623.784 ms
1596.947 ms
1652.307 ms
1613.607 ms
## data:no error, on_error:ignore
=# create foreign table t1 (a int, b int, c int, t text) server f_fdw
options (filename 'pgb_ac', format 'csv', on_error 'ignore');
=# select count(*) from t1;
1575.718 ms
1597.464 ms
1596.540 ms
1665.818 ms
1595.453 ms
#### data:all rows contain error, on_error:ignore
=# create foreign table t1 (a int, b int, c int, t text) server f_fdw
options (filename 'pgb_ac', format 'csv', on_error 'ignore');
=# select count(*) from t1;
914.537 ms
907.506 ms
912.768 ms
913.769 ms
914.327 ms
--
Regards,
--
Atsushi Torikoshi
NTT DATA Group Corporation
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Add-log_verbosity-to-silent.patch | text/x-diff | 5.4 KB |
v3-0002-Add-on_error-and-log_verbosity-options-to-file_fd.patch | text/x-diff | 6.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | cca5507 | 2024-08-08 07:53:29 | Re: Historic snapshot doesn't track txns committed in BUILDING_SNAPSHOT state |
Previous Message | Nazir Bilal Yavuz | 2024-08-08 07:32:16 | Using read stream in autoprewarm |