Re: Add on_error and log_verbosity options to file_fdw

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, sawada(dot)mshk(at)gmail(dot)com, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add on_error and log_verbosity options to file_fdw
Date: 2024-09-30 15:36:11
Message-ID: bbddc7a0-b71a-4d54-9560-c90a26a06c48@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2024/09/26 21:57, torikoshia wrote:
> Updated the patches.

Thanks for updating the patches! I’ve made some changes based on your work, which are attached.
Barring any objections, I'm thinking to push these patches.

For patches 0001 and 0003, I ran pgindent and updated the commit message.

Regarding patch 0002:

- I updated the regression test to run ANALYZE on the file_fdw foreign table
since the on_error option also affects the ANALYZE command. To ensure test
stability, the test now runs ANALYZE with log_verbosity = 'silent'.

- I removed the code that updated the count of skipped rows for
the pg_stat_progress_copy view. As far as I know, file_fdw doesn’t
currently support tracking pg_stat_progress_copy.tuples_processed.
Supporting only tuples_skipped seems inconsistent, so I suggest creating
a separate patch to extend file_fdw to track both tuples_processed and
tuples_skipped in this view.

- I refactored the for-loop handling on_error = 'ignore' in fileIterateForeignScan()
by replacing it with a goto statement for improved readability.

- I modified file_fdw to log a NOTICE message about skipped rows at the end of
ANALYZE if any rows are skipped due to the on_error = 'ignore' setting.

Regarding the "file contains XXX rows" message reported by the ANALYZE VERBOSE
command on the file_fdw foreign table, what number should be reflected in XXX,
especially when some rows are skipped due to on_error = 'ignore'?
Currently, the count only includes valid rows, excluding any skipped rows.
I haven't modified this code yet. Should we instead count all rows
(valid and erroneous) and report that total?

I noticed the code for reporting the number of skipped rows due to
on_error = 'ignore' appears in three places. I’m considering creating
a common function for this reporting to eliminate redundancy but haven’t
implemented it yet.

- I’ve updated the commit message and run pgindent.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment Content-Type Size
v6-0001-Add-log_verbosity-silent-support-to-COPY-command.patch text/plain 6.7 KB
v6-0002-file_fdw-Add-on_error-and-log_verbosity-options-t.patch text/plain 8.6 KB
v6-0003-Refactor-CopyFrom-in-copyfrom.c.patch text/plain 3.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-09-30 15:43:18 Re: ACL_MAINTAIN, Lack of comment content
Previous Message Tom Lane 2024-09-30 15:31:01 Re: pg_verifybackup: TAR format backup verification