Re: Make COPY format extendable: Extract COPY TO format implementations

From: Sutou Kouhei <kou(at)clear-code(dot)com>
To: sawada(dot)mshk(at)gmail(dot)com
Cc: zhjwpku(at)gmail(dot)com, michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2025-01-31 23:10:23
Message-ID: 20250201.081023.1138852014934985490.kou@clear-code.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

In <CAD21AoBpWFU4k-_bwrTq0AkFSAdwQqhAsSW188STmu9HxLJ0nQ(at)mail(dot)gmail(dot)com>
"Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 31 Jan 2025 14:25:34 -0800,
Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:

> I think that CopyToState and CopyFromState are not APIs but the
> execution states. I'm not against exposing CopyToState and
> CopyFromState. What I'd like to avoid is that we end up adding
> everything (including new fields we add in the future) related to copy
> operation to copyapi.h, leading to include copyapi.h into files that
> are not related to custom format api. fdwapi.h and tsmapi.h as
> examples have only a struct having a bunch of callbacks but not the
> execution state data such as SampScanState are not defined there.

Thanks for sharing examples. But it seems that
fdwapi.h/tsmapi.h (ForeignScanState/SampleScanSate) are not
good examples. It seems that PostgreSQL uses
nodes/execnodes.h for all *ScanState. It seems that the
sparation is not related to *api.h usage.

> My understanding is that we don't strictly prohibit _internal.h from
> being included in out of core files. For example, file_fdw.c includes
> copyfrom_internal.h in order to access some fields of CopyFromState.
>
> If the name with _internal.h is the problem, we can rename them to
> copyfrom.h and copyto.h. It makes sense to me that the code that needs
> to access the internal of the copy execution state include _internal.h
> header, though.

Thanks for sharing the file_fdw.c example. I'm OK with
_internal.h suffix because PostgreSQL doesn't prohibit
_internal.h usage by extensions as you mentioned.

>> > While we get the format routines for custom formats in
>> > ProcessCopyOptionFormat(), we do that for built-in formats in
>> > BeginCopyTo(), which seems odd to me. I think we can have
>> > CopyToGetRoutine() responsible for getting CopyToRoutine for built-in
>> > formats as well as custom format. The same is true for
>> > CopyFromRoutine.
>>
>> I like the current design because we don't need to export
>> CopyToGetBuiltinRoutine() (we can use static for
>> CopyToGetBuiltinRoutine()) but I applied your
>> suggestion. Because it's not a strong opinion.
>
> I meant that ProcessCopyOptionFormat() doesn't not necessarily get the
> routine. An idea is that in ProcessCopyOptionFormat() we just get the
> OID of the handler function, and then set up the format routine in
> BeginCopyTo(). I've attached a patch for this idea (applied on top of
> 0009).

Oh, sorry. I misunderstood your suggestion. I understand
what you suggested by the patch. Thanks.

If we use the approach, we can't show error position when a
custom COPY format handler function returns invalid routine
because DefElem for the "format" option isn't available in
BeginCopyTo(). Is it acceptable? If it's acceptable, let's
use the approach.

> Also, please check some regression test failures on cfbot.

Oh, sorry. I forgot to follow function name change in
0009. I attach the v30 patch set that fixes it in 0009.

Thanks,
--
kou

Attachment Content-Type Size
v30-0001-Refactor-COPY-TO-to-use-format-callback-function.patch text/x-patch 17.9 KB
v30-0002-Refactor-COPY-FROM-to-use-format-callback-functi.patch text/x-patch 32.5 KB
v30-0003-Add-support-for-adding-custom-COPY-TO-format.patch text/x-patch 21.0 KB
v30-0004-Export-CopyToStateData-as-private-data.patch text/x-patch 9.5 KB
v30-0005-Add-support-for-implementing-custom-COPY-TO-form.patch text/x-patch 2.5 KB
v30-0006-Add-support-for-adding-custom-COPY-FROM-format.patch text/x-patch 13.2 KB
v30-0007-Use-COPY_SOURCE_-prefix-for-CopySource-enum-valu.patch text/x-patch 3.5 KB
v30-0008-Add-support-for-implementing-custom-COPY-FROM-fo.patch text/x-patch 2.4 KB
v30-0009-Add-CopyFromSkipErrorRow-for-custom-COPY-format-.patch text/x-patch 10.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2025-01-31 23:21:54 Re: postgresql.conf.sample ordering for IO, worker related GUCs
Previous Message Masahiko Sawada 2025-01-31 22:25:34 Re: Make COPY format extendable: Extract COPY TO format implementations