Re: Make COPY format extendable: Extract COPY TO format implementations

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Sutou Kouhei <kou(at)clear-code(dot)com>
Cc: sawada(dot)mshk(at)gmail(dot)com, zhjwpku(at)gmail(dot)com, andrew(at)dunslane(dot)net, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2024-02-02 08:04:28
Message-ID: ZbyiDHIrxRgzYT99@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 02, 2024 at 04:33:19PM +0900, Sutou Kouhei wrote:
> Hi,
>
> In <ZbyJ60Fd7CHt4m0i(at)paquier(dot)xyz>
> "Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 2 Feb 2024 15:21:31 +0900,
> Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> > I have done a review of v10, see v11 attached which is still WIP, with
> > the patches for COPY TO and COPY FROM merged together. Note that I'm
> > thinking to merge them into a single commit.
>
> OK. I don't have a strong opinion for commit unit.
>
> > @@ -74,11 +75,11 @@ typedef struct CopyFormatOptions
> > bool convert_selectively; /* do selective binary conversion? */
> > CopyOnErrorChoice on_error; /* what to do when error happened */
> > List *convert_select; /* list of column names (can be NIL) */
> > + const CopyToRoutine *to_routine; /* callback routines for COPY TO */
> > } CopyFormatOptions;
> >
> > Adding the routines to the structure for the format options is in my
> > opinion incorrect. The elements of this structure are first processed
> > in the option deparsing path, and then we need to use the options to
> > guess which routines we need.
>
> This was discussed with Sawada-san a bit before. [1][2]
>
> [1] https://www.postgresql.org/message-id/flat/CAD21AoBmNiWwrspuedgAPgbAqsn7e7NoZYF6gNnYBf%2BgXEk9Mg%40mail.gmail.com#bfd19262d261c67058fdb8d64e6a723c
> [2] https://www.postgresql.org/message-id/flat/20240130.144531.1257430878438173740.kou%40clear-code.com#fc55392d77f400fc74e42686fe7e348a
>
> I kept the routines in CopyFormatOptions for custom option
> processing. But I should have not cared about it in this
> patch set because this patch set doesn't include custom
> option processing.

One idea I was considering is whether we should use a special value in
the "format" DefElem, say "custom:$my_custom_format" where it would be
possible to bypass the formay check when processing options and find
the routines after processing all the options. I'm not wedded to
that, but attaching the routines to the state data is IMO the correct
thing, because this has nothing to do with CopyFormatOptions.

> So I'm OK that we move the routines to
> Copy{From,To}StateData.

Okay.

>> copyapi.h needs more documentation, like what is expected for
>> extension developers when using these, what are the arguments, etc. I
>> have added what I had in mind for now.
>
> Thanks! I'm not good at writing documentation in English...

No worries.

> I'm OK with the approach. But how about adding the extra
> callbacks to Copy{From,To}StateData not
> Copy{From,To}Routines like CopyToStateData::data_dest_cb and
> CopyFromStateData::data_source_cb? They are only needed for
> "text" and "csv". So we don't need to add them to
> Copy{From,To}Routines to keep required callback minimum.

And set them in cstate while we are in the Start routine, right? Hmm.
Why not.. That would get rid of the multiples layers v11 has, which
is my pain point, and we have many fields in cstate that are already
used on a per-format basis.

> What is the better next action for us? Do you want to
> complete the WIP v11 patch set by yourself (and commit it)?
> Or should I take over it?

I was planning to work on that, but wanted to be sure how you felt
about the problem with text and csv first.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-02-02 08:10:55 Re: Synchronizing slots from primary to standby
Previous Message Masahiko Sawada 2024-02-02 07:53:37 Re: Improve eviction algorithm in ReorderBuffer