Re: Make COPY format extendable: Extract COPY TO format implementations

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Sutou Kouhei <kou(at)clear-code(dot)com>, zhjwpku(at)gmail(dot)com, andrew(at)dunslane(dot)net, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2023-12-22 01:23:28
Message-ID: CAD21AoDs9cOjuVbA_krGizAdc50KE+FjAuEXWF0NZwbMnc7F3Q@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 22, 2023 at 10:00 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Thu, Dec 21, 2023 at 06:35:04PM +0900, Sutou Kouhei wrote:
> > * If we just require "copy_to_${FORMAT}(internal)"
> > function and "copy_from_${FORMAT}(internal)" function,
> > we can remove the tricky approach. And it also avoids
> > name collisions with other handlers such as the tablesample
> > handler.
> > See also:
> > https://www.postgresql.org/message-id/flat/20231214.184414.2179134502876898942.kou%40clear-code.com#af71f364d0a9f5c144e45b447e5c16c9
>
> Hmm. I prefer the unique name approach for the COPY portions without
> enforcing any naming policy on the function names returning the
> handlers, actually, though I can see your point.

Yeah, another idea is to provide support functions to return a
CopyFormatRoutine wrapping either CopyToFormatRoutine or
CopyFromFormatRoutine. For example:

extern CopyFormatRoutine *MakeCopyToFormatRoutine(
    const CopyToFormatRoutine *routine);

Extensions can then do something like:

static const CopyToFormatRoutine testfmt_handler = {
    .type = T_CopyToFormatRoutine,
    .start_fn = testfmt_copyto_start,
    .onerow_fn = testfmt_copyto_onerow,
    .end_fn = testfmt_copyto_end
};

Datum
copy_testfmt_handler(PG_FUNCTION_ARGS)
{
    CopyFormatRoutine *routine = MakeCopyToFormatRoutine(&testfmt_handler);
    :
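
To make the wrapping idea a bit more concrete, here is a rough
standalone sketch of what such a support function could look like. It
is not PostgreSQL code: the struct layouts, the is_from and routine
fields, the callback signatures, the NodeTag values, and the use of
malloc() instead of a memory context are all assumptions made so that
the snippet compiles on its own.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in node tags; the real values would come from the node system. */
typedef enum NodeTag
{
    T_Invalid = 0,
    T_CopyFormatRoutine,
    T_CopyToFormatRoutine,
    T_CopyFromFormatRoutine
} NodeTag;

/* Hypothetical COPY TO callback table; signatures are simplified. */
typedef struct CopyToFormatRoutine
{
    NodeTag type;
    void    (*start_fn) (void *cstate);
    void    (*onerow_fn) (void *cstate, const char *row);
    void    (*end_fn) (void *cstate);
} CopyToFormatRoutine;

/* Hypothetical wrapper shared by COPY TO and COPY FROM handlers. */
typedef struct CopyFormatRoutine
{
    NodeTag     type;
    bool        is_from;    /* false: routine is a CopyToFormatRoutine */
    const void *routine;    /* points at the extension's const struct */
} CopyFormatRoutine;

/* Wrap a const COPY TO callback table; only the wrapper is allocated. */
CopyFormatRoutine *
MakeCopyToFormatRoutine(const CopyToFormatRoutine *routine)
{
    CopyFormatRoutine *wrapper = malloc(sizeof(CopyFormatRoutine));

    if (wrapper == NULL)
        return NULL;
    wrapper->type = T_CopyFormatRoutine;
    wrapper->is_from = false;
    wrapper->routine = routine;
    return wrapper;
}

/* Extension side: no-op callbacks plus the static const callback table. */
static void testfmt_copyto_start(void *cstate) { (void) cstate; }
static void testfmt_copyto_onerow(void *cstate, const char *row)
{
    (void) cstate;
    (void) row;
}
static void testfmt_copyto_end(void *cstate) { (void) cstate; }

static const CopyToFormatRoutine testfmt_handler = {
    .type = T_CopyToFormatRoutine,
    .start_fn = testfmt_copyto_start,
    .onerow_fn = testfmt_copyto_onerow,
    .end_fn = testfmt_copyto_end
};

/* Simplified handler body, without the fmgr calling convention. */
CopyFormatRoutine *
copy_testfmt_handler_sketch(void)
{
    return MakeCopyToFormatRoutine(&testfmt_handler);
}
```

With this shape the extension's callback table stays const, and the
core code can tell the direction from the wrapper without relying on
any naming convention for the handler function.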

>
> > 2. Need an opaque space like IndexScanDesc::opaque does
> >
> > * A custom COPY TO handler needs to keep its data
>
> Sounds useful to me to have a private area passed down to the
> callbacks.
>

+1
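
As a rough illustration of the private area, assuming the COPY state
simply carries a void pointer that the core code never looks at, a
handler could use it like this. CopyToStateSketch, TestFmtPrivate and
the callback names are hypothetical, and malloc/free stand in for
palloc/pfree so the snippet is self-contained.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical slice of the COPY TO state visible to callbacks. */
typedef struct CopyToStateSketch
{
    void *opaque;           /* owned and interpreted by the handler only */
} CopyToStateSketch;

/* Handler-private data kept across callbacks. */
typedef struct TestFmtPrivate
{
    int rows_written;
} TestFmtPrivate;

/* start callback: allocate the private area and hang it off the state. */
void
testfmt_opaque_start(CopyToStateSketch *cstate)
{
    cstate->opaque = calloc(1, sizeof(TestFmtPrivate));
}

/* per-row callback: update the private data. */
void
testfmt_opaque_onerow(CopyToStateSketch *cstate)
{
    TestFmtPrivate *priv = cstate->opaque;

    if (priv != NULL)
        priv->rows_written++;
}

/* end callback: report the row count and release the private area. */
int
testfmt_opaque_end(CopyToStateSketch *cstate)
{
    TestFmtPrivate *priv = cstate->opaque;
    int             rows = (priv != NULL) ? priv->rows_written : 0;

    free(priv);
    cstate->opaque = NULL;
    return rows;
}
```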

>
> > Questions:
> >
> > 1. What value should be used for "format" in
> > PgMsg_CopyOutResponse message?
> >
> > It's 1 for binary format and 0 for text/csv format.
> >
> > Should we make it customizable by custom COPY TO handler?
> > If so, what value should be used for this?
>
> Interesting point. It looks very tempting to give more flexibility to
> people who'd like to use their own code as we have one byte in the
> protocol but just use 0/1. Hence it feels natural to have a callback
> for that.

+1
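
For reference, a CopyOutResponse message is 'H', an Int32 length, an
Int8 overall format, an Int16 column count, and one Int16 format code
per column. A standalone sketch of a callback feeding the format byte
could look like the following. The callback type and function names are
hypothetical, and reusing the overall format for every per-column code
is an assumption; the protocol currently documents only 0 and 1 there.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: the handler reports its overall format code. */
typedef int8_t (*copy_out_format_fn) (void);

/* Example handler callback that reports the binary format code. */
int8_t
testfmt_copy_out_format(void)
{
    return 1;
}

/*
 * Serialize a CopyOutResponse into buf in network byte order.
 * Returns the number of bytes written, or 0 if natts is negative or
 * the buffer is too small.  A NULL callback falls back to 0 (text).
 */
size_t
build_copy_out_response(uint8_t *buf, size_t buflen,
                        copy_out_format_fn format_fn, int16_t natts)
{
    int8_t   format;
    uint16_t colfmt;
    uint32_t len;
    size_t   pos = 0;

    if (natts < 0)
        return 0;
    format = (format_fn != NULL) ? format_fn() : 0;
    colfmt = (uint16_t) (int16_t) format;

    /* The length counts itself but not the leading type byte. */
    len = 4 + 1 + 2 + 2 * (uint32_t) natts;
    if (buflen < (size_t) len + 1)
        return 0;

    buf[pos++] = 'H';
    buf[pos++] = (uint8_t) (len >> 24);
    buf[pos++] = (uint8_t) (len >> 16);
    buf[pos++] = (uint8_t) (len >> 8);
    buf[pos++] = (uint8_t) len;
    buf[pos++] = (uint8_t) format;
    buf[pos++] = (uint8_t) ((uint16_t) natts >> 8);
    buf[pos++] = (uint8_t) natts;
    for (int16_t i = 0; i < natts; i++)
    {
        buf[pos++] = (uint8_t) (colfmt >> 8);
        buf[pos++] = (uint8_t) colfmt;
    }
    return pos;
}
```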

>
> It also means that we may want to think harder about copy_is_binary in
> libpq in a future step. Now, having a backend implementation does
> not need any libpq bits, either, because a client stack may just want
> to speak the Postgres protocol directly. Perhaps a custom COPY
> implementation would be OK with how things are in libpq, as well,
> tweaking its way through with just text or binary.
>
> > 2. Do we need more tries for design discussion for the first
> > implementation? If we need, what should we try?
>
> A makeNode() is used with an allocation in the current memory context
> in the function returning the handler. I would have assumed that this
> stuff returns a handler as a const struct, like table AMs.

+1

The example I mentioned above does that.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
