Re: Make COPY format extendable: Extract COPY TO format implementations

From: Junwang Zhao <zhjwpku(at)gmail(dot)com>
To: Sutou Kouhei <kou(at)clear-code(dot)com>
Cc: sawada(dot)mshk(at)gmail(dot)com, michael(at)paquier(dot)xyz, andrew(at)dunslane(dot)net, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2023-12-15 03:27:30
Message-ID: CAEG8a3JuShA6g19Nt_Ejk15BrNA6PmeCbK7p81izZi71muGq3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 15, 2023 at 8:53 AM Sutou Kouhei <kou(at)clear-code(dot)com> wrote:
>
> Hi,
>
> In <CAD21AoCZv3cVU+NxR2s9J_dWvjrS350GFFr2vMgCH8wWxQ5hTQ(at)mail(dot)gmail(dot)com>
> "Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 15 Dec 2023 05:19:43 +0900,
> Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> > To avoid collisions, extensions can be created in a
> > different schema than public.
>
> Thanks. I didn't notice it.
>
> > And note that built-in format copy handler doesn't need to
> > declare its handler function.
>
> Right. I know it.
>
> > Adding a prefix or suffix would be one option but to give extensions
> > more flexibility, another option would be to support format = 'custom'
> > and add the "handler" option to specify a copy handler function name
> > to call. For example, COPY ... FROM ... WITH (FORMAT = 'custom',
> > HANDLER = 'arrow_copy_handler').
>
I like the prefix/suffix idea, easy to implement. *custom* is not a FORMAT,
and user has to know the name of the specific handler names, not
intuitive.

> Interesting. If we use this option, users can choose an COPY
> FORMAT implementation they like from multiple
> implementations. For example, a developer may implement a
> COPY FROM FORMAT = 'json' handler with PostgreSQL's JSON
> related API and another developer may implement a handler
> with simdjson[1] which is a fast JSON parser. Users can
> choose whichever they like.
Not sure about this, why not move Json copy handler to contrib
as an example for others, any extensions share the same format
function name and just install one? No bound would implement
another CSV or TEXT copy handler IMHO.
>
> But specifying HANDLER = '...' explicitly is a bit
> inconvenient. Because only one handler will be installed in
> most use cases. In the case, users don't need to choose one
> handler.
>
> If we choose this option, it may be better that we also
> provide a mechanism that can work without HANDLER. Searching
> a function by name like tablesample method does is an option.
>
>
> [1]: https://github.com/simdjson/simdjson
>
>
> Thanks,
> --
> kou

--
Regards
Junwang Zhao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-12-15 03:36:50 Re: Improve eviction algorithm in ReorderBuffer
Previous Message Sutou Kouhei 2023-12-15 02:55:18 Re: Make COPY format extendable: Extract COPY TO format implementations