Re: Make COPY format extendable: Extract COPY TO format implementations

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Sutou Kouhei <kou(at)clear-code(dot)com>
Cc: michael(at)paquier(dot)xyz, zhjwpku(at)gmail(dot)com, andrew(at)dunslane(dot)net, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2023-12-15 03:48:17
Message-ID: CAD21AoCwzGh6vY17qgGo58ev6ofhYN5MbSJiW3E5oHeAayTE0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 15, 2023 at 9:53 AM Sutou Kouhei <kou(at)clear-code(dot)com> wrote:
>
> Hi,
>
> In <CAD21AoCZv3cVU+NxR2s9J_dWvjrS350GFFr2vMgCH8wWxQ5hTQ(at)mail(dot)gmail(dot)com>
> "Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 15 Dec 2023 05:19:43 +0900,
> Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> > To avoid collisions, extensions can be created in a
> > different schema than public.
>
> Thanks. I didn't notice it.
>
> > And note that built-in format copy handler doesn't need to
> > declare its handler function.
>
> Right. I know it.
>
> > Adding a prefix or suffix would be one option but to give extensions
> > more flexibility, another option would be to support format = 'custom'
> > and add the "handler" option to specify a copy handler function name
> > to call. For example, COPY ... FROM ... WITH (FORMAT = 'custom',
> > HANDLER = 'arrow_copy_handler').
>
> Interesting. If we use this option, users can choose an COPY
> FORMAT implementation they like from multiple
> implementations. For example, a developer may implement a
> COPY FROM FORMAT = 'json' handler with PostgreSQL's JSON
> related API and another developer may implement a handler
> with simdjson[1] which is a fast JSON parser. Users can
> choose whichever they like.
>
> But specifying HANDLER = '...' explicitly is a bit
> inconvenient. Because only one handler will be installed in
> most use cases. In the case, users don't need to choose one
> handler.
>
> If we choose this option, it may be better that we also
> provide a mechanism that can work without HANDLER. Searching
> a function by name like tablesample method does is an option.

Agreed. We can search the function by format name by default and the
user can optionally specify the handler function name in case where
the names of the installed custom copy handler collide. Probably the
handler option stuff could be a follow-up patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Sutou Kouhei 2023-12-15 04:45:31 Re: Make COPY format extendable: Extract COPY TO format implementations
Previous Message Amit Kapila 2023-12-15 03:36:50 Re: Improve eviction algorithm in ReorderBuffer