Re: Make COPY format extendable: Extract COPY TO format implementations

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Junwang Zhao <zhjwpku(at)gmail(dot)com>
Cc: Sutou Kouhei <kou(at)clear-code(dot)com>, andrew(at)dunslane(dot)net, michael(at)paquier(dot)xyz, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2024-01-29 03:21:48
Message-ID: CAD21AoDjrMWf4bmqGGt8py3xEs98JPUH0ePBD2LV5D4Ts5JpYQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 29, 2024 at 12:10 PM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
>
> On Mon, Jan 29, 2024 at 10:42 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Fri, Jan 26, 2024 at 6:02 PM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
> > >
> > > On Fri, Jan 26, 2024 at 4:55 PM Sutou Kouhei <kou(at)clear-code(dot)com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > In <CAEG8a3KhS6s1XQgDSvc8vFTb4GkhBmS8TxOoVSDPFX+MPExxxQ(at)mail(dot)gmail(dot)com>
> > > > "Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 26 Jan 2024 16:41:50 +0800,
> > > > Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
> > > >
> > > > > CopyToProcessOption()/CopyFromProcessOption() can only handle
> > > > > single option, and store the options in the opaque field, but it can not
> > > > > check the relation of two options, for example, considering json format,
> > > > > the `header` option can not be handled by these two functions.
> > > > >
> > > > > I want to find a way when the user specifies the header option, customer
> > > > > handler can error out.
> > > >
> > > > Ah, you want to use a built-in option (such as "header")
> > > > value from a custom handler, right? Hmm, it may be better
> > > > that we call CopyToProcessOption()/CopyFromProcessOption()
> > > > for all options including built-in options.
> > > >
> > > Hmm, still I don't think it can handle all cases, since we don't know
> > > the sequence of the options, we need all the options been parsed
> > > before we check the compatibility of the options, or customer
> > > handlers will need complicated logic to resolve that, which might
> > > lead to ugly code :(
> > >
> >
> > Does it make sense to pass only non-builtin options to the custom
> > format callback after parsing and evaluating the builtin options? That
> > is, we parse and evaluate only the builtin options and populate
> > opts_out first, then pass each rest option to the custom format
> > handler callback. The callback can refer to the builtin option values.
>
> Yeah, I think this makes sense.
>
> > The callback is expected to return false if the passed option is not
> > supported. If one of the builtin formats is specified and the rest
> > options list has at least one option, we raise "option %s not
> > recognized" error. IOW it's the core's responsibility to ranse the
> > "option %s not recognized" error, which is in order to raise a
> > consistent error message. Also, I think the core should check the
> > redundant options including bultiin and custom options.
>
> It would be good that core could check all the redundant options,
> but where should core do the book-keeping of all the options? I have
> no idea about this, in my implementation of pg_copy_json extension,
> I handle redundant options by adding a xxx_specified field for each
> xxx.

What I imagined is that while parsing the all specified options, we
evaluate builtin options and we add non-builtin options to another
list. Then when parsing a non-builtin option, we check if this option
already exists in the list. If there is, we raise the "option %s not
recognized" error.". Once we complete checking all options, we pass
each option in the list to the callback.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Will Mortensen 2024-01-29 03:28:05 Re: Exposing the lock manager's WaitForLockers() to SQL
Previous Message vignesh C 2024-01-29 03:19:53 Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500