From: | Sutou Kouhei <kou(at)clear-code(dot)com> |
---|---|
To: | michael(at)paquier(dot)xyz |
Cc: | sawada(dot)mshk(at)gmail(dot)com, zhjwpku(at)gmail(dot)com, andrew(at)dunslane(dot)net, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Make COPY format extendable: Extract COPY TO format implementations |
Date: | 2024-01-31 05:11:22 |
Message-ID: | 20240131.141122.279551156957581322.kou@clear-code.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
In <Zbi1TwPfAvUpKqTd(at)paquier(dot)xyz>
"Re: Make COPY format extendable: Extract COPY TO format implementations" on Tue, 30 Jan 2024 17:37:35 +0900,
Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>> We use defel->location for an error message. (We don't need
>> to set location for the default "text" DefElem.)
>
> Yeah, but you should not need to have this error in the paths that set
> the callback routines in opts_out if the same validation happens a few
> lines before, in copy.c.
Ah, yes. defel->location is used in later patches. For
example, it's used when a COPY handler for the specified
FORMAT isn't found.
> I am really worried about the complexities
> this thread is getting into because we are trying to shape the
> callbacks in the most generic way possible based on *two* use cases.
> This is going to be a never-ending discussion. I'd rather get some
> simple basics, and then we can discuss if tweaking the callbacks is
> really necessary or not. Even after introducing the pg_proc lookups
> to get custom callbacks.
I understand your concern. Let's introduce minimal callbacks
as the first step. I think that we completed our design
discussion for this feature. We can choose minimal callbacks
based on the discussion.
> The custom options don't seem like an absolute strong
> requirement for the first shot with the callbacks or even the
> possibility to retrieve the callbacks from a function call. I mean,
> you could provide some control with SET commands and a few GUCs, at
> least, even if that would be strange. Manipulations with a list of
> DefElems is the intuitive way to have custom options at query level,
> but we also have to guess the set of callbacks from this list of
> DefElems coming from the query. You see my point, I am not sure
> if it would be the best thing to process twice the options, especially
> when it comes to decide if a DefElem should be valid or not depending
> on the callbacks used. Or we could use a kind of "special" DefElem
> where we could store a set of key:value fed to a callback :)
Interesting. Let's remove custom options support from the
initial minimal callbacks.
>> Can I prepare the v10 patch set as "the v7 patch set" + "the
>> removal of the "if" checks in NextCopyFromRawFields()"?
>> (+ reverting opts_out->{csv_mode,binary} changes in
>> ProcessCopyOptions().)
>
> Yep, if I got it that would make sense to me. If you can do that,
> that would help quite a bit. :)
I've prepared the v10 patch set. Could you try this?
Changes since the v7 patch set:
0001:
* Remove CopyToProcessOption() callback
* Remove CopyToGetFormat() callback
* Revert passing CopyToState to ProcessCopyOptions()
* Revert moving "opts_out->{csv_mode,binary} = true" to
ProcessCopyOptionFormatTo()
* Change to receive "const char *format" instead "DefElem *defel"
by ProcessCopyOptionFormatTo()
0002:
* Remove CopyFromProcessOption() callback
* Remove CopyFromGetFormat() callback
* Change to receive "const char *format" instead "DefElem
*defel" by ProcessCopyOptionFormatFrom()
* Remove "if (cstate->opts.csv_mode)" branches from
NextCopyFromRawFields()
FYI: Here are Copy{From,To}Routine in the v10 patch set. I
think that only Copy{From,To}OneRow are minimal callbacks
for the performance gain. But can we keep Copy{From,To}Start
and Copy{From,To}End for consistency? We can remove a few
{csv_mode,binary} conditions by Copy{From,To}{Start,End}. It
doesn't depend on the number of COPY target tuples. So they
will not affect performance.
/* Routines for a COPY FROM format implementation. */
typedef struct CopyFromRoutine
{
/*
* Called when COPY FROM is started. This will initialize something and
* receive a header.
*/
void (*CopyFromStart) (CopyFromState cstate, TupleDesc tupDesc);
/* Copy one row. It returns false if no more tuples. */
bool (*CopyFromOneRow) (CopyFromState cstate, ExprContext *econtext, Datum *values, bool *nulls);
/* Called when COPY FROM is ended. This will finalize something. */
void (*CopyFromEnd) (CopyFromState cstate);
} CopyFromRoutine;
/* Routines for a COPY TO format implementation. */
typedef struct CopyToRoutine
{
/* Called when COPY TO is started. This will send a header. */
void (*CopyToStart) (CopyToState cstate, TupleDesc tupDesc);
/* Copy one row for COPY TO. */
void (*CopyToOneRow) (CopyToState cstate, TupleTableSlot *slot);
/* Called when COPY TO is ended. This will send a trailer. */
void (*CopyToEnd) (CopyToState cstate);
} CopyToRoutine;
Thanks,
--
kou
Attachment | Content-Type | Size |
---|---|---|
v10-0001-Extract-COPY-TO-format-implementations.patch | text/x-patch | 19.6 KB |
v10-0002-Extract-COPY-FROM-format-implementations.patch | text/x-patch | 26.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Hayato Kuroda (Fujitsu) | 2024-01-31 05:18:27 | RE: Improve eviction algorithm in ReorderBuffer |
Previous Message | Peter Smith | 2024-01-31 05:10:16 | Re: Synchronizing slots from primary to standby |