From: | Sutou Kouhei <kou(at)clear-code(dot)com> |
---|---|
To: | zhjwpku(at)gmail(dot)com |
Cc: | nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Make COPY format extendable: Extract COPY TO format implementations |
Date: | 2023-12-07 05:04:58 |
Message-ID: | 20231207.140458.425537343057608813.kou@clear-code.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
In <CAEG8a3LSRhK601Bn50u71BgfNWm4q3kv-o-KEq=hrbyLbY_EsA(at)mail(dot)gmail(dot)com>
"Re: Make COPY format extendable: Extract COPY TO format implementations" on Wed, 6 Dec 2023 22:07:51 +0800,
Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
> Should we extract both *copy to* and *copy from* for the first step, in that
> case we can add the pg_copy_handler catalog smoothly later.
I don't object it (mixing TO/FROM changes to one patch) but
it may make review difficult. Is it acceptable?
FYI: I planed that I implement TO part, and then FROM part,
and then unify TO/FROM parts if needed. [1]
> Attached V4 adds 'extract copy from' and it passed the cirrus ci,
> please take a look.
Thanks. Here are my comments:
> + /*
> + * Error is relevant to a particular line.
> + *
> + * If line_buf still contains the correct line, print it.
> + */
> + if (cstate->line_buf_valid)
We need to fix the indentation.
> +CopyFromFormatBinaryStart(CopyFromState cstate, TupleDesc tupDesc)
> +{
> + FmgrInfo *in_functions;
> + Oid *typioparams;
> + Oid in_func_oid;
> + AttrNumber num_phys_attrs;
> +
> + /*
> + * Pick up the required catalog information for each attribute in the
> + * relation, including the input function, the element type (to pass to
> + * the input function), and info about defaults and constraints. (Which
> + * input function we use depends on text/binary format choice.)
> + */
> + num_phys_attrs = tupDesc->natts;
> + in_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo));
> + typioparams = (Oid *) palloc(num_phys_attrs * sizeof(Oid));
We need to update the comment because defaults and
constraints aren't picked up here.
> +CopyFromFormatTextStart(CopyFromState cstate, TupleDesc tupDesc)
...
> + /*
> + * Pick up the required catalog information for each attribute in the
> + * relation, including the input function, the element type (to pass to
> + * the input function), and info about defaults and constraints. (Which
> + * input function we use depends on text/binary format choice.)
> + */
> + in_functions = (FmgrInfo *) palloc(num_phys_attrs * sizeof(FmgrInfo));
> + typioparams = (Oid *) palloc(num_phys_attrs * sizeof(Oid));
ditto.
> @@ -1716,15 +1776,6 @@ BeginCopyFrom(ParseState *pstate,
> ReceiveCopyBinaryHeader(cstate);
> }
I think that this block should be moved to
CopyFromFormatBinaryStart() too. But we need to run it after
we setup inputs such as data_source_cb, pipe and filename...
+/* Routines for a COPY HANDLER implementation. */
+typedef struct CopyHandlerOps
+{
+ /* Called when COPY TO is started. This will send a header. */
+ void (*copy_to_start) (CopyToState cstate, TupleDesc tupDesc);
+
+ /* Copy one row for COPY TO. */
+ void (*copy_to_one_row) (CopyToState cstate, TupleTableSlot *slot);
+
+ /* Called when COPY TO is ended. This will send a trailer. */
+ void (*copy_to_end) (CopyToState cstate);
+
+ void (*copy_from_start) (CopyFromState cstate, TupleDesc tupDesc);
+ bool (*copy_from_next) (CopyFromState cstate, ExprContext *econtext,
+ Datum *values, bool *nulls);
+ void (*copy_from_error_callback) (CopyFromState cstate);
+ void (*copy_from_end) (CopyFromState cstate);
+} CopyHandlerOps;
It seems that "copy_" prefix is redundant. Should we use
"to_start" instead of "copy_to_start" and so on?
BTW, it seems that "COPY FROM (FORMAT json)" may not be implemented. [2]
We may need to care about NULL copy_from_* cases.
> I added a hook *copy_from_end* but this might be removed later if not used.
It may be useful to clean up resources for COPY FROM but the
patch doesn't call the copy_from_end. How about removing it
for now? We can add it and call it from EndCopyFrom() later?
Because it's not needed for now.
I think that we should focus on refactoring instead of
adding a new feature in this patch.
[1]: https://www.postgresql.org/message-id/20231204.153548.2126325458835528809.kou%40clear-code.com
[2]: https://www.postgresql.org/message-id/flat/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com
Thanks,
--
kou
From | Date | Subject | |
---|---|---|---|
Next Message | Dilip Kumar | 2023-12-07 05:11:01 | Re: logical decoding and replication of sequences, take 2 |
Previous Message | Amit Kapila | 2023-12-07 04:56:55 | Re: logical decoding and replication of sequences, take 2 |