RE: [POC] Fast COPY FROM command for the table with foreign partitions

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Andrey Lepikhov' <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>, Ashutosh Bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: [POC] Fast COPY FROM command for the table with foreign partitions
Date: 2020-10-19 04:12:25
Message-ID: TYAPR01MB29902ABAE555355D9B90CC84FE1E0@TYAPR01MB2990.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Andrey-san,

Thank you for challenging an interesting feature. Below are my review comments.

(1)
- /* for use by copy.c when performing multi-inserts */
+ /*
+ * The following fields are currently only relevant to copy.c.
+ *
+ * True if okay to use multi-insert on this relation
+ */
+ bool ri_usesMultiInsert;
+
+ /* Buffer allocated to this relation when using multi-insert mode */
struct CopyMultiInsertBuffer *ri_CopyMultiInsertBuffer;
} ResultRelInfo;

It's better to place the new bool member next to an existing bool member, so that the structure doesn't get larger.

(2)
+ Assert(rri->ri_usesMultiInsert == false);

As the above assertion represents, I'm afraid the semantics of ExecRelationAllowsMultiInsert() and ResultRelInfo->ri_usesMultiInsert are unclear. In CopyFrom(), ri_usesMultiInsert is set by also considering the COPY-specific conditions:

+ if (!cstate->volatile_defexprs &&
+ !contain_volatile_functions(cstate->whereClause) &&
+ ExecRelationAllowsMultiInsert(target_resultRelInfo, NULL))
+ target_resultRelInfo->ri_usesMultiInsert = true;

On the other hand, in below ExecInitPartitionInfo(), ri_usesMultiInsert is set purely based on the relation's characteristics.

+ leaf_part_rri->ri_usesMultiInsert =
+ ExecRelationAllowsMultiInsert(leaf_part_rri, rootResultRelInfo);

In addition to these differences, I think it's a bit confusing that the function itself doesn't record the check result in ri_usesMultiInsert.

It's probably easy to understand to not add ri_usesMultiInsert, and the function just encapsulates the check logic based solely on the relation characteristics and returns the result. So, the argument is just one ResultRelInfo. The caller (e.g. COPY) combines the function result with other specific conditions.

(3)
+typedef void (*BeginForeignCopy_function) (ModifyTableState *mtstate,
+ ResultRelInfo *rinfo);
+
+typedef void (*EndForeignCopy_function) (EState *estate,
+ ResultRelInfo *rinfo);
+
+typedef void (*ExecForeignCopy_function) (ResultRelInfo *rinfo,
+ TupleTableSlot **slots,
+ int nslots);

To align with other function groups, it's better to place the functions in order of Begin, Exec, and End.

(4)
+ /* COPY a bulk of tuples into a foreign relation */
+ BeginForeignCopy_function BeginForeignCopy;
+ EndForeignCopy_function EndForeignCopy;
+ ExecForeignCopy_function ExecForeignCopy;

To align with the other functions' comment, the comment should be:
/* Support functions for COPY */

(5)
+<programlisting>
+TupleTableSlot *
+ExecForeignCopy(ResultRelInfo *rinfo,
+ TupleTableSlot **slots,
+ int nslots);
+</programlisting>
+
+ Copy a bulk of tuples into the foreign table.
+ <literal>estate</literal> is global execution state for the query.

The return type is void.

(6)
+ <literal>nslots</literal> cis a number of tuples in the <literal>slots</literal>

cis -> is

(7)
+ <para>
+ If the <function>ExecForeignCopy</function> pointer is set to
+ <literal>NULL</literal>, attempts to insert into the foreign table will fail
+ with an error message.
+ </para>

"attempts to insert into" should be "attempts to run COPY on", because it's used for COPY.
Furthermore, if ExecForeignCopy is NULL, COPY should use ExecForeignInsert() instead, right? Otherwise, existing FDWs would become unable to be used for COPY.

(8)
+ bool pipe = (filename == NULL) && (data_dest_cb == NULL);

The above pipe in BeginCopyTo() is changed to not match pipe in DoCopyTo(), which only refers to filename. Should pipe in DoCopyTo() also be changed? If no, the use of the same variable name for different conditions is confusing.

(9)
- * partitions will be done later.
+- * partitions will be done later.

This is an unintended addition of '-'?

(10)
- if (resultRelInfo->ri_FdwRoutine != NULL &&
- resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
- resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
- resultRelInfo);
+ if (target_resultRelInfo->ri_FdwRoutine != NULL)
+ {
+ if (target_resultRelInfo->ri_usesMultiInsert)
+ target_resultRelInfo->ri_FdwRoutine->BeginForeignCopy(mtstate,
+ resultRelInfo);
+ else if (target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
+ target_resultRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate,
+ resultRelInfo);
+ }

BeginForeignCopy() should be called if it's defined, because BeginForeignCopy() is an optional function.

(11)
+ oldcontext = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
+
+ table_multi_insert(resultRelInfo->ri_RelationDesc,
+ slots,

The extra empty line seems unintended.

(12)
@@ -585,7 +583,8 @@ CopySendEndOfRow(CopyState cstate)
(void) pq_putmessage('d', fe_msgbuf->data, fe_msgbuf->len);
break;
case COPY_CALLBACK:
- Assert(false); /* Not yet supported. */
+ CopySendChar(cstate, '\n');
+ cstate->data_dest_cb(fe_msgbuf->data, fe_msgbuf->len);

As in the COPY_FILENAME case, shouldn't the line terminator be sent only in text format, and be changed to \r\n on Windows? I'm asking this as I'm probably a bit confused about in what situation COPY_CALLBACK could be used. I thought the binary format and \r\n line terminator could be necessary depending on the FDW implementation.

(13)
@@ -1001,9 +1001,13 @@ ExecInitRoutingInfo(ModifyTableState *mtstate,
* If the partition is a foreign table, let the FDW init itself for
* routing tuples to the partition.
*/
- if (partRelInfo->ri_FdwRoutine != NULL &&
- partRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
- partRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, partRelInfo);
+ if (partRelInfo->ri_FdwRoutine != NULL)
+ {
+ if (partRelInfo->ri_usesMultiInsert)
+ partRelInfo->ri_FdwRoutine->BeginForeignCopy(mtstate, partRelInfo);
+ else if (partRelInfo->ri_FdwRoutine->BeginForeignInsert != NULL)
+ partRelInfo->ri_FdwRoutine->BeginForeignInsert(mtstate, partRelInfo);
+ }

BeginForeignCopy() should be called only if it's defined, because BeginForeignCopy() is an optional function.

(14)
@@ -1205,10 +1209,18 @@ ExecCleanupTupleRouting(ModifyTableState *mtstate,
ResultRelInfo *resultRelInfo = proute->partitions[i];

/* Allow any FDWs to shut down */
- if (resultRelInfo->ri_FdwRoutine != NULL &&
- resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL)
- resultRelInfo->ri_FdwRoutine->EndForeignInsert(mtstate->ps.state,
- resultRelInfo);
+ if (resultRelInfo->ri_FdwRoutine != NULL)
+ {
+ if (resultRelInfo->ri_usesMultiInsert)
+ {
+ Assert(resultRelInfo->ri_FdwRoutine->EndForeignCopy != NULL);
+ resultRelInfo->ri_FdwRoutine->EndForeignCopy(mtstate->ps.state,
+ resultRelInfo);
+ }
+ else if (resultRelInfo->ri_FdwRoutine->EndForeignInsert != NULL)
+ resultRelInfo->ri_FdwRoutine->EndForeignInsert(mtstate->ps.state,
+ resultRelInfo);
+ }

EndForeignCopy() is an optional function, isn't it? That is, it's called if it's defined.

(15)
+static void
+pgfdw_copy_dest_cb(void *buf, int len)
+{
+ PGconn *conn = copy_fmstate->conn;
+
+ if (PQputCopyData(conn, (char *) buf, len) <= 0)
+ {
+ PGresult *res = PQgetResult(conn);
+
+ pgfdw_report_error(ERROR, res, conn, true, copy_fmstate->query);
+ }
+}

The following page says "Use PQerrorMessage to retrieve details if the return value is -1." So, it's correct to not use PGresult here and pass NULL as the second argument to pgfdw_report_error().

https://www.postgresql.org/docs/devel/libpq-copy.html

(16)
+ for (i = 0; i < nslots; i++)
+ CopyOneRowTo(fmstate->cstate, slots[i]);
+
+ status = true;
+ }

I'm afraid it's not intuitive what "status is true" means. I think copy_data_sent or copy_send_success would be better for the variable name.

(17)
+ if (PQputCopyEnd(conn, status ? NULL : _("canceled by server")) <= 0 ||
+ PQflush(conn))
+ ereport(ERROR,
+ (errmsg("error returned by PQputCopyEnd: %s",
+ PQerrorMessage(conn))));

As the places that call PQsendQuery(), it seems preferrable to call pgfdw_report_error() here too.

Regards
Takayuki Tsunakawa

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ian Lawrence Barwick 2020-10-19 04:18:07 [patch] ENUM errdetail should mention bytes, not chars
Previous Message Tang, Haiying 2020-10-19 03:57:00 RE: Use of "long" in incremental sort code