Re: Perform streaming logical transactions by background workers and parallel apply

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-07-26 09:56:07
Message-ID: CAHut+Ps-GR3v1NQcbVYzRsE_LLQL9F73Tv_A6sFTwry7+MOe7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some review comment for patch v19-0001:

======

1.1 Missing docs for protocol version

Since you bumped the logical replication protocol version for this
patch, now there is some missing documentation to describe this new
protocol version. e.g. See here [1]

======

1.2 doc/src/sgml/config.sgml

+ <para>
+ Maximum number of apply background workers per subscription. This
+ parameter controls the amount of parallelism of the streaming of
+ in-progress transactions when subscription parameter
+ <literal>streaming = parallel</literal>.
+ </para>

SUGGESTION
Maximum number of apply background workers per subscription. This
parameter controls the amount of parallelism for streaming of
in-progress transactions with subscription parameter
<literal>streaming = parallel</literal>.

======

1.3 src/sgml/protocol.sgml

@@ -6809,6 +6809,25 @@ psql "dbname=postgres replication=database" -c
"IDENTIFY_SYSTEM;"
</listitem>
</varlistentry>

+ <varlistentry>
+ <term>Int64 (XLogRecPtr)</term>
+ <listitem>
+ <para>
+ The LSN of the abort.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Int64 (TimestampTz)</term>
+ <listitem>
+ <para>
+ Abort timestamp of the transaction. The value is in number
+ of microseconds since PostgreSQL epoch (2000-01-01).
+ </para>
+ </listitem>
+ </varlistentry>

There are missing notes on these new fields. They both should says
something like "This field is available since protocol version 4."
(See similar examples on the same docs page)

======

1.4 src/backend/replication/logical/applybgworker.c - apply_bgworker_start

Previously [1] I wrote:
> The Assert that the entries in the free-list are FINISHED seems like
> unnecessary checking. IIUC, code is already doing the Assert that
> entries are FINISHED before allowing them into the free-list in the
> first place.

IMO this Assert just causes unnecessary doubts, but if you really want
to keep it then I think it belongs logically *above* the
list_delete_last.

~~~

1.5 src/backend/replication/logical/applybgworker.c - apply_bgworker_start

+ server_version = walrcv_server_version(LogRepWorkerWalRcvConn);
+ wstate->shared->server_version =
+ server_version >= 160000 ? LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM :
+ server_version >= 150000 ? LOGICALREP_PROTO_TWOPHASE_VERSION_NUM :
+ server_version >= 140000 ? LOGICALREP_PROTO_STREAM_VERSION_NUM :
+ LOGICALREP_PROTO_VERSION_NUM;

It makes no sense to assign a protocol version to a server_version.
Perhaps it is just a simple matter of a poorly named struct member.
e.g Maybe everything is OK if it said something like
wstate->shared->proto_version.

~~~

1.6 src/backend/replication/logical/applybgworker.c - LogicalApplyBgwLoop

+/* Apply Background Worker main loop */
+static void
+LogicalApplyBgwLoop(shm_mq_handle *mqh, volatile ApplyBgworkerShared *shared)

'shared' seems a very vague param name. Maybe can be 'bgw_shared' or
'parallel_shared' or something better?

~~~

1.7 src/backend/replication/logical/applybgworker.c - ApplyBgworkerMain

+/*
+ * Apply Background Worker entry point
+ */
+void
+ApplyBgworkerMain(Datum main_arg)
+{
+ volatile ApplyBgworkerShared *shared;

'shared' seems a very vague var name. Maybe can be 'bgw_shared' or
'parallel_shared' or something better?

~~~

1.8 src/backend/replication/logical/applybgworker.c - apply_bgworker_setup_dsm

+static void
+apply_bgworker_setup_dsm(ApplyBgworkerState *wstate)
+{
+ shm_toc_estimator e;
+ Size segsize;
+ dsm_segment *seg;
+ shm_toc *toc;
+ ApplyBgworkerShared *shared;
+ shm_mq *mq;

'shared' seems a very vague var name. Maybe can be 'bgw_shared' or
'parallel_shared' or something better?

~~~

1.9 src/backend/replication/logical/applybgworker.c - apply_bgworker_setup_dsm

+ server_version = walrcv_server_version(LogRepWorkerWalRcvConn);
+ shared->server_version =
+ server_version >= 160000 ? LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM :
+ server_version >= 150000 ? LOGICALREP_PROTO_TWOPHASE_VERSION_NUM :
+ server_version >= 140000 ? LOGICALREP_PROTO_STREAM_VERSION_NUM :
+ LOGICALREP_PROTO_VERSION_NUM;

Same as earlier review comment #1.5

======

1.10 src/backend/replication/logical/worker.c

@@ -22,8 +22,28 @@
* STREAMED TRANSACTIONS
* ---------------------
* Streamed transactions (large transactions exceeding a memory limit on the
- * upstream) are not applied immediately, but instead, the data is written
- * to temporary files and then applied at once when the final commit arrives.
+ * upstream) are applied using one of two approaches.
+ *
+ * 1) Separate background workers

"two approaches." --> "two approaches:"

~~~

1.11 src/backend/replication/logical/worker.c - apply_handle_stream_abort

+ /* Check whether the publisher sends abort_lsn and abort_time. */
+ if (am_apply_bgworker())
+ read_abort_lsn = MyParallelShared->server_version >=
+ LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM;

IMO makes no sense to compare a server version with a protocol
version. Same as review comment #1.5

======

1.12 src/include/replication/worker_internal.h

+typedef struct ApplyBgworkerShared
+{
+ slock_t mutex;
+
+ /* Status of apply background worker. */
+ ApplyBgworkerStatus status;
+
+ /* server version of publisher. */
+ uint32 server_version;
+
+ TransactionId stream_xid;
+ uint32 n; /* id of apply background worker */
+} ApplyBgworkerShared;

AFAICT you only ever used 'server_version' for storing the *protocol*
version, so really this member should be called something like
'proto_version'. Please see earlier review comment #1.5 and others.

------
[1] https://www.postgresql.org/message-id/CAHut%2BPvN7fwtUE%3DbidzrsOUXSt%2BJpnkJztZ-Jn5t86moofaZ6g%40mail.gmail.com
[2] https://www.postgresql.org/docs/devel/protocol-logical-replication.html

Kind Reagrds,
Peter Smith.
Fujitsu Australia.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2022-07-26 10:00:57 Re: SI-read predicate locks on materialized views
Previous Message 王伟 (学弈) 2022-07-26 09:51:07 回复:Re: PANIC: wrong buffer passed to visibilitymap_clear