Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, "Wei Wang (Fujitsu)" <wangw(dot)fnst(at)fujitsu(dot)com>, "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Date: 2023-07-20 02:41:17
Message-ID: CAHut+PvjZ9kC4dx_yeGdP-1oDTY+DYmjRbF8FaRBrdVoF0HT1A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Some review comments for patch v20-0002

======
src/backend/replication/logical/tablesync.c

1. finish_sync_worker
/*
* Exit routine for synchronization worker.
*
* If reuse_worker is false, the worker will not be reused and exit.
*/

~

IMO the "will not be reused" part doesn't need saying -- it is
self-evident from the fact "reuse_worker is false".

SUGGESTION
If reuse_worker is false, at the conclusion of this function the
worker process will exit.

~~~

2. finish_sync_worker

- StartTransactionCommand();
- ereport(LOG,
- (errmsg("logical replication table synchronization worker for
subscription \"%s\", table \"%s\" has finished",
- MySubscription->name,
- get_rel_name(MyLogicalRepWorker->relid))));
- CommitTransactionCommand();
-
/* Find the leader apply worker and signal it. */
logicalrep_worker_wakeup(MyLogicalRepWorker->subid, InvalidOid);

- /* Stop gracefully */
- proc_exit(0);
+ if (!reuse_worker)
+ {
+ StartTransactionCommand();
+ ereport(LOG,
+ (errmsg("logical replication table synchronization worker for
subscription \"%s\" has finished",
+ MySubscription->name)));
+ CommitTransactionCommand();
+
+ /* Stop gracefully */
+ proc_exit(0);
+ }

In the HEAD code the log message came *before* it signalled to the
apply leader. Won't it be better to keep the logic in that same order?

~~~

3. process_syncing_tables_for_sync

- finish_sync_worker();
+ /* Sync worker has completed synchronization of the current table. */
+ MyLogicalRepWorker->is_sync_completed = true;
+
+ ereport(LOG,
+ (errmsg("logical replication table synchronization worker for
subscription \"%s\", relation \"%s\" with relid %u has finished",
+ MySubscription->name,
+ get_rel_name(MyLogicalRepWorker->relid),
+ MyLogicalRepWorker->relid)));
+ CommitTransactionCommand();

IIUC it is only the " table synchronization" part that is finished
here; not the whole "table synchronization worker" (compared to
finish_sync_worker function), so maybe the word "worker" should not
be in this message.

~~~

4. TablesyncWorkerMain

+ if (MyLogicalRepWorker->is_sync_completed)
+ {
+ /* tablesync is done unless a table that needs syncning is found */
+ done = true;

SUGGESTION (Typo "syncning" and minor rewording.)
This tablesync worker is 'done' unless another table that needs
syncing is found.

~

5.
+ /* Found a table for next iteration */
+ finish_sync_worker(true);
+
+ StartTransactionCommand();
+ ereport(LOG,
+ (errmsg("logical replication worker for subscription \"%s\" will be
reused to sync table \"%s\" with relid %u.",
+ MySubscription->name,
+ get_rel_name(MyLogicalRepWorker->relid),
+ MyLogicalRepWorker->relid)));
+ CommitTransactionCommand();
+
+ done = false;
+ break;
+ }
+ LWLockRelease(LogicalRepWorkerLock);

5a.
IMO it seems better to put this ereport *inside* the
finish_sync_worker() function alongside the similar log for when the
worker is not reused.

~

5b.
Isn't there a missing call to that LWLockRelease, if the 'break' happens?

======
src/backend/replication/logical/worker.c

6. LogicalRepApplyLoop

Refer to [1] for my reply to a previous review comment

~~~

7. InitializeLogRepWorker

if (am_tablesync_worker())
ereport(LOG,
- (errmsg("logical replication worker for subscription \"%s\", table
\"%s\" has started",
+ (errmsg("logical replication worker for subscription \"%s\", table
\"%s\" with relid %u has started",
MySubscription->name,
- get_rel_name(MyLogicalRepWorker->relid))));
+ get_rel_name(MyLogicalRepWorker->relid),
+ MyLogicalRepWorker->relid)));

But this is certainly a tablesync worker so the message here should
say "logical replication table synchronization worker" like the HEAD
code used to do.

It seems this mistake was introduced in patch v20-0001.

======
src/include/replication/worker_internal.h

8.
Refer to [1] for my reply to a previous review comment

------
[1] Replies to previous 0002 comments --
https://www.postgresql.org/message-id/CAHut%2BPtiAtGJC52SGNdobOah5ctYDDhWWKd%3DuP%3DrkRgXzg5rdg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-07-20 03:07:14 Re: Use COPY for populating all pgbench tables
Previous Message Peter Smith 2023-07-20 02:32:06 Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication