From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | 'Melih Mutlu' <m(dot)melihmutlu(at)gmail(dot)com> |
Cc: | Peter Smith <smithpb2250(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, "Wei Wang (Fujitsu)" <wangw(dot)fnst(at)fujitsu(dot)com>, "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Subject: | RE: [PATCH] Reuse Workers and Replication Slots during Logical Replication |
Date: | 2023-07-06 09:47:40 |
Message-ID: | TYAPR01MB5866FE76BC31D928D94AB499F52CA@TYAPR01MB5866.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
Dear Melih,
> Thanks for the 0003 patch. But it did not work for me. Can you create
> a subscription successfully with patch 0003 applied?
> I get the following error: " ERROR: table copy could not start
> transaction on publisher: another command is already in progress".
You got the ERROR when all the patches (0001-0005) were applied, right?
I have focused only on 0001 and 0002, so I may have missed something.
If that is not the case, please attach the log file and the test script you used.
As you might know, the error is output when the worker executes walrcv_endstreaming()
before doing walrcv_startstreaming().
> I think streaming needs to be ended before moving to another table. So
> I changed the patch a little bit
Your modification does not seem correct. I applied only the first three patches (0001-0003) and
executed the attached script. Then I got the following error on the subscriber (attached as N2.log):
> ERROR: could not send end-of-streaming message to primary: no COPY in progress
IIUC the tablesync worker has already stopped streaming, even without your modification.
Please see process_syncing_tables_for_sync():
```
if (MyLogicalRepWorker->relstate == SUBREL_STATE_CATCHUP &&
    current_lsn >= MyLogicalRepWorker->relstate_lsn)
{
    TimeLineID  tli;
    char        syncslotname[NAMEDATALEN] = {0};
    char        originname[NAMEDATALEN] = {0};

    MyLogicalRepWorker->relstate = SUBREL_STATE_SYNCDONE;
    ...
    /*
     * End streaming so that LogRepWorkerWalRcvConn can be used to drop
     * the slot.
     */
    walrcv_endstreaming(LogRepWorkerWalRcvConn, &tli);
```
This means that the following change should not be in 0003; it belongs in 0005.
PSA fixed patches.
```
+    /*
+     * If it's already connected to the publisher, end streaming before using
+     * the same connection for another iteration
+     */
+    if (LogRepWorkerWalRcvConn != NULL)
+    {
+        TimeLineID  tli;
+
+        walrcv_endstreaming(LogRepWorkerWalRcvConn, &tli);
+    }
```
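To illustrate why a NULL check alone is not a sufficient guard in 0003, here is a toy model of the situation. It is only a sketch: ToyConn, ToyResult and all toy_* functions are hypothetical names made up for this mail, not PostgreSQL or libpqwalreceiver APIs, and the model assumes only that ending a stream that is not active fails with "no COPY in progress".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a walreceiver connection; not a PostgreSQL type. */
typedef struct ToyConn
{
    bool        streaming;      /* true between start and end of streaming */
} ToyConn;

typedef enum ToyResult
{
    TOY_OK,
    TOY_ERR_NO_COPY_IN_PROGRESS
} ToyResult;

/* Models starting streaming on the connection. */
static ToyResult
toy_start_streaming(ToyConn *conn)
{
    conn->streaming = true;
    return TOY_OK;
}

/* Models ending streaming; fails if no stream is active. */
static ToyResult
toy_end_streaming(ToyConn *conn)
{
    if (!conn->streaming)
        return TOY_ERR_NO_COPY_IN_PROGRESS;
    conn->streaming = false;
    return TOY_OK;
}

/* Models reaching SYNCDONE: streaming is ended here, as in the excerpt above. */
static ToyResult
toy_finish_table(ToyConn *conn)
{
    return toy_end_streaming(conn);
}

/* Models the quoted hunk: the guard only checks that a connection exists. */
static ToyResult
toy_prepare_next_table_null_check(ToyConn *conn)
{
    if (conn != NULL)
        return toy_end_streaming(conn);
    return TOY_OK;
}
```

In this model, calling toy_start_streaming(), then toy_finish_table(), then toy_prepare_next_table_null_check() on the same connection makes the last call fail, because the connection still exists but is no longer streaming.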
Besides, cfbot could not apply your patch set [1]. According to the log, the
bot tried to apply 0004 and 0005 first and got an error. IIUC you should assign
the same version number to all patches within the same mail, like v16-0001, v16-0002, ....
[1]: http://cfbot.cputube.org/patch_43_3784.log
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachment | Content-Type | Size |
---|---|---|
N2.log | application/octet-stream | 10.9 KB |
test.sh | application/octet-stream | 1.1 KB |
v16-0001-Refactor-to-split-Apply-and-Tablesync-Workers.patch | application/octet-stream | 21.4 KB |
v16-0002-Reuse-Tablesync-Workers.patch | application/octet-stream | 10.7 KB |
v16-0003-reuse-connection-when-tablesync-workers-change-t.patch | application/octet-stream | 7.3 KB |
v16-0004-Add-replication-protocol-cmd-to-create-a-snapsho.patch | application/octet-stream | 21.1 KB |
v16-0005-Reuse-Replication-Slot-and-Origin-in-Tablesync.patch | application/octet-stream | 55.1 KB |