From: | Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com> |
---|---|
To: | "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, shiy(dot)fnst(at)fujitsu(dot)com |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication |
Date: | 2023-02-01 12:07:25 |
Message-ID: | CAGPVpCT6wG=Eqd2Jc20yO9xNp4S3_XvDPinz2JbaLTtb7M5jHg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
wangw(dot)fnst(at)fujitsu(dot)com <wangw(dot)fnst(at)fujitsu(dot)com>, 31 Oca 2023 Sal, 13:40
tarihinde şunu yazdı:
> Sorry, I forgot to add the link to the email. Please refer to [1].
>
> [1] -
> https://www.postgresql.org/message-id/OSZPR01MB631013C833C98E826B3CFCB9FDC69%40OSZPR01MB6310.jpnprd01.prod.outlook.com
Thanks for pointing out this review. I somehow skipped that, sorry.
Please see attached patches.
shiy(dot)fnst(at)fujitsu(dot)com <shiy(dot)fnst(at)fujitsu(dot)com>, 17 Oca 2023 Sal, 10:46
tarihinde şunu yazdı:
> On Wed, Jan 11, 2023 4:31 PM Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com> wrote:
> 0001 patch
> ============
> 1. walsender.c
> + /* Create a tuple to send consisten WAL location */
>
> "consisten" should be "consistent" I think.
>
Done.
> 2. logical.c
> + if (need_full_snapshot)
> + {
> + LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
> +
> + SpinLockAcquire(&slot->mutex);
> + slot->effective_catalog_xmin = xmin_horizon;
> + slot->data.catalog_xmin = xmin_horizon;
> + slot->effective_xmin = xmin_horizon;
> + SpinLockRelease(&slot->mutex);
> +
> + xmin_horizon =
> GetOldestSafeDecodingTransactionId(!need_full_snapshot);
> + ReplicationSlotsComputeRequiredXmin(true);
> +
> + LWLockRelease(ProcArrayLock);
> + }
>
> It seems that we should first get the safe decoding xid, then inform the
> slot
> machinery about the new limit, right? Otherwise the limit will be
> InvalidTransactionId and that seems inconsistent with the comment.
>
You're right. Moved that call before assigning xmin_horizon.
> 3. doc/src/sgml/protocol.sgml
> + is used in the currenct transaction. This command is currently
> only supported
> + for logical replication.
> + slots.
>
> We don't need to put "slots" in a new line.
>
Done.
> 0002 patch
> ============
> 1.
> In pg_subscription_rel.h, I think the type of "srrelslotname" can be
> changed to
> NameData, see "subslotname" in pg_subscription.h.
>
> 2.
> + * Find the logical replication sync
> worker if exists store
> + * the slot number for dropping associated
> replication slots
> + * later.
>
> Should we add comma after "if exists"?
>
Done.
3.
> + PG_FINALLY();
> + {
> + pfree(cmd.data);
> + }
> + PG_END_TRY();
> + \
> + return tablelist;
> +}
>
> Do we need the backslash?
>
Removed it.
> 4.
> + /*
> + * Advance to the LSN got from walrcv_create_slot. This is WAL
> + * logged for the purpose of recovery. Locks are to prevent the
> + * replication origin from vanishing while advancing.
>
> "walrcv_create_slot" should be changed to
> "walrcv_create_slot/walrcv_slot_snapshot" I think.
Right, done.
>
>
5.
> + /* Replication drop might still exist. Try to drop
> */
> + replorigin_drop_by_name(originname, true, false);
>
> Should "Replication drop" be "Replication origin"?
>
Done.
> 6.
> I saw an assertion failure in the following case, could you please look
> into it?
> The backtrace is attached.
>
> -- pub
> CREATE TABLE tbl1 (a int, b text);
> CREATE TABLE tbl2 (a int primary key, b text);
> create publication pub for table tbl1, tbl2;
> insert into tbl1 values (1, 'a');
> insert into tbl1 values (1, 'a');
>
> -- sub
> CREATE TABLE tbl1 (a int primary key, b text);
> CREATE TABLE tbl2 (a int primary key, b text);
> create subscription sub connection 'dbname=postgres port=5432' publication
> pub;
>
> Subscriber log:
> 2023-01-17 14:47:10.054 CST [1980841] LOG: logical replication apply
> worker for subscription "sub" has started
> 2023-01-17 14:47:10.060 CST [1980843] LOG: logical replication table
> synchronization worker for subscription "sub", table "tbl1" has started
> 2023-01-17 14:47:10.070 CST [1980845] LOG: logical replication table
> synchronization worker for subscription "sub", table "tbl2" has started
> 2023-01-17 14:47:10.073 CST [1980843] ERROR: duplicate key value violates
> unique constraint "tbl1_pkey"
> 2023-01-17 14:47:10.073 CST [1980843] DETAIL: Key (a)=(1) already exists.
> 2023-01-17 14:47:10.073 CST [1980843] CONTEXT: COPY tbl1, line 2
> 2023-01-17 14:47:10.074 CST [1980821] LOG: background worker "logical
> replication worker" (PID 1980843) exited with exit code 1
> 2023-01-17 14:47:10.083 CST [1980845] LOG: logical replication table
> synchronization worker for subscription "sub", table "tbl2" has finished
> 2023-01-17 14:47:10.083 CST [1980845] LOG: logical replication table
> synchronization worker for subscription "sub" has moved to sync table
> "tbl1".
> TRAP: failed Assert("node != InvalidRepOriginId"), File: "origin.c", Line:
> 892, PID: 1980845
>
I'm not able to reproduce this yet. Will look into it further.
Thanks,
--
Melih Mutlu
Microsoft
Attachment | Content-Type | Size |
---|---|---|
v10-0002-Reuse-Logical-Replication-Background-worker.patch | application/octet-stream | 71.4 KB |
v7-0001-Add-replication-protocol-cmd-to-create-a-snapshot.patch | application/octet-stream | 20.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Melih Mutlu | 2023-02-01 12:12:19 | Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication |
Previous Message | Antonin Houska | 2023-02-01 12:02:07 | Re: Privileges on PUBLICATION |