Re: Conflict detection for update_deleted in logical replication

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Conflict detection for update_deleted in logical replication
Date: 2024-09-11 04:18:25
Message-ID: CAJpy0uAGvyspsjanr+Yyemc31moZ1i2EcxgsGwCJV6v-BNZVhQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 10, 2024 at 4:30 PM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Tuesday, September 10, 2024 5:56 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Tue, Sep 10, 2024 at 1:40 PM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com>
> > wrote:
> > >
> > > On Tuesday, September 10, 2024 2:45 PM shveta malik
> > > <shveta(dot)malik(at)gmail(dot)com> wrote:
> > > >
> > > > Thank you Hou-San for explaining the design. But to make it easier to
> > > > understand, would you be able to explain the sequence/timeline of the
> > > > *new* actions performed by the walsender and the apply processes for
> > > > the given example, along with the new feedback_slot config needed?
> > > >
> > > > Node A: (Procs: walsenderA, applyA)
> > > > T1: INSERT INTO t (id, value) VALUES (1,1); ts=10.00 AM
> > > > T2: DELETE FROM t WHERE id = 1; ts=10.02 AM
> > > >
> > > > Node B: (Procs: walsenderB, applyB)
> > > > T3: UPDATE t SET value = 2 WHERE id = 1; ts=10.01 AM
> > >
> > > Thanks for reviewing! Let me elaborate further on the example:
> > >
> > > On node A, feedback_slots should include the logical slot that is used to
> > > replicate changes from Node A to Node B. On node B, feedback_slots should
> > > include the logical slot that replicates changes from Node B to Node A.
> > >
> > > Assume the slot.xmin on Node A has been initialized to a valid
> > > number (740) before the following flow:
> > >
> > > Node A executed T1 - 10.00 AM
> > > T1 replicated and applied on Node B - 10.0001 AM
> > > Node B executed T3 - 10.01 AM
> > > Node A executed T2 (741) - 10.02 AM
> > > T2 replicated and applied on Node B (delete_missing) - 10.03 AM
> >
> > Not related to this feature, but do you mean delete_origin_differ here?
>
> Oh sorry, that was a mistake. I meant delete_origin_differ.
>
> >
> > > T3 replicated and applied on Node A (new action, detect
> > > update_deleted) - 10.04 AM
> > >
> > > (new action) Apply worker on Node B has confirmed that T2 has been
> > > applied locally and that the transactions before T2 (e.g., T3) have been
> > > replicated and applied to Node A (e.g. feedback_slot.confirmed_flush_lsn
> > > >= lsn of the locally replayed T2), thus it sends the new feedback
> > > message to Node A. - 10.05 AM
> > >
> > > (new action) Walsender on Node A received the message and would
> > > advance the slot.xmin. - 10.06 AM
> > >
> > > Then, after the slot.xmin is advanced to a number greater than 741,
> > > the VACUUM would be able to remove the dead tuple on Node A.
> > >
> >
> > Thanks for the example. Can you please review below and let me know if my
> > understanding is correct.
> >
> > 1)
> > In a bidirectional replication setup, the user has to create slots in such a
> > way that Node A's subscription's slot is Node B's feedback_slot and Node B's
> > subscription's slot is Node A's feedback_slot, and only then will this
> > feature work well. Is that correct to say?
>
> Yes, your understanding is correct.
>
> >
> > 2)
> > Now coming back to multiple feedback_slots in a subscription, is the below
> > correct:
> >
> > Say Node A has publications and subscriptions as follow:
> > ------------------
> > A_pub1
> >
> > A_sub1 (subscribing to B_pub1 with the default slot_name of A_sub1)
> > A_sub2 (subscribing to B_pub2 with the default slot_name of A_sub2)
> > A_sub3 (subscribing to B_pub3 with the default slot_name of A_sub3)
> >
> >
> > Say Node B has publications and subscriptions as follow:
> > ------------------
> > B_sub1 (subscribing to A_pub1 with the default slot_name of B_sub1)
> >
> > B_pub1
> > B_pub2
> > B_pub3
> >
> > Then what will be the feedback_slots configuration for all subscriptions of
> > A and B? Is the below correct:
> > ------------------
> > A_sub1, A_sub2, A_sub3: feedback_slots=B_sub1
> > B_sub1: feedback_slots=A_sub1, A_sub2, A_sub3
>
> Right. The above configurations are correct.

Okay. This configuration seems difficult to understand from the user's perspective.
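
Just to make sure I am visualizing it correctly, the above configuration
would be created roughly like this (a hypothetical sketch on my side: the
feedback_slots subscription option and its exact syntax are whatever the
patch proposes, and the connection strings are made up):

-- On Node A (each subscription names B's only slot as its feedback slot):
CREATE SUBSCRIPTION A_sub1 CONNECTION 'host=nodeB dbname=postgres'
    PUBLICATION B_pub1 WITH (feedback_slots = 'B_sub1');
CREATE SUBSCRIPTION A_sub2 CONNECTION 'host=nodeB dbname=postgres'
    PUBLICATION B_pub2 WITH (feedback_slots = 'B_sub1');
CREATE SUBSCRIPTION A_sub3 CONNECTION 'host=nodeB dbname=postgres'
    PUBLICATION B_pub3 WITH (feedback_slots = 'B_sub1');

-- On Node B (the single subscription names all three of A's slots):
CREATE SUBSCRIPTION B_sub1 CONNECTION 'host=nodeA dbname=postgres'
    PUBLICATION A_pub1 WITH (feedback_slots = 'A_sub1,A_sub2,A_sub3');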

> >
> > 3)
> > If the above is true, then do we have a way to make sure that the user has
> > given this configuration exactly in the above way? If users end up giving
> > feedback_slots as some random slot (say A_slot4, or an incomplete list), do
> > we validate that? (I have not looked at the code yet, just trying to
> > understand the design first.)
>
> The patch doesn't validate whether the feedback slots belong to the correct
> subscriptions on the remote server. It only validates that the slot is an
> existing, valid, logical slot. I think there are a few challenges in
> validating it further. E.g., we need a way to identify which server the slot
> is replicating changes to, which could be tricky as the slot currently
> doesn't have any info identifying the remote server. Besides, the slot could
> be temporarily inactive due to some subscriber-side error, in which case we
> cannot verify the subscription that uses it.

Okay, I understand the challenges here.
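
For my own reference, I take it this validation is roughly equivalent to the
following manual check on the server where the slot lives (this is only my
sketch, not the patch's actual code; pg_replication_slots is the existing
catalog view, and 'B_sub1' is the slot name from the example above):

-- The named feedback slot must exist as a valid logical slot;
-- wal_status = 'lost' would indicate an invalidated slot.
SELECT slot_name, slot_type, wal_status, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_name = 'B_sub1'
  AND slot_type = 'logical';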

> >
> > 4)
> > Now coming to this:
> >
> > > The apply worker will get the oldest
> > > confirmed flush LSN among the specified slots and send the LSN as a
> > > feedback message to the walsender.
> >
> > There will be one apply worker on B (due to B_sub1), so will it check the
> > confirmed_lsn of all the slots A_sub1, A_sub2, A_sub3? Won't it be
> > sufficient to check the confirmed_lsn of, say, slot A_sub1 alone, which has
> > subscribed to table 't' on which the delete has been performed? The rest of
> > the slots (A_sub2, A_sub3) might have subscribed to different tables?
>
> I think it's theoretically correct to only check A_sub1. We could document
> that the user can do this by identifying the tables that each subscription
> replicates, but it may not be user-friendly.
>

Sorry, I fail to understand how the user can identify the tables and set
feedback_slots accordingly. I thought feedback_slots was a one-time
configuration done when replication is set up (or when the setup changes in
the future); it cannot keep changing with each query. Or am I missing
something?

IMO, it is something that should be identified internally. Since the query is
on table 't', the feedback slot which is for 't' shall be used to check the
LSN. But on rethinking, this optimization may not be worth the effort; the
identification part could be tricky, so it might be okay to check all the
slots.

~~

Another question is about a 3-node setup. I couldn't figure out what the
feedback_slots setting would be when the setup is not bidirectional. Consider
the case where there are three nodes A, B, C. Node C subscribes to both Node A
and Node B. Node A and Node B are the ones doing the concurrent "update" and
"delete", which will both be replicated to Node C. In this case, what will be
the feedback_slots setting on Node C? We don't have any slots here which
replicate changes from Node C to Node A or from Node C to Node B. This is
case [3] in your first email ([1]).

[1]:
https://www.postgresql.org/message-id/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2%40OS0PR01MB5716.jpnprd01.prod.outlook.com

thanks
Shveta
