Re: logical decoding and replication of sequences, take 2

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: logical decoding and replication of sequences, take 2
Date: 2023-07-25 10:20:19
Message-ID: CAA4eK1KqFtjUnLkY7-qGDnGqBYSdpnCw3Xo4HjRw=CvqkemBfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 24, 2023 at 4:22 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 7/24/23 12:40, Amit Kapila wrote:
> > On Wed, Jul 5, 2023 at 8:21 PM Ashutosh Bapat
> > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >>
> >> 0005, 0006 and 0007 are all related to the initial sequence sync. [3]
> >> resulted in 0007 and I think we need it. That leaves 0005 and 0006 to
> >> be reviewed in this response.
> >>
> >> I followed the discussion starting [1] till [2]. The second one
> >> mentions the interlock mechanism which has been implemented in 0005
> >> and 0006. While I don't have an objection to allowing LOCKing a
> >> sequence using the LOCK command, I am not sure whether it will
> >> actually work or is even needed.
> >>
> >> The problem described in [1] seems to be the same as the problem
> >> described in [2]. In both cases we see the sequence moving backwards
> >> during CATCHUP. At the end of catchup the sequence is in the right
> >> state in both the cases.
> >>
> >
> > I think we could see backward sequence value even after the catchup
> > phase (after the sync worker is exited and or the state of rel is
> > marked as 'ready' in pg_subscription_rel). The point is that there is
> > no guarantee that we will process all the pending WAL before
> > considering the sequence state is 'SYNCDONE' and or 'READY'. For
> > example, after copy_sequence, I see values like:
> >
> > postgres=# select * from s;
> > last_value | log_cnt | is_called
> > ------------+---------+-----------
> > 165 | 0 | t
> > (1 row)
> > postgres=# select nextval('s');
> > nextval
> > ---------
> > 166
> > (1 row)
> > postgres=# select nextval('s');
> > nextval
> > ---------
> > 167
> > (1 row)
> > postgres=# select currval('s');
> > currval
> > ---------
> > 167
> > (1 row)
> >
> > Then during the catchup phase:
> > postgres=# select * from s;
> > last_value | log_cnt | is_called
> > ------------+---------+-----------
> > 33 | 0 | t
> > (1 row)
> > postgres=# select * from s;
> > last_value | log_cnt | is_called
> > ------------+---------+-----------
> > 66 | 0 | t
> > (1 row)
> >
> > postgres=# select * from pg_subscription_rel;
> > srsubid | srrelid | srsubstate | srsublsn
> > ---------+---------+------------+-----------
> > 16394 | 16390 | r | 0/16374E8
> > 16394 | 16393 | s | 0/1637700
> > (2 rows)
> >
> > postgres=# select * from pg_subscription_rel;
> > srsubid | srrelid | srsubstate | srsublsn
> > ---------+---------+------------+-----------
> > 16394 | 16390 | r | 0/16374E8
> > 16394 | 16393 | r | 0/1637700
> > (2 rows)
> >
> > Here Sequence relid id 16393. You can see sequence state is marked as ready.
> >
> > postgres=# select * from s;
> > last_value | log_cnt | is_called
> > ------------+---------+-----------
> > 66 | 0 | t
> > (1 row)
> >
> > Even after that, see below the value of the sequence is still not
> > caught up. Later, when the apply worker processes all the WAL, the
> > sequence state will be caught up.
> >
> > postgres=# select * from s;
> > last_value | log_cnt | is_called
> > ------------+---------+-----------
> > 165 | 0 | t
> > (1 row)
> >
> > So, there will be a window where the sequence won't be caught up for a
> > certain period of time and any usage of it (even after the sync is
> > finished) during that time could result in inconsistent behaviour.
> >
>
> I'm rather confused about which node these queries are executed on.
> Presumably some of it is on publisher, some on subscriber?
>

These are all on the subscriber.

> Can you create a reproducer (TAP test demonstrating this?) I guess it
> might require adding some sleeps to hit the right timing ...
>

I have used the debugger to reproduce this as it needs quite some
coordination. I just wanted to see if the sequence can go backward and
didn't catch up completely before the sequence state is marked
'ready'. On the publisher side, I created a publication with a table
and a sequence. Then did the following steps:
SELECT nextval('s') FROM generate_series(1,50);
insert into t1 values(1);
SELECT nextval('s') FROM generate_series(51,150);

Then on the subscriber side with some debugging aid, I could find the
values in the sequence shown in the previous email. Sorry, I haven't
recorded each and every step but, if you think it helps, I can again
try to reproduce it and share the steps.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-07-25 10:47:51 Re: logicalrep_message_type throws an error
Previous Message Aleksander Alekseev 2023-07-25 10:16:11 Re: POC: Extension for adding distributed tracing - pg_tracing