Re: Logical Replication of sequences

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Logical Replication of sequences
Date: 2024-08-07 04:57:09
Message-ID: CAJpy0uBDdN3302-FSYLNuGn=nsh3+W4dmLhp6RHnegFLkbmzig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 5, 2024 at 10:26 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Thu, 1 Aug 2024 at 04:25, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi Vignesh,
> >
> > I noticed that when replicating sequences (using the latest patches
> > 0730_2*) the subscriber-side checks the *existence* of the sequence,
> > but apparently it is not checking other sequence attributes.
> >
> > For example, consider:
> >
> > Publisher: "CREATE SEQUENCE s1 START 1 INCREMENT 2;" should be a
> > sequence of only odd numbers.
> > Subscriber: "CREATE SEQUENCE s1 START 2 INCREMENT 2;" should be a
> > sequence of only even numbers.
> >
> > Because the names match, currently the patch allows replication of the
> > s1 sequence. I think that might lead to unexpected results on the
> > subscriber. IMO it might be safer to report ERROR unless the sequences
> > match properly (i.e. not just a name check).
> >
> > Below is a demonstration the problem:
> >
> > ==========
> > Publisher:
> > ==========
> >
> > (publisher sequence is odd numbers)
> >
> > test_pub=# create sequence s1 start 1 increment 2;
> > CREATE SEQUENCE
> > test_pub=# select * from nextval('s1');
> > nextval
> > ---------
> > 1
> > (1 row)
> >
> > test_pub=# select * from nextval('s1');
> > nextval
> > ---------
> > 3
> > (1 row)
> >
> > test_pub=# select * from nextval('s1');
> > nextval
> > ---------
> > 5
> > (1 row)
> >
> > test_pub=# CREATE PUBLICATION pub1 FOR ALL SEQUENCES;
> > CREATE PUBLICATION
> > test_pub=#
> >
> > ==========
> > Subscriber:
> > ==========
> >
> > (subscriber sequence is even numbers)
> >
> > test_sub=# create sequence s1 start 2 increment 2;
> > CREATE SEQUENCE
> > test_sub=# SELECT * FROM nextval('s1');
> > nextval
> > ---------
> > 2
> > (1 row)
> >
> > test_sub=# SELECT * FROM nextval('s1');
> > nextval
> > ---------
> > 4
> > (1 row)
> >
> > test_sub=# SELECT * FROM nextval('s1');
> > nextval
> > ---------
> > 6
> > (1 row)
> >
> > test_sub=# CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=test_pub'
> > PUBLICATION pub1;
> > 2024-08-01 08:43:04.198 AEST [24325] WARNING: subscriptions created
> > by regression test cases should have names starting with "regress_"
> > WARNING: subscriptions created by regression test cases should have
> > names starting with "regress_"
> > NOTICE: created replication slot "sub1" on publisher
> > CREATE SUBSCRIPTION
> > test_sub=# 2024-08-01 08:43:04.294 AEST [26240] LOG: logical
> > replication apply worker for subscription "sub1" has started
> > 2024-08-01 08:43:04.309 AEST [26244] LOG: logical replication
> > sequence synchronization worker for subscription "sub1" has started
> > 2024-08-01 08:43:04.323 AEST [26244] LOG: logical replication
> > synchronization for subscription "sub1", sequence "s1" has finished
> > 2024-08-01 08:43:04.323 AEST [26244] LOG: logical replication
> > sequence synchronization worker for subscription "sub1" has finished
> >
> > (after the CREATE SUBSCRIPTION we are getting replicated odd values
> > from the publisher, even though the subscriber side sequence was
> > supposed to be even numbers)
> >
> > test_sub=# SELECT * FROM nextval('s1');
> > nextval
> > ---------
> > 7
> > (1 row)
> >
> > test_sub=# SELECT * FROM nextval('s1');
> > nextval
> > ---------
> > 9
> > (1 row)
> >
> > test_sub=# SELECT * FROM nextval('s1');
> > nextval
> > ---------
> > 11
> > (1 row)
> >
> > (Looking at the description you would expect odd values for this
> > sequence to be impossible)

I see that for such even sequences, user can still do 'setval' to a
odd number and then nextval will keep on returning odd value.

postgres=# SELECT nextval('s1');
6

postgres=SELECT setval('s1', 43);
43

postgres=# SELECT nextval('s1');
45

> > test_sub=# \dS+ s1
> > Sequence "public.s1"
> > Type | Start | Minimum | Maximum | Increment | Cycles? | Cache
> > --------+-------+---------+---------------------+-----------+---------+-------
> > bigint | 2 | 1 | 9223372036854775807 | 2 | no | 1
>
> Even if we check the sequence definition during the CREATE
> SUBSCRIPTION/ALTER SUBSCRIPTION ... REFRESH PUBLICATION or ALTER
> SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES commands, there's still
> a chance that the sequence definition might change after the command
> has been executed. Currently, there's no mechanism to lock a sequence,
> and we also permit replication of table data even if the table
> structures differ, such as mismatched data types like int and
> smallint. I have modified it to log a warning to inform users that the
> sequence options on the publisher and subscriber are not the same and
> advise them to ensure that the sequence definitions are consistent
> between both.
> The v20240805 version patch attached at [1] has the changes for the same.
> [1] - https://www.postgresql.org/message-id/CALDaNm1Y_ot-jFRfmtwDuwmFrgSSYHjVuy28RspSopTtwzXy8w%40mail.gmail.com

The behavior for applying is no different from setval. Having said
that, I agree that sequence definition can change even after the
subscription creation, but earlier we were not syncing sequences and
thus the value of a particular sequence was going to remain in the
range/pattern defined by its attributes unless user sets it manually
using setval. But now, it is being changed in the background without
user's knowledge.
The table case is different. In case of table replication, if we have
CHECK constraint or say primary-key etc, then the value which violates
these constraints will never be inserted to a table even during
replication on sub. For sequences, parameters (MIN,MAX, START,
INCREMENT) can be considered similar to check-constraints, the only
difference is during apply, we are still overriding these and copying
pub's value. May be such inconsistencies detection can be targeted
later in next project. But for the time being, it will be good to add
a 'caveat' section in doc mentioning all such cases. The scope of this
project should be clearly documented.

thanks
Shveta

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-08-07 05:23:36 Re: Conflict detection and logging in logical replication
Previous Message Peter Smith 2024-08-07 04:42:00 Re: Logical Replication of sequences