Re: Logical Replication of sequences

From: vignesh C <vignesh21(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Logical Replication of sequences
Date: 2024-08-08 15:55:03
Message-ID: CALDaNm0Tj+zn1_RyxRQMxRBtBuO9zbipKJ4o0u0vXc97+pJFzQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 7 Aug 2024 at 10:27, shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Mon, Aug 5, 2024 at 10:26 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Thu, 1 Aug 2024 at 04:25, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > >
> > > Hi Vignesh,
> > >
> > > I noticed that when replicating sequences (using the latest patches
> > > 0730_2*) the subscriber-side checks the *existence* of the sequence,
> > > but apparently it is not checking other sequence attributes.
> > >
> > > For example, consider:
> > >
> > > Publisher: "CREATE SEQUENCE s1 START 1 INCREMENT 2;" should be a
> > > sequence of only odd numbers.
> > > Subscriber: "CREATE SEQUENCE s1 START 2 INCREMENT 2;" should be a
> > > sequence of only even numbers.
> > >
> > > Because the names match, currently the patch allows replication of the
> > > s1 sequence. I think that might lead to unexpected results on the
> > > subscriber. IMO it might be safer to report ERROR unless the sequences
> > > match properly (i.e. not just a name check).
> > >
> > > Below is a demonstration the problem:
> > >
> > > ==========
> > > Publisher:
> > > ==========
> > >
> > > (publisher sequence is odd numbers)
> > >
> > > test_pub=# create sequence s1 start 1 increment 2;
> > > CREATE SEQUENCE
> > > test_pub=# select * from nextval('s1');
> > > nextval
> > > ---------
> > > 1
> > > (1 row)
> > >
> > > test_pub=# select * from nextval('s1');
> > > nextval
> > > ---------
> > > 3
> > > (1 row)
> > >
> > > test_pub=# select * from nextval('s1');
> > > nextval
> > > ---------
> > > 5
> > > (1 row)
> > >
> > > test_pub=# CREATE PUBLICATION pub1 FOR ALL SEQUENCES;
> > > CREATE PUBLICATION
> > > test_pub=#
> > >
> > > ==========
> > > Subscriber:
> > > ==========
> > >
> > > (subscriber sequence is even numbers)
> > >
> > > test_sub=# create sequence s1 start 2 increment 2;
> > > CREATE SEQUENCE
> > > test_sub=# SELECT * FROM nextval('s1');
> > > nextval
> > > ---------
> > > 2
> > > (1 row)
> > >
> > > test_sub=# SELECT * FROM nextval('s1');
> > > nextval
> > > ---------
> > > 4
> > > (1 row)
> > >
> > > test_sub=# SELECT * FROM nextval('s1');
> > > nextval
> > > ---------
> > > 6
> > > (1 row)
> > >
> > > test_sub=# CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=test_pub'
> > > PUBLICATION pub1;
> > > 2024-08-01 08:43:04.198 AEST [24325] WARNING: subscriptions created
> > > by regression test cases should have names starting with "regress_"
> > > WARNING: subscriptions created by regression test cases should have
> > > names starting with "regress_"
> > > NOTICE: created replication slot "sub1" on publisher
> > > CREATE SUBSCRIPTION
> > > test_sub=# 2024-08-01 08:43:04.294 AEST [26240] LOG: logical
> > > replication apply worker for subscription "sub1" has started
> > > 2024-08-01 08:43:04.309 AEST [26244] LOG: logical replication
> > > sequence synchronization worker for subscription "sub1" has started
> > > 2024-08-01 08:43:04.323 AEST [26244] LOG: logical replication
> > > synchronization for subscription "sub1", sequence "s1" has finished
> > > 2024-08-01 08:43:04.323 AEST [26244] LOG: logical replication
> > > sequence synchronization worker for subscription "sub1" has finished
> > >
> > > (after the CREATE SUBSCRIPTION we are getting replicated odd values
> > > from the publisher, even though the subscriber side sequence was
> > > supposed to be even numbers)
> > >
> > > test_sub=# SELECT * FROM nextval('s1');
> > > nextval
> > > ---------
> > > 7
> > > (1 row)
> > >
> > > test_sub=# SELECT * FROM nextval('s1');
> > > nextval
> > > ---------
> > > 9
> > > (1 row)
> > >
> > > test_sub=# SELECT * FROM nextval('s1');
> > > nextval
> > > ---------
> > > 11
> > > (1 row)
> > >
> > > (Looking at the description you would expect odd values for this
> > > sequence to be impossible)
>
> I see that for such even sequences, user can still do 'setval' to a
> odd number and then nextval will keep on returning odd value.
>
> postgres=# SELECT nextval('s1');
> 6
>
> postgres=SELECT setval('s1', 43);
> 43
>
> postgres=# SELECT nextval('s1');
> 45
>
> > > test_sub=# \dS+ s1
> > > Sequence "public.s1"
> > > Type | Start | Minimum | Maximum | Increment | Cycles? | Cache
> > > --------+-------+---------+---------------------+-----------+---------+-------
> > > bigint | 2 | 1 | 9223372036854775807 | 2 | no | 1
> >
> > Even if we check the sequence definition during the CREATE
> > SUBSCRIPTION/ALTER SUBSCRIPTION ... REFRESH PUBLICATION or ALTER
> > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES commands, there's still
> > a chance that the sequence definition might change after the command
> > has been executed. Currently, there's no mechanism to lock a sequence,
> > and we also permit replication of table data even if the table
> > structures differ, such as mismatched data types like int and
> > smallint. I have modified it to log a warning to inform users that the
> > sequence options on the publisher and subscriber are not the same and
> > advise them to ensure that the sequence definitions are consistent
> > between both.
> > The v20240805 version patch attached at [1] has the changes for the same.
> > [1] - https://www.postgresql.org/message-id/CALDaNm1Y_ot-jFRfmtwDuwmFrgSSYHjVuy28RspSopTtwzXy8w%40mail.gmail.com
>
> The behavior for applying is no different from setval. Having said
> that, I agree that sequence definition can change even after the
> subscription creation, but earlier we were not syncing sequences and
> thus the value of a particular sequence was going to remain in the
> range/pattern defined by its attributes unless user sets it manually
> using setval. But now, it is being changed in the background without
> user's knowledge.
> The table case is different. In case of table replication, if we have
> CHECK constraint or say primary-key etc, then the value which violates
> these constraints will never be inserted to a table even during
> replication on sub. For sequences, parameters (MIN,MAX, START,
> INCREMENT) can be considered similar to check-constraints, the only
> difference is during apply, we are still overriding these and copying
> pub's value. May be such inconsistencies detection can be targeted
> later in next project. But for the time being, it will be good to add
> a 'caveat' section in doc mentioning all such cases. The scope of this
> project should be clearly documented.

I have added a Caveats section and mentioned it.
The changes for the same are available at v20240808 version attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1QQK_Pgx35LrJGuRxBzzYSO8rm1YGJF4w8hYc3Gm%2B5NQ%40mail.gmail.com

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-08-08 15:59:52 Re: [PATCH] Add crc32(text) & crc32(bytea)
Previous Message vignesh C 2024-08-08 15:52:15 Re: Logical Replication of sequences