Quick Links

Re: Logical Replication of sequences

From:	Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To:	vignesh C <vignesh21(at)gmail(dot)com>
Cc:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Hou, Zhijie/侯志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, "Katz, Jonathan" <jkatz(at)amazon(dot)com>
Subject:	Re: Logical Replication of sequences
Date:	2024-06-12 05:21:29
Message-ID:	CAFiTN-vg6JBdu93T74o8F3n_hHdaj+-YdD4jRbOCvybF_RwwRg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Jun 11, 2024 at 4:06 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> Amit and I engaged in an offline discussion regarding the design and
> contemplated that it could be like below:

If I understand correctly, does this require the sequences to already
exist on the subscribing node before creating the subscription, or
will it also copy any non-existing sequences?

> 1) CREATE PUBLICATION syntax enhancement:
> CREATE PUBLICATION ... FOR ALL SEQUENCES;
> The addition of a new column titled "all sequences" in the
> pg_publication system table will signify whether the publication is
> designated as all sequences publication or not.
>
> 2) CREATE SUBSCRIPTION -- no syntax change.
> Upon creation of a subscription, the following additional steps will
> be managed by the subscriber:
> i) The subscriber will retrieve the list of sequences associated with
> the subscription's publications.
> ii) For each sequence: a) Retrieve the sequence value from the
> publisher by invoking the pg_sequence_state function. b) Set the
> sequence with the value obtained from the publisher. iv) Once the
> subscription creation is completed, all sequence values will become
> visible at the subscriber's end.
>
> An alternative design approach could involve retrieving the sequence
> list from the publisher during subscription creation and inserting the
> sequences with an "init" state into the pg_subscription_rel system
> table. These tasks could be executed by a single sequence sync worker,
> which would:
> i) Retrieve the list of sequences in the "init" state from the
> pg_subscription_rel system table.
> ii) Initiate a transaction.
> iii) For each sequence: a) Obtain the sequence value from the
> publisher by utilizing the pg_sequence_state function. b) Update the
> sequence with the value obtained from the publisher.
> iv) Commit the transaction.
>
> The benefit with the second approach is that if there are large number
> of sequences, the sequence sync can be enhanced to happen in parallel
> and also if there are any locks held on the sequences in the
> publisher, the sequence worker can wait to acquire the lock instead of
> blocking the whole create subscription command which will delay the
> initial copy of the tables too.

Yeah w.r.t. this point second approach seems better.

> 3) Refreshing the sequence can be achieved through the existing
> command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> here).
> The subscriber identifies stale sequences, meaning sequences present
> in pg_subscription_rel but absent from the publication, and removes
> them from the pg_subscription_rel system table. The subscriber also
> checks for newly added sequences in the publisher and synchronizes
> their values from the publisher using the steps outlined in the
> subscription creation process. It's worth noting that previously
> synchronized sequences won't be synchronized again; the sequence sync
> will occur solely for the newly added sequences.

> 4) Introducing a new command for refreshing all sequences: ALTER
> SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> The subscriber will remove stale sequences and add newly added
> sequences from the publisher. Following this, it will re-synchronize
> the sequence values for all sequences in the updated list from the
> publisher, following the steps outlined in the subscription creation
> process.

Okay, this answers my first question: we will remove the sequences
that are removed from the publisher and add the new sequences. I don't
see any problem with this, but doesn't it seem like we are effectively
doing DDL replication only for sequences without having a
comprehensive plan for overall DDL replication?

> 5) Incorporate the pg_sequence_state function to fetch the sequence
> value from the publisher, along with the page LSN. Incorporate
> SetSequence function, which will procure a new relfilenode for the
> sequence and set the new relfilenode with the specified value. This
> will facilitate rollback in case of any failures.

I do not understand this point, you mean whenever we are fetching the
sequence value from the publisher we need to create a new relfilenode
on the subscriber? Why not just update the catalog tuple is
sufficient? Or this is for handling the ALTER SEQUENCE case?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Re: Logical Replication of sequences at 2024-06-11 10:36:06 from vignesh C

Responses

Re: Logical Replication of sequences at 2024-06-12 10:38:04 from vignesh C

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Fujii.Yuki@df.MitsubishiElectric.co.jp	2024-06-12 05:27:02	RE: Partial aggregates pushdown
Previous Message	Masahiko Sawada	2024-06-12 05:13:31	Re: Logical Replication of sequences