Re: Logical Replication of sequences

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Logical Replication of sequences
Date: 2025-04-17 08:25:33
Message-ID: CAHut+Ps2LzJwPGB8i2_ViS9c9VxeAeqDqvH5R8E-M8HvWeNfAQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Vignesh,

Some review comments for patch v20250416-0005 (docs)

======
doc/src/sgml/catalogs.sgml

(52.55. pg_subscription_rel)

1.
<para>
State code:
<literal>i</literal> = initialize,
- <literal>d</literal> = data is being copied,
- <literal>f</literal> = finished table copy,
- <literal>s</literal> = synchronized,
+ <literal>d</literal> = data is being copied (not applicable
for sequences),
+ <literal>f</literal> = finished table copy (not applicable for
sequences),
+ <literal>s</literal> = synchronized (not applicable for sequences),
<literal>r</literal> = ready (normal replication)
</para></entry>

Would this be simpler if you used separate paragraphs for tables and sequences?

SUGGESTION
State codes for tables: i = initialize, d = data is being copied, f =
finished table copy, s = synchronized, d = data is being copied, f =
finished table copy, s = synchronized, r = ready (normal replication)

State codes for sequences: i = initialize, r = ready (normal replication)

======
doc/src/sgml/logical-replication.sgml

(29.1 Publication)

2.
- generated from a table or a group of tables, and might also be described as
- a change set or replication set. Each publication exists in only
one database.
+ generated from a table or a group of tables or current state of all
+ sequences, and might also be described as a change set or replication set.
+ Each publication exists in only one database.

/or current state/or the current state/

~~~

3.
<para>
Publications are different from schemas and do not affect how the table is
accessed. Each table can be added to multiple publications if needed.
- Publications may currently only contain tables and all tables in schema.
+ Publications may currently only contain sequences or tables/all
tables in schema.
Objects must be added explicitly, except when a publication is created for
- <literal>ALL TABLES</literal>.
+ <literal>ALL TABLES</literal> and <literal>ALL SEQUENCES</literal>.

I don't think you need to say "/all tables in schema" because here we
are talking about the *type* of objects that can be in the
publication. OTOH, the FOR TABLES IN SCHEMA should be in the next
sentence.

SUGGESTION
Publications may currently only contain tables or sequences. Objects
must be added explicitly, except when a publication is created using
FOR TABLES IN SCHEMA, or FOR ALL TABLES, or FOR ALL SEQUENCES.

~~~

(29.7.2. Refreshing Stale Sequences)

4.
+ <para>
+ To verify, compare the sequences values between the publisher and
+ subscriber, and if necessary, execute
+ <link linkend="sql-altersubscription-params-refresh-publication-sequences">
+ <command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION
SEQUENCES</command></link>.
+ </para>

/sequences values/sequence values/

Should we elaborate or give an example here, exactly how the user
should "compare the sequence values between the publisher and
subscriber".

~~~

(29.9. Restrictions)

5.
+ On the subscriber, a sequence will retain the last value it synchronized
+ from the publisher either during the initial
+ <command>CREATE SUBSCRIPTION</command> or
+ <command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES</command>.

You haven't mentioned ALTER SUBSCRIPTION ... REFRESH PUBLICATION, but
AFAIK that could also synchronize the latest values for any newly
added sequences. It seems a bit tedious to name all these commands all
the time, but I am not sure if there is a better way.

======
doc/src/sgml/ref/alter_subscription.sgml

6.
+ <para>
+ See <xref linkend="sequence-definition-mismatches"/> for
recommendations on how
+ to handle any warnings about sequence definition differences between
+ the publisher and the subscriber, which might occur when
+ <literal>copy_data = true</literal>.
+ </para>

I have a question about functionality: I understand we do not actually
"synchronize" sequence data at this time if the copy_data=false, but
OTOH, shouldn't we still be checking (and WARNING) if there are any
pub/sub sequences difference detected, regardless of the copy_data
bool value? Otherwise, I think all we are doing is deferring the
checking/warning until later (e.g. during REFRESH PUBLICATION
SEQUENCES). Isn't it is better to get the warning earlier so the user
can fix it earlier?

======
doc/src/sgml/ref/create_subscription.sgml

7.
+ <para>
+ See <xref linkend="sequence-definition-mismatches"/>
+ for recommendations on how to handle any warnings about sequence
+ definition differences between the publisher and the subscriber,
+ which might occur when <literal>copy_data = true</literal>.
+ </para>

ditto previous review comment #6.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message torikoshia 2025-04-17 09:34:57 Re: Align memory context level numbering in pg_log_backend_memory_contexts()
Previous Message Peter Smith 2025-04-17 08:21:34 Re: Logical Replication of sequences