Re: Logical Replication of sequences

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Logical Replication of sequences
Date: 2024-07-16 00:30:22
Message-ID: CAHut+PuFH1OCj-P1UKoRQE2X4-0zMG+N1V7jdn=tOQV4RNbAbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I was reading back through this thread to find out how the proposed new
command for refreshing sequences, came about. The patch 0705 introduces a
new command syntax for ALTER SUBSCRIPTION ... REFRESH SEQUENCES

So now there are 2 forms of subscription refresh.

#1. ALTER SUBSCRIPTION name REFRESH PUBLICATION [ WITH ( refresh_option [=
value] [, ... ] ) ]

#2. ALTER SUBSCRIPTION name REFRESH SEQUENCES

~~~~

IMO, that separation seems complicated. It leaves many questions like:
* It causes a bit of initial confusion. e.g. When I saw the REFRESH
SEQUENCES I first assumed that was needed because sequences were
not covered by the existing REFRESH PUBLICATION
* Why wasn't command #2 called ALTER SUBSCRIPTION REFRESH PUBLICATION
SEQUENCES? E.g. missing keyword PUBLICATION. It seems inconsistent.
* I expect sequence values can become stale pretty much immediately after
command #1, so the user will want to use command #2 anyway...
* ... but if command #2 also does add/remove changed sequences same as
command #1 then what benefit was there of having the command #1 for
sequences?
* There is a separation of sequences (from tables) in command #2 but there
is no separation currently possible in command #1. It seemed inconsistent.

~~~

IIUC some of the goals I saw in the thread are to:
* provide a way to fetch and refresh sequences that also keeps behaviors
(e.g. copy_data etc.) consistent with the refresh of subscription tables
* provide a way to fetch and refresh *only* sequences

I felt you could just enhance the existing refresh command syntax (command
#1), instead of introducing a new one it would be simpler and it would
still meet those same objectives.

Synopsis:
ALTER SUBSCRIPTION name REFRESH PUBLICATION [TABLES | SEQUENCES | ALL] [
WITH ( refresh_option [= value] [, ... ] ) ]

My only change is the introduction of the optional "[TABLES | SEQUENCES |
ALL]" clause.

I believe that can do everything your current patch does, plus more:
* Can refresh *only* TABLES if that is what you want (current patch 0705
cannot do this)
* Can refresh *only* SEQUENCES (same as current patch 0705 command #2)
* Has better integration with refresh options like "copy_data" (current
patch 0705 command #2 doesn't have options)
* Existing REFRESH PUBLICATION syntax still works as-is. You can decide
later what is PG18 default if the "[TABLES | SEQUENCES | ALL]" is omitted.

~~~

More examples using proposed syntax.

ex1.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION TABLES WITH (copy_data = false)
- same as PG17 functionality for ALTER SUBSCRIPTION sub REFRESH PUBLICATION
WITH (copy_data = false)

ex2.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION TABLES WITH (copy_data = true)
- same as PG17 functionality for ALTER SUBSCRIPTION sub REFRESH PUBLICATION
WITH (copy_data = true)

ex3.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION SEQUENCES WITH (copy data =
false)
- this adds/removes only sequences to pg_subscription_rel but doesn't
update their sequence values

ex4.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION SEQUENCES WITH (copy data = true)
- this adds/removes only sequences to pg_subscription_rel and also updates
all sequence values.
- this is equivalent behaviour of what your current 0705 patch is doing for
command #2, ALTER SUBSCRIPTION sub REFRESH SEQUENCES

ex5.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION ALL WITH (copy_data = false)
- this is equivalent behaviour of what your current 0705 patch is doing for
command #1, ALTER SUBSCRIPTION sub REFRESH PUBLICATION WITH (copy_data =
false)

ex6.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION ALL WITH (copy_data = true)
- this adds/removes tables and sequences and updates all table initial data
sequence values.- I think it is equivalent to your current 0705 patch doing
command #1 ALTER SUBSCRIPTION sub REFRESH PUBLICATION WITH (copy_data =
true), followed by another command #2 ALTER SUBSCRIPTION sub REFRESH
SEQUENCES

ex7.
ALTER SUBSCRIPTION sub REFRESH PUBLICATION SEQUENCES
- Because default copy_data is true you do not need to specify options, so
this is the same behaviour as your current 0705 patch command #2, ALTER
SUBSCRIPTION sub REFRESH SEQUENCES.

~~~

I hope this post was able to demonstrate that by enhancing the existing
command:
- it is less tricky to understand the separate command distinctions
- there is more functionality/flexibility possible
- there is better integration with the refresh options like copy_data
- behaviour for tables/sequences is more consistent

Anyway, it is just my opinion. Maybe there are some pitfalls I'm unaware of.

Thoughts?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2024-07-16 00:44:57 Duplicate unique key values in inheritance tables
Previous Message Thomas Munro 2024-07-16 00:22:11 Re: tests fail on windows with default git settings