Re: Logical Replication of sequences

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Logical Replication of sequences
Date: 2024-07-01 07:27:22
Message-ID: CAHut+Pvrk75vSDkaXJVmhhZuuqQSY98btWJV=BMZAnyTtKRB4g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some review comments for the patch v20240625-0002

======
Commit Message

1.
This commit enhances logical replication by enabling the inclusion of all
sequences in publications. This improvement facilitates seamless
synchronization of sequence data during operations such as
CREATE SUBSCRIPTION, REFRESH PUBLICATION, and REFRESH PUBLICATION SEQUENCES.

~

Isn't this description getting ahead of the functionality a bit? For
example, it talks about operations like REFRESH PUBLICATION SEQUENCES
but AFAIK that syntax does not exist just yet.

~~~

2.
The commit message should mention that you are only introducing new
syntax for "FOR ALL SEQUENCES" here, but syntax for "FOR SEQUENCE" is
being deferred to some later patch. Without such a note it is not
clear why the gram.y syntax and docs seemed only half done.

======
doc/src/sgml/ref/create_publication.sgml

3.
<varlistentry id="sql-createpublication-params-for-all-tables">
<term><literal>FOR ALL TABLES</literal></term>
+ <term><literal>FOR ALL SEQUENCES</literal></term>
<listitem>
<para>
- Marks the publication as one that replicates changes for all tables in
- the database, including tables created in the future.
+ Marks the publication as one that replicates changes for all tables or
+ sequences in the database, including tables created in the future.

It might be better here to keep descriptions for "ALL TABLES" and "ALL
SEQUENCES" separated, otherwise the wording does not quite seem
appropriate for sequences (e.g. where it says "including tables
created in the future").

~~~

NITPICK - missing spaces
NITPICK - removed Oxford commas since previously there were none

~~~

4.
+ If <literal>FOR TABLE</literal>, <literal>FOR ALL TABLES</literal>,
+ <literal>FOR ALL SEQUENCES</literal>,or <literal>FOR TABLES IN
SCHEMA</literal>
+ are not specified, then the publication starts out with an empty set of
+ tables. That is useful if tables or schemas are to be added later.

It seems like "FOR ALL SEQUENCES" is out of place since it is jammed
between other clauses referring to TABLES. Would it be better to
mention SEQUENCES last in the list?

~~~

5.
+ rights on the table. The <command>FOR ALL TABLES</command>,
+ <command>FOR ALL SEQUENCES</command>, and
<command>FOR TABLES IN SCHEMA</command> clauses require the invoking

ditto of #4 above.

======
src/backend/catalog/pg_publication.c

GetAllSequencesPublicationRelations:

NITPICK - typo /relation/relations/

======
src/backend/commands/publicationcmds.c

6.
+ foreach(lc, stmt->for_all_objects)
+ {
+ char *val = strVal(lfirst(lc));
+
+ if (strcmp(val, "tables") == 0)
+ for_all_tables = true;
+ else if (strcmp(val, "sequences") == 0)
+ for_all_sequences = true;
+ }

Consider the foreach_ptr macro to slightly simplify this code.
Actually, this whole logic seems cumbersome -- can’t the parser assign
flags automatically. Please see my more detailed comment #10 below
about this in gram.y

~~~

7.
/* FOR ALL TABLES requires superuser */
- if (stmt->for_all_tables && !superuser())
+ if (for_all_tables && !superuser())
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser to create FOR ALL TABLES publication")));

+ /* FOR ALL SEQUENCES requires superuser */
+ if (for_all_sequences && !superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to create FOR ALL SEQUENCES publication")));
+

The current code is easy to read, but I wonder if it should try harder
to share common code, or at least a common translatable message like
"must be superuser to create %s publication".

~~~

8.
- else
+
+ /*
+ * If the publication might have either tables or sequences (directly or
+ * through a schema), process that.
+ */
+ if (!for_all_tables || !for_all_sequences)

I did not understand why this code cannot just say "else" like before,
because the direct or through-schema syntax cannot be specified at the
same time as "FOR ALL ...", so why is the more complicated condition
necessary? Also, the similar code in AlterPublicationOptions() was not
changed to be like this.

======
src/backend/parser/gram.y

9. comment

*
* CREATE PUBLICATION FOR ALL TABLES [WITH options]
*
+ * CREATE PUBLICATION FOR ALL SEQUENCES [WITH options]
+ *
* CREATE PUBLICATION FOR pub_obj [, ...] [WITH options]

The comment is not quite correct because actually you are allowing
simultaneous FOR ALL TABLES, SEQUENCES. It should be more like:

CREATE PUBLICATION FOR ALL pub_obj_type [,...] [WITH options]

pub_obj_type is one of:
TABLES
SEQUENCES

~~~

10.
+pub_obj_type: TABLES
+ { $$ = (Node *) makeString("tables"); }
+ | SEQUENCES
+ { $$ = (Node *) makeString("sequences"); }
+ ;
+
+pub_obj_type_list: pub_obj_type
+ { $$ = list_make1($1); }
+ | pub_obj_type_list ',' pub_obj_type
+ { $$ = lappend($1, $3); }
+ ;

IIUC the only thing you need is a flag to say if FOR ALL TABLE is in
effect and another flag to say if FOR ALL SEQUENCES is in effect. So,
It seemed clunky to build up a temporary list of "tables" or
"sequences" strings here, which is subsequently scanned by
CreatePublication to be turned back into booleans.

Can't we just change the CreatePublicationStmt field to have:

A) a 'for_all_types' bitmask instead of a list:
0x0000 means FOR ALL is not specified
0x0001 means ALL TABLES
0x0010 means ALL SEQUENCES

Or, B) have 2 boolean fields ('for_all_tables' and 'for_all_sequences')

...where the gram.y code can be written to assign the flag/s values directly?

======
src/bin/pg_dump/pg_dump.c

11.
if (pubinfo->puballtables)
appendPQExpBufferStr(query, " FOR ALL TABLES");

+ if (pubinfo->puballsequences)
+ appendPQExpBufferStr(query, " FOR ALL SEQUENCES");
+

Hmm. Is that correct? It looks like a possible bug, because if both
flags are true it will give invalid syntax like "FOR ALL TABLES FOR
ALL SEQUENCES" instead of "FOR ALL TABLES, SEQUENCES"

======
src/bin/pg_dump/t/002_pg_dump.pl

12.
This could also try the test scenario of both FOR ALL being
simultaneously set ("FOR ALL TABLES, SEQUENCES") to check for bugs
like the suspected one in dump.c review comment #11 above.

======
src/bin/psql/describe.c

13.
+ if (pset.sversion >= 170000)
+ printfPQExpBuffer(&buf,
+ "SELECT pubname AS \"%s\",\n"
+ " pg_catalog.pg_get_userbyid(pubowner) AS \"%s\",\n"
+ " puballtables AS \"%s\",\n"
+ " puballsequences AS \"%s\",\n"
+ " pubinsert AS \"%s\",\n"
+ " pubupdate AS \"%s\",\n"
+ " pubdelete AS \"%s\"",
+ gettext_noop("Name"),
+ gettext_noop("Owner"),
+ gettext_noop("All tables"),
+ gettext_noop("All sequences"),
+ gettext_noop("Inserts"),
+ gettext_noop("Updates"),
+ gettext_noop("Deletes"));
+ else
+ printfPQExpBuffer(&buf,
+ "SELECT pubname AS \"%s\",\n"
+ " pg_catalog.pg_get_userbyid(pubowner) AS \"%s\",\n"
+ " puballtables AS \"%s\",\n"
+ " pubinsert AS \"%s\",\n"
+ " pubupdate AS \"%s\",\n"
+ " pubdelete AS \"%s\"",
+ gettext_noop("Name"),
+ gettext_noop("Owner"),
+ gettext_noop("All tables"),
+ gettext_noop("Inserts"),
+ gettext_noop("Updates"),
+ gettext_noop("Deletes"));
+

IMO this should be coded differently so that only the
"puballsequences" column is guarded by the (pset.sversion >= 170000),
and everything else is the same as before. This suggested way would
also be consistent with the existing code version checks (e.g. for
"pubtruncate" or for "pubviaroot").

~~~

NITPICK - Add blank lines
NITPICK - space in "ncols ++"

======
src/bin/psql/tab-complete.c

14.
Hmm. When I tried this, it didn't seem to be working properly.

For example "CREATE PUBLICATION pub1 FOR ALL" only completes with
"TABLES" but not "SEQUENCES".
For example "CREATE PUBLICATION pub1 FOR ALL SEQ" doesn't complete
"SEQUENCES" properly

======
src/include/catalog/pg_publication.h

NITPICK - move the extern to be adjacent to others like it.

======
src/include/nodes/parsenodes.h

15.
- bool for_all_tables; /* Special publication for all tables in db */
+ List *for_all_objects; /* Special publication for all objects in
+ * db */
} CreatePublicationStmt;

I felt this List logic is a bit strange. See my comment #10 in gram.y
for more details.

~~~

16.
- bool for_all_tables; /* Special publication for all tables in db */
+ List *for_all_objects; /* Special publication for all objects in
+ * db */

Ditto comment #15 in AlterPublicationStmt

======
src/test/regress/sql/publication.sql

17.
+CREATE SEQUENCE testpub_seq0;
+CREATE SEQUENCE pub_test.testpub_seq1;
+
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_forallsequences FOR ALL SEQUENCES;
+RESET client_min_messages;
+
+SELECT pubname, puballtables, puballsequences FROM pg_publication
WHERE pubname = 'testpub_forallsequences';
+\d+ pub_test.testpub_seq1

Should you also do "\d+ tespub_seq0" here? Otherwise what was the
point of defining the seq0 sequence being in this test?

~~~

18.
Maybe there are missing test cases for different syntax combinations like:

FOR ALL TABLES, SEQUENCES
FOR ALL SEQUENCES, TABLES

Note that the current list logic of this patch even considers my
following bogus statement syntax is OK.

test_pub=# CREATE PUBLICATION pub_silly FOR ALL TABLES, SEQUENCES,
TABLES, TABLES, TABLES, SEQUENCES;
CREATE PUBLICATION
test_pub=#

======
99.
Please also refer to the attached nitpicks patch which implements all
the cosmetic issues identified above as NITPICKS.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment Content-Type Size
PS_NITPICKS_20240701_SEQ_0002.txt text/plain 3.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2024-07-01 07:35:09 Re: [PATCH] Fix docs to use canonical links
Previous Message jian he 2024-07-01 07:10:00 Re: pgsql: Add more SQL/JSON constructor functions