From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
---|---|
To: | vignesh C <vignesh21(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Shinoda, Noriyoshi (SXD Japan FSIP)" <noriyoshi(dot)shinoda(at)hpe(dot)com>, Shubham Khanna <khannashubham1197(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Rajendra Kumar Dangwal <dangwalrajendra888(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com> |
Subject: | Re: Pgoutput not capturing the generated columns |
Date: | 2025-01-17 05:52:33 |
Message-ID: | CAHut+PvrumYk7DxW6yLnx9ju9P24rU00-xw+fYbQqGS_gq4Kng@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Vignesh.
Some review comments for patch v52-0003
======
1. GENERAL - change to use enum.
On Thu, Jan 16, 2025 at 7:47 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Wed, 15 Jan 2025 at 11:17, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > 2.
> > As suggested in more detail below, I think it would be better if you
> > can define a C code enum type for these potential values instead of
> > just using #define macros and char. I guess that will impact a lot of
> > the APIs.
>
> If we change it to enum, we will not be able to access
> PUBLISH_GENCOLS_NONE and PUBLISH_GENCOLS_STORED from describe.c files.
> Maybe that is the reason the macros were used in the case of
> pg_subscription.h also.
Hm. I am not sure. Can't you just define the enum inside the #ifdef
EXPOSE_TO_CLIENT_CODE? I saw some examples of this already (see
src/include/catalog/pg_cast.h)
e.g. I tried following, which compiles for me:
#ifdef EXPOSE_TO_CLIENT_CODE
typedef enum PublishGencolsType
{
/* Generated columns present should not be replicated. */
PUBLISH_GENCOLS_NONE = 'n',
/* Generated columns present should be replicated. */
PUBLISH_GENCOLS_STORED = 's',
} PublishGencolsType;
#endif /* EXPOSE_TO_CLIENT_CODE */
typedef struct Publication
{
Oid oid;
char *name;
bool alltables;
bool pubviaroot;
PublishGencolsType pubgencols_type;
PublicationActions pubactions;
} Publication;
======
doc/src/sgml/ref/create_publication.sgml
2.
<para>
Specifies whether the generated columns present in the tables
associated with the publication should be replicated.
- The default is <literal>false</literal>.
+ The default is <literal>none</literal> meaning the generated
+ columns present in the tables associated with publication will not be
+ replicated.
+ </para>
IMO it would be better to add another sentence before "The default
is..." to just spell out what the possible values are up-front. The
"The default is..." can also be preceded by a blank line.
SUGGESTION
Specifies whether the generated columns present in the tables
associated with the publication should be replicated. Possible values
are <literal>none</literal> and <literal>stored</literal>.
The default is <literal>none</literal> meaning...
======
src/backend/catalog/pg_publication.c
pub_form_cols_map:
3.
+ if (att->attgenerated)
+ {
+ if (att->attgenerated != ATTRIBUTE_GENERATED_STORED)
+ continue;
+ else if (include_gencols_type != PUBLISH_GENCOLS_STORED)
+ continue;
+ }
+
I find it easier to read without the 'else' keyword. Also, I think
these can be commented on.
SUGGESTION
/* We only support replication of STORED generated cols. */
if (att->attgenerated != ATTRIBUTE_GENERATED_STORED)
continue;
/* User hasn't requested to replicate STORED generated cols. */
if (include_gencols_type != PUBLISH_GENCOLS_STORED)
continue;
~~~
pg_get_publication_tables:
4.
- if (att->attisdropped || (att->attgenerated && !pub->pubgencols))
+ if (att->attisdropped)
continue;
+ if (att->attgenerated)
+ {
+ if (att->attgenerated != ATTRIBUTE_GENERATED_STORED)
+ continue;
+ else if (pub->pubgencols_type != PUBLISH_GENCOLS_STORED)
+ continue;
+ }
+
Ditto previous review comment #3 above.
======
src/backend/commands/publicationcmds.c
pub_contains_invalid_column:
5.
The function comment is still referring to "enabling
publish_generated_columns option" which kind of sounds like a boolean.
~~~
6.
/*
- * The publish_generated_columns option must be set to true if the
- * REPLICA IDENTITY contains any stored generated column.
+ * The publish_generated_columns option must be set to 's'(stored)
+ * if the REPLICA IDENTITY contains any stored generated column.
*/
- if (!pubgencols && att->attgenerated)
+ if (pubgencols_type == PUBLISH_GENCOLS_NONE && att->attgenerated)
The comment and the code seem a bit out of sync; the comment says must
be 's', so to match that the code ought to be checking !=
PUBLISH_GENCOLS_STORED.
~~~
defGetGeneratedColsOption:
7.
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s requires a \"none\" or \"stored\"",
+ def->defname));
+
Should that have the word "value"?
e.g. "%s requires a \"none\" or \"stored\" value
======
src/backend/replication/logical/proto.c
logicalrep_should_publish_column:
8.
+ return (att->attgenerated == ATTRIBUTE_GENERATED_STORED) ?
(include_gencols_type == PUBLISH_GENCOLS_STORED) : false;
The ternary is fine, but it might be neater to split it out so you can
comment on it
SUGGESTION
/* Stored generated columns are only published when the user sets
publish_generated_columns = stored. */
if (att->attgenerated == ATTRIBUTE_GENERATED_STORED)
return include_gencols_type == PUBLISH_GENCOLS_STORED;
return false;
======
src/backend/replication/pgoutput/pgoutput.c
typedef struct RelationSyncEntry:
9.
/*
- * This is set if the 'publish_generated_columns' parameter is true, and
- * the relation contains generated columns.
+ * This will be 's'(stored) if the relation contains generated columns and
+ * the 'publish_generated_columns' parameter is set to 's'(stored).
+ * Otherwise, it will be 'n'(none), indicating that no generated columns
+ * should be published.
*/
- bool include_gencols;
+ char include_gencols_type;
Is the comment strictly correct? What about overriding column lists
where the option is set to 'none'?
======
src/include/catalog/pg_publication.h
10.
- /* true if generated columns data should be published */
- bool pubgencols;
+ /*
+ * 's'stored) if generated column data should be published, 'n'(none) if
+ * it should not be published
+ */
+ char pubgencols_type;
Typo. Missing '('
Also better to put each value on a separate line, and explicitly
mention 'stored', etc.
SUGGESTION
/*
* 'n'(none) if generated column data should not be published.
* 's'(stored) if stored generated column data should be published.
*/
======
src/test/subscription/t/011_generated.pl
11.
# =============================================================================
# Exercise logical replication of a generated column to a subscriber side
# regular column. This is done both when the publication parameter
-# 'publish_generated_columns' is set to false (to confirm existing default
-# behavior), and is set to true (to confirm replication occurs).
+# 'publish_generated_columns' is set to 'none' (to confirm existing default
+# behavior), and is set to stored (to confirm replication occurs).
Why 'none' in quotes but stored not in quotes?
======
Kind Regards,
Peter Smith.
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | Bertrand Drouvot | 2025-01-17 06:06:35 | Re: per backend I/O statistics |
Previous Message | Tom Lane | 2025-01-17 05:43:45 | Re: leafhopper: Assert("outerstartsel <= outerendsel"), File: "costsize.c" |