From: | vignesh C <vignesh21(at)gmail(dot)com> |
---|---|
To: | Peter Smith <smithpb2250(at)gmail(dot)com> |
Cc: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Handle infinite recursion in logical replication setup |
Date: | 2022-04-08 07:30:04 |
Message-ID: | CALDaNm1ei=rRwCBKWtUu8b5OsS6FFcvaxg9h0oXcjgFn8GoZnQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Apr 5, 2022 at 6:21 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here are my comments for the latest patch v6-0001.
>
> (I will post my v6-0002 review comments separately)
>
> PATCH v6-0001 comments
> ======================
>
> 1.1 General - Option name
>
> I still feel like the option name is not ideal. Unfortunately, this is
> important because any name change would impact lots of these patch
> files and docs, struct members etc.
>
> It was originally called "local_only", but I thought that as a
> SUBSCRIPTION option that was confusing because "local" means local to
> what? Really it is local to the publisher, not local to the
> subscriber, so that name seemed misleading.
>
> Then I suggested "publish_local_only". Although that resolved the
> ambiguity problem, other people thought it seemed odd to have the
> "publish" prefix for a subscription-side option.
>
> So now it is changed again to "subscribe_local_only" -- It's getting
> better but still, it is implied that the "local" means local to the
> publisher except there is nothing in the option name really to convey
> that meaning. IMO we here all understand the meaning of this option
> mostly by familiarity with this discussion thread, but I think a user
> coming to this for the first time will still be confused by the name.
>
> Below are some other name ideas. Maybe they are not improvements, but
> it might help other people to come up with something better.
>
> subscribe_publocal_only = true/false
> origin = pub_only/any
> adjacent_data_only = true/false
> direct_subscriptions_only = true/false
> ...
>
>
> (FYI, the remainder of these review comments will assume the option is
> still called "subscribe_local_only")
Modified to local_only
> ~~~
>
> 1.2 General - inconsistent members and args
>
> IMO the struct members and args should also be named for close
> consistency with whatever the option name is.
>
> Currently the option is called "subscription_local_only". So I think
> the members/args would be better to be called "local_only" instead of
> "only_local".
Modified
> ~~~
>
> 1.3 Commit message - wrong option name
>
> The commit message refers to the option name as "publish_local_only"
> instead of the option name that is currently implemented.
Modified
> ~~~
>
> 1.4 Commit message - wording
>
> The wording seems a bit off. Below is suggested simpler wording which
> I AFAIK conveys the same information.
>
> BEFORE
> Add an option publish_local_only which will subscribe only to the locally
> generated data in the publisher node. If subscriber is created with this
> option, publisher will skip publishing the data that was subscribed
> from other nodes. It can be created using following syntax:
> ex: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=9999'
> PUBLICATION pub1 with (publish_local_only = on);
>
> SUGGESTION
> This patch adds a new SUBSCRIPTION boolean option
> "subscribe_local_only". The default is false. When a SUBSCRIPTION is
> created with this option enabled, the publisher will only publish data
> that originated at the publisher node.
> Usage:
> CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=9999'
> PUBLICATION pub1 with (subscribe_local_only = true);
Modified
> ~~~
>
> 1.5 doc/src/sgml/ref/create_subscription.sgml - "generated" changes.
>
> + <para>
> + Specifies whether the subscription will request the publisher to send
> + locally generated changes or both the locally generated changes and
> + the replicated changes that was generated from other nodes. The
> + default is <literal>false</literal>.
> + </para>
>
> For some reason, it seemed a bit strange to me to use the term
> "generated" changes. Maybe better to refer to the origin of changes?
>
> SUGGESTION
> Specifies whether the publisher should send only changes that
> originated locally at the publisher node, or send any publisher node
> changes regardless of their origin. The default is false.
Modified
> ~~~
>
> 1.6 src/backend/replication/pgoutput/pgoutput.c -
> LOGICALREP_PROTO_TWOPHASE_VERSION_NUM
>
> @@ -496,6 +509,12 @@ pgoutput_startup(LogicalDecodingContext *ctx,
> OutputPluginOptions *opt,
> else
> ctx->twophase_opt_given = true;
>
> + if (data->only_local && data->protocol_version <
> LOGICALREP_PROTO_TWOPHASE_VERSION_NUM)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("requested proto_version=%d does not support
> subscribe_local_only, need %d or higher",
> + data->protocol_version, LOGICALREP_PROTO_TWOPHASE_VERSION_NUM)));
>
> I thought this code should not be using
> LOGICALREP_PROTO_TWOPHASE_VERSION_NUM. Shouldn't there be some newly
> introduced constant like LOGICALREP_PROTO_LOCALONLY_VERSION_NUM which
> you will use here?
Modified, I have set LOGICALREP_PROTO_LOCALONLY_VERSION_NUM to same
value as LOGICALREP_PROTO_TWOPHASE_VERSION_NUM, will increment this
once server version is changed.
> ~~~
>
> 1.7 src/bin/pg_dump/pg_dump.c - 150000
>
> @@ -4451,11 +4452,13 @@ getSubscriptions(Archive *fout)
> if (fout->remoteVersion >= 150000)
> appendPQExpBufferStr(query,
> " s.subtwophasestate,\n"
> - " s.subdisableonerr\n");
> + " s.subdisableonerr,\n"
> + " s.sublocal\n");
> else
> appendPQExpBuffer(query,
> " '%c' AS subtwophasestate,\n"
> - " false AS subdisableonerr\n",
> + " false AS subdisableonerr,\n"
> + " false AS s.sublocal\n",
> LOGICALREP_TWOPHASE_STATE_DISABLED);
>
> I think this local_only feature is unlikely to get into the PG15
> release, so this code should be split out into a separate condition
> because later will need to change to say >= 160000.
I have split the condition and checked it with 150000, this will be
changed later to 160000 after the branch is created.
> ~~~
>
> 1.8 src/bin/pg_dump/pg_dump.c - dumpSubscription
>
> @@ -4585,6 +4591,9 @@ dumpSubscription(Archive *fout, const
> SubscriptionInfo *subinfo)
> if (strcmp(subinfo->subdisableonerr, "t") == 0)
> appendPQExpBufferStr(query, ", disable_on_error = true");
>
> + if (strcmp(subinfo->sublocal, "f") != 0)
> + appendPQExpBufferStr(query, ", subscribe_local_only = on");
> +
>
> I felt it is more natural to say "if it is true set to true", instead
> of "if it is not false set to on".
>
> SUGGESTION
> if (strcmp(subinfo->sublocal, "t") == 0)
> appendPQExpBufferStr(query, ", subscribe_local_only = true");
Modified
> ~~~
>
> 1.9 src/bin/psql/describe.c - 150000
>
> @@ -6318,9 +6318,11 @@ describeSubscriptions(const char *pattern, bool verbose)
> if (pset.sversion >= 150000)
> appendPQExpBuffer(&buf,
> ", subtwophasestate AS \"%s\"\n"
> - ", subdisableonerr AS \"%s\"\n",
> + ", subdisableonerr AS \"%s\"\n"
> + ", sublocal AS \"%s\"\n",
> gettext_noop("Two phase commit"),
> - gettext_noop("Disable on error"));
> + gettext_noop("Disable on error"),
> + gettext_noop("Only local"));
>
> I think this local_only feature is unlikely to get into the PG15
> release, so this code should be split out into a separate condition
> because later will need to change to say >= 160000.
I have split the condition and checked it with 150000, this will be
changed later to 160000 after the branch is created.
> ~~
>
> 1.10 src/bin/psql/describe.c - describeSubscriptions column
>
> @@ -6318,9 +6318,11 @@ describeSubscriptions(const char *pattern, bool verbose)
> if (pset.sversion >= 150000)
> appendPQExpBuffer(&buf,
> ", subtwophasestate AS \"%s\"\n"
> - ", subdisableonerr AS \"%s\"\n",
> + ", subdisableonerr AS \"%s\"\n"
> + ", sublocal AS \"%s\"\n",
> gettext_noop("Two phase commit"),
> - gettext_noop("Disable on error"));
> + gettext_noop("Disable on error"),
> + gettext_noop("Only local"));
>
> I think the column name here should be more consistent with the option
> name. e.g. it should be "Local only", not "Only local".
Modified
> ~~~
>
> 1.11 src/bin/psql/tab-complete.c - whitespace
>
> @@ -3167,7 +3167,7 @@ psql_completion(const char *text, int start, int end)
> /* Complete "CREATE SUBSCRIPTION <name> ... WITH ( <opt>" */
> else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
> COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
> - "enabled", "slot_name", "streaming",
> + "enabled", "slot_name", "streaming", "subscribe_local_only",
>
> The patch accidentally added a space char before the "slot_name".
Modified
> ~~~
>
> 1.12 src/include/replication/walreceiver.h - "generated"
>
> @@ -183,6 +183,7 @@ typedef struct
> bool streaming; /* Streaming of large transactions */
> bool twophase; /* Streaming of two-phase transactions at
> * prepare time */
> + bool only_local; /* publish only locally generated data */
>
> This is a similar review comment as #1.5 about saying the word "generated".
> Maybe there is another way to word this?
Modified
> ~~~
>
> 1.13 src/test/regress/sql/subscription.sql - missing test case
>
> Isn't there a missing test case for ensuring that the new option is boolean?
Added test
Thanks for the comments, the attached v7 patch has the changes for the same.
Regards,
Vignesh
Attachment | Content-Type | Size |
---|---|---|
v7-0001-Skip-replication-of-non-local-data.patch | text/x-patch | 51.6 KB |
v7-0002-Support-force-option-for-copy_data-check-and-thro.patch | text/x-patch | 43.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | vignesh C | 2022-04-08 07:40:05 | Re: Handle infinite recursion in logical replication setup |
Previous Message | Jeff Davis | 2022-04-08 07:27:44 | pgsql: Add contrib/pg_walinspect. |