Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.

From: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.
Date: 2025-03-10 06:41:06
Message-ID: CAHv8RjKmf5bAcgTmGToWSn_Fx+O2y8qCiLScBXvBTD0D5gX2sw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 6, 2025 at 9:27 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi Shubham.
>
> Some review comments for patch v13-0001.
>
> ======
> GENERAL
>
> 1.
> --cleanup-existing-publications
>
> I've never liked this proposed switch name much.
>
> e.g. why say "cleanup" instead of "drop"? What is the difference?
> Saying drop is very explicit about what the switch will do.
> e.g. why say "existing"? It seems redundant; you can't cleanup/drop
> something that does not exist.
>
> My preference is one of:
> --drop-publications
> --drop-all-publications
>
> either of which seem nicely aligned with the descriptions in the usage and docs.
>
> Yeah, I know I have raised this same point before, but AFAICT the
> reply was like "will revise it in the next patch version", but that
> was many versions ago. I think it is important to settle the switch
> name earlier than later because there will be many tentacles into the
> code (vars, params, fields, comments) and docs if anything changes --
> so it is not a decision you want to leave till the end because it
> could destabilise everything at the last minute.
>

I have updated the option name to '--drop-all-publications'.

> ======
> Commit message
>
> 2.
> By default, publications are preserved to avoid unintended data loss.
>
> ~
>
> Was there supposed to be a blank line before this text, or should this
> text be wrapped into the preceding paragraph?
>

Fixed.

> ======
> src/bin/pg_basebackup/pg_createsubscriber.c
>
> setup_subscriber:
>
> 3.
> /*
> * Create the subscriptions, adjust the initial location for logical
> * replication and enable the subscriptions. That's the last step for logical
> - * replication setup.
> + * replication setup. If 'drop_publications' options is true, existing
> + * publications on the subscriber will be dropped before creating new
> + * subscriptions.
> */
>
> There are multiple things amiss with this comment.
> - 'drop_publications' is not the parameter name
> - 'drop_publications' options [sic plural??]. It is not an option
> here; it is a parameter
>

Fixed.

> ~~~
>
> check_and_drop_existing_publications:
>
> 4.
> /*
> - * Remove publication if it couldn't finish all steps.
> + * Check and drop existing publications on the subscriber if requested.
> */
>
> There is no need to say "if requested.". It is akin to saying this
> function does XXX if this function is called.
>

Fixed.

> ~~~
>
> drop_publication_by_name:
>
> 5.
> +/* Drop the specified publication of the given database. */
> +static void
> +drop_publication_by_name(PGconn *conn, const char *pubname, const char *dbname)
>
> 5a.
> I think it is better to define this function before
> check_and_drop_existing_publications. YMMV.
>
> ~
>
> 5b.
> IMO the parameters should be reordered (PGconn *conn, const char
> *dbname, const char *pubname). It seems more natural and would be
> consistent with check_and_drop_existing_publications. YMMV.
>
> ~~~

Fixed.

>
> 6.
> - dbinfo->pubname, dbinfo->dbname, PQresultErrorMessage(res));
> - dbinfo->made_publication = false; /* don't try again. */
> + pubname, dbname, PQresultErrorMessage(res));
> + dbinfos.dbinfo->made_publication = false; /* don't try again. */
>
> Something about this flag assignment seems odd to me. IIUC
> 'made_publications' is for removing the cleanup_objects_atexit to be
> able to remove the special publication implicitly made by the
> pg_createsubscriber. But, AFAIK we can also get to this code via the
> --cleanup-existing-publication switch, so actually it might be one of
> the "existing" publication DROPS that has failed. If so, then the
> "don't try again comment" is misleading because it may not be that
> same publication "again" at all.
>

I agree with your point. The current comment is misleading, as the
failure could be related to an existing publication drop via
--cleanup-existing-publications now --drop-all-publications, not just
the publication created by pg_createsubscriber.
This issue is still open, and I will address it in the next version of
the patch.

> ======
> .../t/040_pg_createsubscriber.pl
>
> GENERAL.
>
> 7.
> Most of the changes to this test code are just changing node name S ->
> S1. It's not much to do with the patch other than tidying it in
> preparation for a new node S2. OTOH it makes the review far harder
> because there are so many changes. IMO all this S->S1 stuff should be
> done as a separate pre-requisite patch and then it will be far easier
> to see what changes are added for the --clean-existing-publications
> testing.
>
> ~~~
>

I understand your concern. Since I am using two nodes (node_s1 and
node_s2), I will work on creating a separate prerequisite patch for
renaming S -> S1. This should make it easier to review the actual
changes related to --cleanup-existing-publications now
--drop-all-publications testing.

> 8.
> # Set up node S as standby linking to node P
> $node_p->backup('backup_1');
> -my $node_s = PostgreSQL::Test::Cluster->new('node_s');
> -$node_s->init_from_backup($node_p, 'backup_1', has_streaming => 1);
> -$node_s->append_conf(
> +my $node_s1 = PostgreSQL::Test::Cluster->new('node_s1');
> +$node_s1->init_from_backup($node_p, 'backup_1', has_streaming => 1);
> +$node_s1->append_conf(
>
> The comment should refer to S1, not S.
>
> ~~~
>

Fixed.

> 9.
> # Set up node C as standby linking to node S
> -$node_s->backup('backup_2');
> +$node_s1->backup('backup_2');
> my $node_c = PostgreSQL::Test::Cluster->new('node_c');
> -$node_c->init_from_backup($node_s, 'backup_2', has_streaming => 1);
> +$node_c->init_from_backup($node_s1, 'backup_2', has_streaming => 1);
>
> The comment should refer to S1, not S.
>
> ~~~
>

Fixed.

> 10.
> # Check some unmet conditions on node S
> -$node_s->append_conf(
> +$node_s1->append_conf(
>
> The comment should refer to S1, not S.
>
> (note... there are lots of these. You should search/fix them all;
> these review comments might miss some)
>
> ~~~
>

Fixed.

> 11.
> + '--socketdir' => $node_s1->host,
> + '--subscriber-port' => $node_s1->port,
> '--database' => $db1,
> '--database' => $db2,
> ],
> 'standby contains unmet conditions on node S');
>
> The message should refer to S1, not S.
>
> (note... there are lots of these. You should search/fix them all;
> these review comments might miss some)
>
> ~~~
>

Fixed.

> 12.
> # dry run mode on node S
> command_ok(
> @@ -338,10 +353,10 @@ command_ok(
> '--verbose',
> '--dry-run',
> '--recovery-timeout' => $PostgreSQL::Test::Utils::timeout_default,
> - '--pgdata' => $node_s->data_dir,
> + '--pgdata' => $node_s1->data_dir,
> '--publisher-server' => $node_p->connstr($db1),
> - '--socketdir' => $node_s->host,
> - '--subscriber-port' => $node_s->port,
> + '--socketdir' => $node_s1->host,
> + '--subscriber-port' => $node_s1->port,
> '--publication' => 'pub1',
> '--publication' => 'pub2',
> '--subscription' => 'sub1',
> @@ -352,10 +367,11 @@ command_ok(
> 'run pg_createsubscriber --dry-run on node S');
> Comment and Message should refer to S1, not S.
>
> ~~~

Fixed.

>
> 13.
> # Check if node S is still a standby
> -$node_s->start;
> -is($node_s->safe_psql('postgres', 'SELECT pg_catalog.pg_is_in_recovery()'),
> - 't', 'standby is in recovery');
> -$node_s->stop;
> +$node_s1->start;
> +is( $node_s1->safe_psql('postgres', 'SELECT pg_catalog.pg_is_in_recovery()'),
> + 't',
> + 'standby is in recovery');
> +$node_s1->stop;
>
> The comment should refer to S1, not S.
>
> ~~~

Fixed.

>
> 14.
> -# Run pg_createsubscriber on node S. --verbose is used twice
> -# to show more information.
> +# Run pg_createsubscriber on node S using '--cleanup-existing-publications'.
> +# --verbose is used twice to show more information.
> # In passing, also test the --enable-two-phase option
> command_ok(
> [
> 'pg_createsubscriber',
> '--verbose', '--verbose',
> '--recovery-timeout' => $PostgreSQL::Test::Utils::timeout_default,
> - '--pgdata' => $node_s->data_dir,
> + '--pgdata' => $node_s1->data_dir,
> '--publisher-server' => $node_p->connstr($db1),
> - '--socketdir' => $node_s->host,
> - '--subscriber-port' => $node_s->port,
> + '--socketdir' => $node_s1->host,
> + '--subscriber-port' => $node_s1->port,
> '--publication' => 'pub1',
> '--publication' => 'pub2',
> '--replication-slot' => 'replslot1',
> '--replication-slot' => 'replslot2',
> '--database' => $db1,
> '--database' => $db2,
> - '--enable-two-phase'
> + '--enable-two-phase',
> + '--cleanup-existing-publications',
> ],
> 'run pg_createsubscriber on node S');
>
> Comment and Message should refer to S1, not S.
>
> ~~~
>

Fixed.

> 15.
> +# Create user-defined publications
> +$node_p->safe_psql($db3, "CREATE PUBLICATION test_pub3 FOR ALL TABLES;");
> +$node_p->safe_psql($db3, "CREATE PUBLICATION test_pub4 FOR ALL TABLES;");
>
> Probably these can be combined.
>
> ~~~
>

Fixed.

> 16.
> +# Drop the newly created publications
> +$node_p->safe_psql($db3, "DROP PUBLICATION IF EXISTS test_pub3;");
> +$node_p->safe_psql($db3, "DROP PUBLICATION IF EXISTS test_pub4;");
> +
>
> Probably these can be combined.
>

Fixed.

The attached patch at [1] contains the required changes.

[1] - https://www.postgresql.org/message-id/CAHv8RjLCZyzFMGR8SBdvSocRZGAAr_eRd4ht48kXCp9oEaKQeQ%40mail.gmail.com

Thanks and regards,
Shubham Khanna.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2025-03-10 06:58:43 Re: SQLFunctionCache and generic plans
Previous Message Shubham Khanna 2025-03-10 06:30:50 Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.