Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.
Date: 2025-02-20 07:53:26
Message-ID: CAHut+PsatTfk9-F4JBrX_yYE0QGh4wQiTmvS4=dnBxcL=AK2HA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Shubham, Here are some review comments for v9-0001.

======
Commit message

1.
You've changed the option to '--all' but the comment message still
refers to '-all-databases'

======
doc/src/sgml/ref/pg_createsubscriber.sgml

2.
+ <para>
+ Automatically fetch all non-template databases from the source server
+ and create subscriptions for databases with the same names on the
+ target server.

I don't see what is "automatic" about this, nor do I know really what
"fetch" of databases is trying to convey.

Isn't it simpler to just say like below?

SUGGESTION
For all source server non-template databases create subscriptions for...

======
src/bin/pg_basebackup/pg_createsubscriber.c

3.
int recovery_timeout; /* stop recovery after this time */
+ bool all; /* --all option was specified */
};

IMO 'all' is too generic for a field name; it has almost no meaning --
all what? I think the original 'all_databases' name was better. Or
'all_dbs', but not just 'all'.

~~~

4.
+ if (opt.all)
+ {
+ if (num_dbs > 0)
+ {
+ pg_log_error("%s cannot be used with %s", "--all", "--database");
+ pg_log_error_hint("Try \"%s --help\" for more information.", progname);
+ exit(1);
+ }
+
+ if (num_pubs > 0)
+ {
+ pg_log_error("%s cannot be used with %s", "--all", "--publication");
+ pg_log_error_hint("Try \"%s --help\" for more information.", progname);
+ exit(1);
+ }
+
+ if (num_replslots > 0)
+ {
+ pg_log_error("%s cannot be used with %s", "--all", "--replication-slot");
+ pg_log_error_hint("Try \"%s --help\" for more information.", progname);
+ exit(1);
+ }
+
+ if (num_subs > 0)
+ {
+ pg_log_error("%s cannot be used with %s", "--all", "--subscription");
+ pg_log_error_hint("Try \"%s --help\" for more information.", progname);
+ exit(1);
+ }

4a
If bad combinations exist then IMO it is most likely that the --all
came before the incompatible option. So the parameters should be the
other way around so the result is "--<XXX> cannot be used with --all".

~

4b.
You could eliminate all this code duplication by using a variable for
the bad option,

SUGGESTION
char *bad_switch = NULL;

if (num_dbs > 0) bad_switch = "--database";
else if (num_pubs > 0) bad_switch = "--publication";
else if (num_replslots > 0) bad_switch = "--replication_slot";
else if (num_subs > 0) bad_switch = "--subscription";

if (bad_switch)
{
pg_log_error("%s cannot be used with --all", bad_switch);
pg_log_error_hint("Try \"%s --help\" for more information.", progname);
exit(1);
}

~

4c.
There is a double-blank line after this code fragment.

======
.../t/040_pg_createsubscriber.pl

5.
+# run pg_createsubscriber with '--database' and '--all' and verify the
+# failure
+command_fails_like(
+ [
+ 'pg_createsubscriber',
+ '--verbose',
+ '--dry-run',
+ '--pgdata' => $node_s->data_dir,
+ '--publisher-server' => $node_p->connstr($db1),
+ '--socketdir' => $node_s->host,
+ '--subscriber-port' => $node_s->port,
+ '--database' => $db1,
+ '--all',
+ ],
+ qr/--all cannot be used with --database/,
+ 'fail if --all is used with --database');
+

There is no difference anymore in the logic/error if the options are
given in order "--all --database" or if they are in order "--database
--all". So you no longer need to have separate test cases for
different ordering.

~~~

6.
+# Verify that only user databases got subscriptions (not template databases)
+my @user_dbs = ($db1, $db2);
+foreach my $dbname (@user_dbs)
+{
+ my $sub_exists =
+ $node_s1->safe_psql($dbname, "SELECT count(*) FROM pg_subscription;");
+ is($sub_exists, '3', "Subscription created successfully for $dbname");
+}

AFAICT this is just checking a global subscription count in a loop so
it will be 3, 3, 3. Why?

I guess you intended to verify that relevant subscriptions were
created for each of the target databases. But this code is not doing
that.

~~~

7.
+# Verify replication is working
+$node_p->safe_psql($db1, "INSERT INTO tbl1 VALUES ('replication test');");
+
+$result = $node_s1->safe_psql($db1, "SELECT * FROM tbl1 ORDER BY 1");
+is( $result, qq(first row
+replication test
+second row), "replication successful in $db1");
+
+$node_s1->stop;
# Run pg_createsubscriber on node S. --verbose is used twice
# to show more information.
command_ok(
@@ -431,8 +590,9 @@ $result = $node_s->safe_psql($db1,
is($result, qq(0), 'failover slot was removed');

# Check result in database $db1
-$result = $node_s->safe_psql($db1, 'SELECT * FROM tbl1');
+$result = $node_s->safe_psql($db1, 'SELECT * FROM tbl1 ORDER BY 1');
is( $result, qq(first row
+replication test

I concur with Kuroda-san's comment. Maybe remove all this, because
AFAICT existing code was already testing that replication is working
ok for $db1 and $db2.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-02-20 08:02:42 Re: Add Pipelining support in psql
Previous Message Zhijie Hou (Fujitsu) 2025-02-20 07:20:27 RE: Conflict detection for update_deleted in logical replication