From: | "Euler Taveira" <euler(at)eulerto(dot)com> |
---|---|
To: | "'Shubham Khanna'" <khannashubham1197(at)gmail(dot)com>, "Ashutosh Bapat" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Cc: | "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Nisha Moond" <nisha(dot)moond412(at)gmail(dot)com>, "Ajin Cherian" <itsajin(at)gmail(dot)com>, "Peter Smith" <smithpb2250(at)gmail(dot)com>, "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "vignesh C" <vignesh21(at)gmail(dot)com>, "Shlok Kyal" <shlok(dot)kyal(dot)oss(at)gmail(dot)com> |
Subject: | Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided. |
Date: | 2025-03-25 20:34:16 |
Message-ID: | e32c358b-95b5-426c-9baa-281812821588@app.fastmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Mar 25, 2025, at 8:07 AM, Shubham Khanna wrote:
> Apologies for the oversight. I have attached the patches now. Please
> find them included here.
I started looking at this patch. When I started reading the commit message, I
didn't understand why the options that provide names to objects are
incompatible with --all option. I agree that --all and --database should be
incompatible but the others shouldn't. We already have a check saying that the
number of specified objects cannot be different from the number of databases.
Why don't rely on it instead of forbidding these options?
/* Number of object names must match number of databases */
if (num_pubs > 0 && num_pubs != num_dbs)
{
pg_log_error("wrong number of publication names specified");
pg_log_error_detail("The number of specified publication names (%d) must match the number of specified database names (%d).",
num_pubs, num_dbs);
exit(1);
}
The following documentation is inaccurate since it doesn't say you should be
allowed to connect to the database too.
+ <para>
+ Create one subscription per all non-template databases on the target
+ server. Automatically generated names for subscriptions, publications,
My suggestion is:
Create one subscription per database on the target server. Exceptions are
template databases and databases that don't allow connections.
You are mixing short (-a) and long option (--all) on the same paragraph. It
might confuse the reader.
+ switches. This option cannot be used together with <option>--all</option>.
+ If <option>-d</option> option is not provided, the database name will be
+ obtained from <option>-P</option> option. If the database name is not
+ specified in either the <option>-d</option> option, or the
+ <option>-P</option> option, and <option>-a</option> option is not
+ specified, an error will be reported.
Since there is only short options, you should also use it.
+ bool all_dbs; /* --all option was specified */
SimpleStringList objecttypes_to_remove; /* list of object types to remove */
This comment doesn't follow the same pattern as the other members. It is using
a sentence. Maybe '--all option' but it would be different from the last added
option: enable-two-phase.
I'm already thinking about translation so it sounds better if you rephrase this
description. Even if it is not precise (but that's what it is expected since if
you cannot connect to a database, it won't be possible to setup a logical
replication for it.)
+ printf(_(" -a, --all create subscriptions for all non-template source databases\n"));
Suggestion:
create subscriptions for all databases except template databases
The following comment is not accurate. The postgres database is not created by
user and will be fetched.
+/*
+ * If --all is specified, fetch a list of all user-created databases from the
+ * source server. Internally, this is treated as if the user specified multiple
+ * --database options, one for each source database.
+ */
+static void
+fetch_source_databases(struct CreateSubscriberOptions *opt)
There is no 'source' terminology in the other function names. It uses
publisher, subscriber, primary and standby. There is also other functions using
'get' so I would use get_publisher_databases as function name.
It is just a matter of style but since the columns you are using are booleans,
it sounds natural to not specify the equality operator. You should also test
datconnlimit to avoid invalid databases. I would add ORDER BY for
predictability.
+ res = PQexec(conn, "SELECT datname FROM pg_database WHERE datistemplate = false AND datallowconn = true");
Something like:
SELECT datname FROM pg_database WHERE datistemplate AND datallowconn AND datconnlimit <> -2 ORDER BY 1
What happens if you don't specify the dbname in -P option?
+ /* Establish a connection to the source server */
+ conn = connect_database(opt->pub_conninfo_str, true);
It defaults to user name. If you are using another superuser (such as admin)
the connection might fail if there is no database named 'admin'. vacuumdb that
has a similar --all option, uses another option (--maintenance-db) to discover
which databases should be vacuumed. I'm not suggesting to add the
--maintenance-db option. I wouldn't like to hardcode a specific database
(template1, for example) if there is no dbname in the connection string.
Instead, suggest the user to specify dbname into -P option. An error message
should be provided saying the exact reason: no dbname was specified.
I don't think both comments add anything. You already explained before the
function body.
+ /* Process the query result */
+ for (int i = 0; i < PQntuples(res); i++)
+ {
+ const char *dbname = PQgetvalue(res, i, 0);
+
+ simple_string_list_append(&opt->database_names, dbname);
+
+ /* Increment num_dbs to reflect multiple --database options */
+ num_dbs++;
+ }
I wouldn't add an error here.
+ /* Error if no databases were found on the source server */
+ if (num_dbs == 0)
+ {
+ pg_log_error("no suitable databases found on the source server");
+ pg_log_error_hint("Ensure that there are non-template and connectable databases on the source server.");
+ PQclear(res);
+ disconnect_database(conn, true);
+ }
It already has a central place to provide no-database-provided errors. Use it.
if (opt.database_names.head == NULL)
{
...
}
The following code is not accurate. If I specify --all, --database and
--subscription, it will report only --database. The user will remove it and run
again. At this time, --subscription will be report. My expectation is to have
all reports at once.
+ /* Validate that --all is not used with incompatible options */
+ if (opt.all_dbs)
+ {
+ char *bad_switch = NULL;
+
+ if (num_dbs > 0)
+ bad_switch = "--database";
+ else if (num_pubs > 0)
+ bad_switch = "--publication";
+ else if (num_replslots > 0)
+ bad_switch = "--replication-slot";
+ else if (num_subs > 0)
+ bad_switch = "--subscription";
+
+ if (bad_switch)
+ {
+ pg_log_error("%s cannot be used with --all", bad_switch);
+ pg_log_error_hint("Try \"%s --help\" for more information.", progname);
+ exit(1);
+ }
+ }
According to my suggestion above, There won't be 'else if' and the new function
has an independent condition flow.
- if (opt.database_names.head == NULL)
+ /*
+ * Fetch all databases from the source (publisher) if --all is specified.
+ */
+ if (opt.all_dbs)
+ fetch_source_databases(&opt);
+
+ else if (opt.database_names.head == NULL)
I checked the applications that provide multiple synopsis using the following command.
grep '<cmdsynopsis' doc/src/sgml/ref/*.sgml | awk '{print $1}' | sort | uniq -c
There are just 3 applications that have multiple cmdsynopsis. pgbench has 2
distinct tasks (load and run) so it makes sense to have 2 synopsis. pg_ctl has
multiple tasks and also deserves multiple synopsis. The complexity introduced
into vacuumdb (per table, per schema) seems to justify multiple synopsis too.
However, the same logic doesn't apply here. IMO 0002 shouldn't be applied
because the additional option (--all) doesn't justify multiple synopsis for
syntax clarification.
I have the same opinion as Amit. I wouldn't apply 0003.
--
Euler Taveira
EDB https://www.enterprisedb.com/
From | Date | Subject | |
---|---|---|---|
Next Message | Lilian Ontowhee | 2025-03-25 20:48:23 | Re: [fixed] Trigger test |
Previous Message | Tomas Vondra | 2025-03-25 20:31:42 | Re: Advanced Patch Feedback Session / pgconf.dev 2025 |