Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.

From: Shubham Khanna <khannashubham1197(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.
Date: 2025-03-21 13:29:21
Message-ID: CAHv8RjLj0KxVHbxaPZHzudGS1igzDMccFE8LDh4LHNJR_2Aqug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 20, 2025 at 4:53 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Thu, Mar 20, 2025 at 10:25 AM Shubham Khanna
> <khannashubham1197(at)gmail(dot)com> wrote:
>
> >
> > The attached patches contain the suggested changes.
> >
>
> I have started reviewing the patches again. Here are some review comments
>
> <variablelist>
> + <varlistentry>
> + <term><option>-a</option></term>
> + <term><option>--all</option></term>
> + <listitem>
> + <para>
> + For all source server non-template databases create subscriptions for
> + databases with the same names on the target server.
>
> In this construct "all" goes with "source server" but it should go
> with "non-template database". How about, "For every non-template
> database on the source server, create one subscription on the target
> server in the database with the same name."?
>

Fixed.

> + Subscription names, publication names, and replication slot names are
> + automatically generated. This option cannot be used together with
> + <option>--database</option>, <option>--publication</option>,
> + <option>--replication-slot</option> or <option>--subscription</option>.
>
> ... used along with ...
>
> also comma before or .
>
> We should also mention that generated names of replication slots,
> publications and subscriptions are used when using this option.
>
> + </para>
> + </listitem>
> + </varlistentry>
> +
> <varlistentry>
> <term><option>-d <replaceable
> class="parameter">dbname</replaceable></option></term>
> <term><option>--database=<replaceable
> class="parameter">dbname</replaceable></option></term>

Fixed.

> @@ -94,9 +130,10 @@ PostgreSQL documentation
> <para>
> The name of the database in which to create a subscription. Multiple
> databases can be selected by writing multiple <option>-d</option>
> - switches. If <option>-d</option> option is not provided, the database
> - name will be obtained from <option>-P</option> option. If the database
> - name is not specified in either the <option>-d</option> option or
> + switches. This option cannot be used together with
> <option>--all</option>.
> + If <option>-d</option> option is not provided, the database name will be
> + obtained from <option>-P</option> option. If the database name is not
> + specified in either the <option>-d</option> option or
> <option>-P</option> option, an error will be reported.
>
> We have to mention -a as well here. Something like "If the database
> name is not specified in either the <option>-d</option> option or
> <option>-P</option> option, and <option>-a</option> is not provided,
> an error will be reported."
>

Fixed.

> +# run pg_createsubscriber with '--database' and '--all' without '--dry-run'
> +# and verify the failure
> +command_fails_like(
> + [
> + 'pg_createsubscriber',
> + '--verbose',
> + '--pgdata' => $node_s->data_dir,
> + '--publisher-server' => $node_p->connstr($db1),
> + '--socketdir' => $node_s->host,
> + '--subscriber-port' => $node_s->port,
> + '--database' => $db1,
> + '--all',
> + ],
> + qr/--database cannot be used with --all/,
> + 'fail if --database is used with --all');
>
>
> Why does only this test not use --dry-run, but all other such tests
> use --dry-run? Either they should all use it or not use it.
>
> +
> +# run pg_createsubscriber with '--publication' and '--all' and verify
> +# the failure
> +command_fails_like(
> + [
> + 'pg_createsubscriber',
> + '--verbose',
> + '--dry-run',
> + '--pgdata' => $node_s->data_dir,
> + '--publisher-server' => $node_p->connstr($db1),
> + '--socketdir' => $node_s->host,
> + '--subscriber-port' => $node_s->port,
> + '--all',
> + '--publication' => 'pub1',
> + ],
> + qr/--publication cannot be used with --all/,
> + 'fail if --publication is used with --all');
> +
> +# run pg_createsubscriber with '--replication-slot' and '--all' and
> +# verify the failure
> +command_fails_like(
> + [
> + 'pg_createsubscriber',
> + '--verbose',
> + '--dry-run',
> + '--pgdata' => $node_s->data_dir,
> + '--publisher-server' => $node_p->connstr($db1),
> + '--socketdir' => $node_s->host,
> + '--subscriber-port' => $node_s->port,
> + '--replication-slot' => 'replslot1',
> + '--all',
> + ],
> + qr/--replication-slot cannot be used with --all/,
> + 'fail if --replication-slot is used with --all');
> +
> +# run pg_createsubscriber with '--subscription' and '--all' and
> +# verify the failure
> +command_fails_like(
> + [
> + 'pg_createsubscriber',
> + '--verbose',
> + '--dry-run',
> + '--pgdata' => $node_s->data_dir,
> + '--publisher-server' => $node_p->connstr($db1),
> + '--socketdir' => $node_s->host,
> + '--subscriber-port' => $node_s->port,
> + '--all',
> + '--subscription' => 'sub1',
> + ],
> + qr/--subscription cannot be used with --all/,
> + 'fail if --subscription is used with --all');
> +
>
> Just to cover all combinations we need to test both scenarios. where
> --all comes before an incompatible option and vice versa.
>
> # Run pg_createsubscriber on node S. --verbose is used twice
> # to show more information.
> # In passing, also test the --enable-two-phase option
> @@ -459,10 +526,93 @@ my $sysid_s = $node_s->safe_psql('postgres',
> 'SELECT system_identifier FROM pg_control_system()');
> ok($sysid_p != $sysid_s, 'system identifier was changed');
>
> +# On node P create test tables
> +$node_p->safe_psql('postgres', 'CREATE TABLE tbl1 (a text)');
> +$node_p->safe_psql($db1, 'CREATE TABLE tbl2 (a text)');
> +$node_p->safe_psql($db1, "INSERT INTO tbl2 VALUES('first row')");
> +$node_p->safe_psql($db2, 'CREATE TABLE tbl3 (a text)');
>
> $db1 and $db2 already have tables. We can avoid creating new ones here.
>

Fixed.

> +
> +# Wait subscriber to catch up
> +$node_u->wait_for_subscription_sync($node_p, $subnames1[0]);
> +$node_u->wait_for_subscription_sync($node_p, $subnames1[1]);
> +$node_u->wait_for_subscription_sync($node_p, $subnames1[2]);
> +
>
> This function just makes sure that initial sync is done. But it
> doesn't wait for replication to catch up with current state of the
> publisher. It's the latter which would make sure that the last INSERTS
> are visible. We should be using wait_for_slot_catchup() on the
> publisher.
>
> +# Check result in database 'postgres' of node U
> +$result = $node_u->safe_psql('postgres', 'SELECT * FROM tbl1');
> +is($result, qq(first row), "logical replication works in database postgres");
> +
> +# Check result in database $db1 of node U
> +$result = $node_u->safe_psql($db1, 'SELECT * FROM tbl2');
> +is( $result, qq(first row
> +second row),
> + "logical replication works in database $db1");
> +
> +# Check result in database $db2 of node U
> +$result = $node_u->safe_psql($db2, 'SELECT * FROM tbl3');
> +is($result, qq(first row), "logical replication works in database $db2");
> +
>
> These tests may not always pass if we use wait_for_slot_catchup above.
>

During the recent testing, I observed that the tests were failing when
using wait_for_slot_catchup(). To address this, I reverted to using
wait_for_subscription_sync(), which was employed previously and has
proven to be more reliable in ensuring test stability.
Please let me know if there are any additional adjustments you would
suggest or if you would like me to investigate further into
wait_for_slot_catchup().

I have created a separate patch for the synopsis of '--all' option as
suggested by Amit at [1]. The attached patch contains the suggested
changes.

[1] - https://www.postgresql.org/message-id/CAA4eK1KUDEO0t6i16_CcEpg33sgcgEddHcdVC_q8j4tVUb5FWw%40mail.gmail.com

Thanks and regards,
Shubham Khanna.

Attachment Content-Type Size
v17-0001-Enhance-pg_createsubscriber-to-fetch-and-append-.patch application/octet-stream 12.2 KB
v17-0002-Additional-test-cases.patch application/octet-stream 5.4 KB
v17-0003-Synopsis-for-all-option.patch application/octet-stream 1.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shubham Khanna 2025-03-21 13:36:00 Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.
Previous Message Sami Imseih 2025-03-21 13:25:24 Re: Proposal - Allow extensions to set a Plan Identifier