Re: speed up a logical replica setup

From: vignesh C <vignesh21(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-02-22 16:29:27
Message-ID: CALDaNm2GgLWfMxE06dWcZUsoEtqK_3+_aAVFvtW6DVr9-NscyQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 22 Feb 2024 at 21:17, Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Vignesh,
>
> > Few comments on the tests:
> > 1) If the dry run was successful because of some issue then the server
> > will be stopped so we can check for "pg_ctl status" if the server is
> > running otherwise the connection will fail in this case. Another way
> > would be to check if it does not have "postmaster was stopped"
> > messages in the stdout.
> > +
> > +# Check if node S is still a standby
> > +is($node_s->safe_psql('postgres', 'SELECT pg_catalog.pg_is_in_recovery()'),
> > + 't', 'standby is in recovery');
>
> Just to confirm - your suggestion is to add `pg_ctl status`, right? Added.

Yes, that is correct.

> > 2) Can we add verification of "postmaster was stopped" messages in
> > the stdout for dry run without --databases testcase
> > +# pg_createsubscriber can run without --databases option
> > +command_ok(
> > + [
> > + 'pg_createsubscriber', '--verbose',
> > + '--dry-run', '--pgdata',
> > + $node_s->data_dir, '--publisher-server',
> > + $node_p->connstr('pg1'), '--subscriber-server',
> > + $node_s->connstr('pg1')
> > + ],
> > + 'run pg_createsubscriber without --databases');
> > +
>
> Hmm, in case of --dry-run, the server would be never shut down.
> See below part.
>
> ```
> if (!dry_run)
> stop_standby_server(pg_ctl_path, opt.subscriber_dir);
> ```

One way to differentiate whether the server is run in dry_run mode or
not is to check if the server was stopped or not. So I mean we can
check that the stdout does not have a "postmaster was stopped" message
from the stdout. Can we add validation based on this code:
+ if (action)
+ pg_log_info("postmaster was started");

Or another way is to check pg_ctl status to see that the server is not shutdown.

> > 3) This message "target server must be running" seems to be wrong,
> > should it be cannot specify cascading replicating standby as standby
> > node(this is for v22-0011 patch :
> > + 'pg_createsubscriber', '--verbose', '--pgdata',
> > $node_c->data_dir,
> > + '--publisher-server', $node_s->connstr('postgres'),
> > + '--port', $node_c->port, '--socketdir', $node_c->host,
> > + '--database', 'postgres'
> > ],
> > - 'primary server is in recovery');
> > + 1,
> > + [qr//],
> > + [qr/primary server cannot be in recovery/],
> > + 'target server must be running');
>
> Fixed.
>
> > 4) Should this be "Wait for subscriber to catch up"
> > +# Wait subscriber to catch up
> > +$node_s->wait_for_subscription_sync($node_p, $subnames[0]);
> > +$node_s->wait_for_subscription_sync($node_p, $subnames[1]);
>
> Fixed.
>
> > 5) Should this be 'Subscriptions has been created on all the specified
> > databases'
> > +);
> > +is($result, qq(2),
> > + 'Subscriptions has been created to all the specified databases'
> > +);
>
> Fixed, but "has" should be "have".
>
> > 6) Add test to verify current_user is not a member of
> > ROLE_PG_CREATE_SUBSCRIPTION, has no create permissions, has no
> > permissions to execution replication origin advance functions
> >
> > 7) Add tests to verify insufficient max_logical_replication_workers,
> > max_replication_slots and max_worker_processes for the subscription
> > node
> >
> > 8) Add tests to verify invalid configuration of wal_level,
> > max_replication_slots and max_wal_senders for the publisher node
>
> Hmm, but adding these checks may increase the test time. we should
> measure the time and then decide.

We can check and see if it does not take significantly more time, then
we can have these tests.

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2024-02-22 16:36:00 Re: Sequence Access Methods, round two
Previous Message Tom Lane 2024-02-22 16:20:54 Re: bug report: some issues about pg_15_stable(8fa4a1ac61189efffb8b851ee77e1bc87360c445)