Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Have pg_basebackup write "dbname" in "primary_conninfo"?
Date: 2024-03-14 00:26:28
Message-ID: CAD21AoD6sOVX5UFhSWm_bBNAPv65=+f46JsTWjuD3ESiWKpbGg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 23, 2024 at 3:05 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Feb 21, 2024 at 7:46 AM Hayato Kuroda (Fujitsu)
> <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> >
> > > > Just FYI - here is an extreme case. And note that I have applied proposed patch.
> > > >
> > > > When `pg_basebackup -D data_N2 -R` is used:
> > > > ```
> > > > primary_conninfo = 'user=hayato ... dbname=hayato ...
> > > > ```
> > > >
> > > > But when `pg_basebackup -d "" -D data_N2 -R` is used:
> > > > ```
> > > > primary_conninfo = 'user=hayato ... dbname=replication
> > > > ```
> > >
> > > It seems like maybe somebody should look into why this is happening,
> > > and perhaps fix it.
> >
> > I think this caused from below part [1] in GetConnection().
> >
> > If both dbname and connection_string are the NULL, we will enter the else part
> > and NULL would be substituted - {"dbnmae", NULL} key-value pair is generated
> > only here.
> >
> > Then, in PQconnectdbParams()->PQconnectStartParams->pqConnectOptions2(),
> > the strange part would be found and replaced to the username [2].
> >
> > I think if both the connection string and the dbname are NULL, it should be
> > considered as the physical replication connection. here is a patch to fix it.
> >
>
> When dbname is NULL or not given, it defaults to username. This
> follows the specs of the connection string.

This fact makes me think that the slotsync worker might be able to
accept the primary_conninfo value even if there is no dbname in the
value. That is, if there is no dbname in the primary_conninfo, it uses
the username in accordance with the specs of the connection string.
Currently, the slotsync worker connects to the local database first
and then establishes the connection to the primary server. But if we
can reverse the two steps, it can get the dbname that has actually
been used to establish the remote connection and use it for the local
connection too. That way, the primary_conninfo generated by
pg_basebackup could work even without the patch. For example, if the
OS user executing pg_basebackup is 'postgres', the slotsync worker
would connect to the postgres database. Given the 'postgres' database
is created by default and 'postgres' OS user is used in common, I
guess it could cover many cases in practice actually.

Having said that, even with (or without) the above change, we might
want to change the pg_basebackup so that it writes the dbname to the
primary_conninfo if -R option is specified. Since the database where
the slotsync worker connects cannot be dropped while the slotsync
worker is running, the user might want to change the database to
connect, and it would be useful if they can do that using
pg_basebackup instead of modifying the configuration file manually.

While the current approach makes sense to me, I'm a bit concerned that
we might end up having the pg_basebackup search the actual database
name (e.g. 'dbname=template1') from the .pgpass file instead of
'dbname=replication'. As far as I tested on my environment, suppose
that I execute:

pg_basebackup -D tmp -d "dbname=testdb" -R

The pg_basebackup established a replication connection but looked for
the password of the 'testdb' database. This could be another
inconvenience for the existing users who want to use the slot
synchronization.

A random idea I came up with is, we add a new option to the
pg_basebackup to overwrite the full or some portion of the connection
string that is eventually written in the primary_conninfo in
postgresql.auto.conf. For example, the command:

pg_basebackup -D tmp -d "host=1.1.1.1 port=5555" -R
--primary-coninfo-ext "host=2.2.2.2 dbname=postgres"

will produce the connection string that is based on -d option value
but is overwritten by --primary-conninfo-ext option value, which will
be like:

host=2.2.2.2 dbname=postgres port=5555

This option might help not only for users who want to use the slotsync
worker but also for users who want to take a basebackup from a standby
but have the new standby connect to the primary.

But it's still just an idea and I might be missing something. And
given we're getting closer to the feature freeze, it would be a PG18
item.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-03-14 00:40:29 Re: Sequence Access Methods, round two
Previous Message Michael Paquier 2024-03-14 00:26:25 Re: Improve readability by using designated initializers when possible