Re: pg_basebackup behavior on non-existent slot

From: "Gerard H(dot) Pille" <ghpille(at)hotmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_basebackup behavior on non-existent slot
Date: 2021-08-23 14:51:53
Message-ID: AM8PR04MB78732D3F8A4BF56A989A8953BCC49@AM8PR04MB7873.eurprd04.prod.outlook.com
Lists: pgsql-bugs


Hello,

My "use case" was forgetting the "-C" option while creating a rather large standby database over a slow network connection. I didn't interrupt it, since it seemed to continue without a problem until all the work was done, half a day later.

Switching to a gigabit connection allowed me to repeat this much faster, and find a solution. I'm just learning about replication on Postgres.

If this is too complex to fix, perhaps a warning could be added to the error message:

"pg_basebackup: error: could not send replication command, replication will fail" ?

Thanks!

Gerard

________________________________________
From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Sent: Monday, 23 August 2021 5:54
To: Gerard H. Pille
CC: PostgreSQL mailing lists
Subject: Re: pg_basebackup behavior on non-existent slot

Hi,

On Sat, Aug 21, 2021 at 10:15 PM Gerard H. Pille <ghpille(at)hotmail(dot)com> wrote:
>
> Hello,
>
> I was just confronted with this behaviour described back in 2017, and
> the participants in the thread seem to consider it a bug. But I'm
> running on version 13. So, how did that discussion end?
>
> My case:
> @ pg_basebackup -h 192.168.1.131 -D $PWD/13/main -U repuser -v -P -S stream
> Password:
> pg_basebackup: initiating base backup, waiting for checkpoint to complete
> pg_basebackup: checkpoint completed
> pg_basebackup: write-ahead log start point: 23/4E000028 on timeline 1
> pg_basebackup: starting background WAL receiver
> pg_basebackup: error: could not send replication command
> "START_REPLICATION": ERROR: replication slot "stream" does not exist
> 53790996/53790996 kB (100%), 2/2 tablespaces
>
> pg_basebackup: write-ahead log end point: 23/4E000138
> pg_basebackup: waiting for background process to finish streaming ...
> pg_basebackup: error: child process exited with exit code 1
> pg_basebackup: removing data directory "/var/lib/postgresql/13/main"
> pg_basebackup: changes to tablespace directories will not be undone
>
> The old thread:
> https://postgrespro.com/list/thread-id/2337189#CAMkU=1wSxYBNFY9TzuVh3=mDLr4BBsMct6wcViNMH+-6Xon4Uw(at)mail(dot)gmail(dot)com

It seems it's not fixed yet, even in HEAD, as far as I tested. There
were some ideas for fixing it in that thread, but the main point was how
to fix it on Windows. I guess that since it creates a transient slot,
it's not a common case to specify a non-existent slot in pg_basebackup,
but what is your use case? That might help motivate a fix for this issue.

BTW, in that thread there was a discussion on how to detect the
streamer process failure in the main process, but we can probably fix
this by just checking that a replication slot with the specified name
exists before starting the streamer process?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
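
Editorial note: the existence check suggested above can be approximated from outside pg_basebackup as a pre-flight step. The following is a minimal sketch in Python, not the proposed C fix. The helper names (slot_check_sql, slot_exists, psql_runner) are hypothetical, and it assumes the backup user is allowed to open an ordinary connection to the "postgres" database, which a replication-only pg_hba.conf entry may not permit. It relies on the pg_replication_slots system view and on the rule that slot names contain only lower-case letters, digits, and underscores.

```python
import re
import subprocess

# Replication slot names may contain only lower-case letters, digits and
# underscores, so a validated name can be embedded in the SQL text directly.
_SLOT_NAME = re.compile(r"[a-z0-9_]+")


def slot_check_sql(slot_name):
    """Return SQL that yields exactly one row if the slot exists."""
    if not _SLOT_NAME.fullmatch(slot_name):
        raise ValueError(f"invalid replication slot name: {slot_name!r}")
    return (
        "SELECT 1 FROM pg_replication_slots "
        f"WHERE slot_name = '{slot_name}'"
    )


def slot_exists(slot_name, run_sql):
    """Check for the slot using run_sql, a caller-supplied callable
    (hypothetical) that executes SQL and returns unaligned text output."""
    return run_sql(slot_check_sql(slot_name)).strip() == "1"


def psql_runner(host, user, dbname="postgres"):
    """Build a run_sql callable that shells out to psql.

    Assumption: `user` may connect to `dbname` on `host`.
    """
    def run(sql):
        result = subprocess.run(
            ["psql", "-h", host, "-U", user, "-d", dbname, "-At", "-c", sql],
            check=True,
            capture_output=True,
            text=True,
        )
        return result.stdout
    return run
```

With the values from the report, a wrapper script might call slot_exists("stream", psql_runner("192.168.1.131", "repuser")) and refuse to start the half-day copy when it returns False. Passing the query runner in as a parameter keeps the check itself independent of how the server is reached.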
