Re: pg_basebackup error: replication slot "pg_basebackup_2194" already exists

From: Ludovic Vaugeois-Pepin <ludovicvp(at)gmail(dot)com>
To: PostgreSQL mailing lists <pgsql-general(at)postgresql(dot)org>
Subject: Re: pg_basebackup error: replication slot "pg_basebackup_2194" already exists
Date: 2017-05-30 22:20:22
Message-ID: CAAJDx8MVPgPitWqCSbvR9tb7ednvRpgi4-QOE=h0Q6003D_C9Q@mail.gmail.com
Lists: pgsql-general pgsql-hackers

On Tue, May 30, 2017 at 9:32 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> On Tue, May 30, 2017 at 9:14 PM, Ludovic Vaugeois-Pepin
> <ludovicvp(at)gmail(dot)com> wrote:
>>
>> I ran into the issue described below with 10.0 beta. The error I got is:
>>
>> pg_basebackup: could not create temporary replication slot
>> "pg_basebackup_2194": ERROR: replication slot "pg_basebackup_2194"
>> already exists
>>
>> A race condition? Or maybe I am doing something wrong.
>>
>>
>>
>>
>>
>> Release:
>> Name : postgresql10-server
>> Version : 10.0
>> Release : beta1PGDG.rhel7
>>
>>
>> Test Type:
>> Functional testing of a pacemaker resource agent
>> (https://github.com/ulodciv/pgha)
>>
>>
>> Test Detail:
>> During context/environment setup, pg_basebackup is invoked (in
>> parallel) from multiple virtual machines. The backups are then started
>> as asynchronously replicated hot standbys.
>>
>>
>> Platform:
>> CentOS 7.3
>>
>>
>> Installation Method:
>> yum -y install https://download.postgresql.org/pub/repos/yum/testing/10/redhat/rhel-7-x86_64/pgdg-redhat10-10-1.noarch.rpm
>> yum -y install postgresql10-server postgresql10-contrib
>>
>>
>> Platform Detail:
>>
>>
>> Test Procedure:
>>
>> Run pg_basebackup simultaneously on multiple hosts against
>> the same instance, e.g.:
>>
>> pg_basebackup -h test4 -p 5432 -D /var/lib/pgsql/10/data -U repl1 -Xs
>>
>>
>> Failure?
>>
>> E deploylib.deployer_error.DeployerError:
>> postgres(at)test5: got exit status 1 for:
>> E pg_basebackup -h test4 -p 5432 -D
>> /var/lib/pgsql/10/data -U repl1 -Xs
>> E stderr: pg_basebackup: could not create temporary
>> replication slot "pg_basebackup_2194": ERROR: replication slot
>> "pg_basebackup_2194" already exists
>> E pg_basebackup: child process exited with error 1
>> E pg_basebackup: removing data directory
>> "/var/lib/pgsql/10/data"
>>
>>
>> Test Results:
>>
>>
>> Comments:
>> This seems to be new with 10. I recently began testing the
>> pacemaker resource agent against PG 10. I never had (or noticed) this
>> failure with 9.6.1 and 9.6.2.
>
>
> Hah, that's an interesting failure. In the name of the slot, the 2194 comes
> from the pid -- but it's the pid of pg_basebackup.
>
> I assume you're not running the two pg_basebackup processes on the same
> machine? Is it predictable when this happens (meaning that the pid value is
> actually predictable), or do you have to run it a large number of times
> before it happens?

Indeed, I run it from two VMs that were created from the same .ova
(packaged VM).
I have hit this only once, although I have been running tests on 10.0
for a couple of days or so, so I cannot reproduce it on demand.

My guess is that pg_basebackup happened to get the same pid on both
hosts, so both asked the server for a slot with the same name.
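
To illustrate that guess, here is a minimal Python sketch of the collision.
It is a simulation only, not PostgreSQL code: FakeServer, SlotExistsError,
pid_slot_name and host_qualified_slot_name are hypothetical names made up
for this example. The only thing taken from the thread is that the slot
name is "pg_basebackup_" followed by the pid of the pg_basebackup client.
The host-qualified name at the end is just one possible way to make the
name unique, not something pg_basebackup is known to do.

```python
from dataclasses import dataclass, field


class SlotExistsError(Exception):
    """Raised when a slot name is already taken on the simulated server."""


@dataclass
class FakeServer:
    """Hypothetical stand-in for the primary: it only tracks slot names."""
    slots: set = field(default_factory=set)

    def create_slot(self, name: str) -> None:
        # Slot names share one namespace on the server, whatever host asks.
        if name in self.slots:
            raise SlotExistsError(f'replication slot "{name}" already exists')
        self.slots.add(name)


def pid_slot_name(client_pid: int) -> str:
    # Naming scheme described in the thread: prefix plus the client's pid.
    return f"pg_basebackup_{client_pid}"


def host_qualified_slot_name(host: str, client_pid: int) -> str:
    # Assumption for illustration: adding the client host makes names unique
    # across machines even when the pids coincide.
    return f"pg_basebackup_{host}_{client_pid}"


if __name__ == "__main__":
    server = FakeServer()
    # Two different hosts whose pg_basebackup happens to get the same pid.
    server.create_slot(pid_slot_name(2194))
    try:
        server.create_slot(pid_slot_name(2194))
    except SlotExistsError as exc:
        print("ERROR:", exc)

    # With a host-qualified name, the same pids no longer collide.
    server.create_slot(host_qualified_slot_name("test5", 2194))
    server.create_slot(host_qualified_slot_name("test6", 2194))
```

The pid is only unique within one machine, so a name built from it alone
can clash as soon as the clients run on different hosts.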
