| From: | Simon Riggs <simon(at)2ndquadrant(dot)com> | 
|---|---|
| To: | Noah Misch <noah(at)leadboat(dot)com> | 
| Cc: | Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: More replication race conditions | 
| Date: | 2017-08-31 11:54:35 | 
| Message-ID: | CANP8+jLDWCD0nVPUNPjquB2A=_PQ6Z=HcgT+ZdLRyzHSaO7U8w@mail.gmail.com | 
| Lists: | pgsql-hackers | 
On 27 August 2017 at 03:32, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Fri, Aug 25, 2017 at 12:09:00PM +0200, Petr Jelinek wrote:
>> On 24/08/17 19:54, Tom Lane wrote:
>> > sungazer just failed with
>> >
>> > pg_recvlogical exited with code '256', stdout '' and stderr 'pg_recvlogical: could not send replication command "START_REPLICATION SLOT "test_slot" LOGICAL 0/0 ("include-xids" '0', "skip-empty-xacts" '1')": ERROR:  replication slot "test_slot" is active for PID 8913148
>> > pg_recvlogical: disconnected
>> > ' at /home/nm/farm/gcc64/HEAD/pgsql.build/src/test/recovery/../../../src/test/perl/PostgresNode.pm line 1657.
>> >
>> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2017-08-24%2015%3A16%3A10
>> >
>> > Looks like we're still not there on preventing replication startup
>> > race conditions.
>>
>> Hmm, that looks like "by design" behavior. Acquiring a slot throws an
>> error if the slot is already in use by somebody else (slots use their own
>> locking mechanism that does not wait). In this case it seems the
>> walsender which was using the slot for the previous step didn't finish
>> releasing the slot by the time we started the new command. We can work
>> around this by changing the test to wait, perhaps.
>
> [Action required within three days.  This is a generic notification.]
>
> The above-described topic is currently a PostgreSQL 10 open item.  Simon,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.
>
> [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
I acknowledge receipt of the reminder and will fix by end of day tomorrow.
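
For reference, the workaround Petr suggests above (have the test wait until the previous walsender has released the slot before issuing the next command) could be sketched roughly as follows. This is only a minimal Python illustration of the polling idea, not the actual test code, which is Perl. The function name and the `slot_is_active` callable are hypothetical; the callable stands in for a check such as querying the `active` column of `pg_replication_slots` for "test_slot".

```python
import time
from typing import Callable


def wait_for_slot_release(slot_is_active: Callable[[], bool],
                          timeout: float = 30.0,
                          interval: float = 0.1) -> bool:
    """Poll until the slot is reported inactive or the timeout expires.

    slot_is_active is a hypothetical callable supplied by the caller; it
    should return True while some backend still holds the slot.
    Returns True if the slot was released in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Slot acquisition itself does not wait, so the caller has to.
        if not slot_is_active():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would call this after the previous pg_recvlogical step and only start the next one once it returns True, treating False as a test failure rather than retrying indefinitely.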
-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services