Re: Logical replication: stuck spinlock at ReplicationSlotRelease

From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Logical replication: stuck spinlock at ReplicationSlotRelease
Date: 2017-06-23 20:10:35
Message-ID: 86a148bd-e606-45fd-6bba-b037507b8978@2ndquadrant.com
Lists: pgsql-hackers

On 6/23/17 13:26, Alvaro Herrera wrote:
> Albe Laurenz wrote:
>> Peter Eisentraut wrote:
>>> On 6/21/17 09:02, Albe Laurenz wrote:
>>>> 2017-06-21 14:55:12.033 CEST [23124] LOG: could not send data to client: Broken pipe
>>>> 2017-06-21 14:55:12.033 CEST [23124] FATAL: connection to client lost
>>>> 2017-06-21 14:55:17.032 CEST [23133] LOG: logical replication apply worker for subscription "reprec" has started
>>>> DEBUG: received replication command: IDENTIFY_SYSTEM
>>>> DEBUG: received replication command: START_REPLICATION SLOT "reprec" LOGICAL 0/0 (proto_version '1', publication_names '"repsend"')
>>>> 2017-06-21 14:57:24.552 CEST [23124] PANIC: stuck spinlock detected at ReplicationSlotRelease, slot.c:394
>>>> 2017-06-21 14:57:24.885 CEST [23070] LOG: server process (PID 23124) was terminated by signal 6: Aborted
>>>> 2017-06-21 14:57:24.885 CEST [23070] LOG: terminating any other active server processes
>>>> 2017-06-21 14:57:24.887 CEST [23134] LOG: could not send data to client: Broken pipe
>>>> 2017-06-21 14:57:24.890 CEST [23070] LOG: all server processes terminated; reinitializing
>>>
>>> I can't reproduce that. I let it loop around for about 10 minutes and
>>> it was fine.
>>>
>>> I notice that you have some debug settings on. Could you share your
>>> exact setup steps from initdb, as well as configure options, just in
>>> case one of these settings is causing a problem?
>
> Hmm, so for instance in LogicalIncreaseRestartDecodingForSlot() we have
> some elog(DEBUG1) calls with the slot spinlock held. That's probably
> uncool.

I can reproduce the issue with --client-min-messages=debug2 or debug1,
but it doesn't appear with the default settings. I don't always get the
"stuck spinlock" message, but it hangs badly pretty reliably after two
or three rounds of erroring.

Removing the call you pointed out doesn't make a difference, but it's
possibly something similar.
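
For illustration, here is a minimal sketch of the general pattern under
discussion: copy what you need while holding the spinlock, release it, and
only then format or emit the message. This is not PostgreSQL source and is
not claimed to be the cause of this hang. DemoSlot, demo_lock, demo_unlock
and advance_restart_lsn are hypothetical names, and a C11 atomic_flag
stands in for the slot spinlock.

```c
/* Illustrative sketch only; all names here are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct DemoSlot
{
    atomic_flag mutex;          /* stand-in for the slot spinlock */
    uint64_t    restart_lsn;    /* value protected by the spinlock */
} DemoSlot;

static void
demo_lock(DemoSlot *slot)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&slot->mutex,
                                             memory_order_acquire))
        ;
}

static void
demo_unlock(DemoSlot *slot)
{
    atomic_flag_clear_explicit(&slot->mutex, memory_order_release);
}

/*
 * Advance restart_lsn if new_lsn is newer.  The critical section only
 * reads and writes plain fields; the message is built after unlocking.
 * Returns 1 if the slot was updated, 0 otherwise.
 */
static int
advance_restart_lsn(DemoSlot *slot, uint64_t new_lsn,
                    char *msg, size_t msglen)
{
    uint64_t    old_lsn;
    int         updated = 0;

    demo_lock(slot);
    old_lsn = slot->restart_lsn;
    if (new_lsn > old_lsn)
    {
        slot->restart_lsn = new_lsn;
        updated = 1;
    }
    demo_unlock(slot);

    /* Safe to do slow or blocking work now: the lock is released. */
    snprintf(msg, msglen, "restart_lsn %s: old=%llu new=%llu",
             updated ? "advanced" : "unchanged",
             (unsigned long long) old_lsn,
             (unsigned long long) new_lsn);
    return updated;
}
```

The point of the shape is that nothing that can block on a client socket
or raise an error runs between demo_lock and demo_unlock.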

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
