Re: logical replication busy-waiting on a lock

From: Andres Freund <andres(at)anarazel(dot)de>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>,pgsql-hackers(at)postgresql(dot)org,Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: logical replication busy-waiting on a lock
Date: 2017-05-29 19:23:04
Message-ID: 50A4E5BE-3E68-4BA0-AEEA-8F984D373F9A@anarazel.de
Lists: pgsql-hackers

On May 29, 2017 12:21:50 PM PDT, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
>On 29/05/17 20:59, Andres Freund wrote:
>>
>>
>> On May 29, 2017 11:58:05 AM PDT, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
>>> On 27/05/17 17:17, Andres Freund wrote:
>>>>
>>>>
>>>> On May 27, 2017 9:48:22 AM EDT, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
>>>>> Actually, I guess it's the pid 47457 (COPY process) who is actually
>>>>> running the xid 73322726. In that case that's the same thing Masahiko
>>>>> Sawada reported [1]. Which basically is result of snapshot builder
>>>>> waiting for transaction to finish, that's normal if there is a long
>>>>> transaction running when the snapshot is being created (and the COPY is
>>>>> a long transaction).
>>>>
>>>> Hm. I suspect the issue is that the exported snapshot needs an xid for
>>>> some crosscheck, and that's what we're waiting for. Could you check
>>>> what happens if you don't assign one and just comment the error checks
>>>> out? Not at my computer, just theorizing.
>>>>
>>>
>>> I don't think that's it, in my opinion it's the parallelization of table
>>> data copy where we create snapshot for one process but then the next one
>>> has to wait for the first one to finish. Before we fixed the
>>> snapshotting, the second one would just use the ondisk snapshot so it
>>> would work fine (except the snapshot was corrupted of course). I wonder
>>> if we could somehow give it a hint to ignore the read-only txes, but
>>> then we have no way to enforce the txes to stay read-only so it does not
>>> seem safe.
>>
>> Read-only txs have no xid ...
>>
>
>That's what I mean by hinting, normally they don't but building initial
>snapshot in snapshot builder calls GetTopTransactionId() (see
>SnapBuildInitialSnapshot()) which will assign it xid.

That's precisely what I pointed out a few emails above, and what I suggest changing.
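
To make the mechanism concrete, here is a toy model (not PostgreSQL code) of lazy xid assignment. It only illustrates the point above: a read-only transaction has no xid until something like the GetTopTransactionId() call in SnapBuildInitialSnapshot() forces one, and from then on a later snapshot builder has to wait for it. All the toy_* names, the ToyTxn struct and the starting counter value are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are PostgreSQL definitions. */
typedef uint32_t ToyXid;
#define TOY_INVALID_XID ((ToyXid) 0)

typedef struct ToyTxn
{
    ToyXid xid;                 /* TOY_INVALID_XID until one is forced */
} ToyTxn;

static ToyXid toy_next_xid = 100;   /* arbitrary starting value */

/* Look at the xid without assigning one (read-only txns stay xid-less). */
static ToyXid
toy_get_xid_if_any(const ToyTxn *txn)
{
    return txn->xid;
}

/* Return the xid, assigning a fresh one on first use. */
static ToyXid
toy_get_xid(ToyTxn *txn)
{
    if (txn->xid == TOY_INVALID_XID)
        txn->xid = toy_next_xid++;
    assert(txn->xid != TOY_INVALID_XID);
    return txn->xid;
}

/*
 * Assumed simplification: a snapshot builder only has to wait for
 * transactions that own an xid.
 */
static bool
toy_builder_must_wait_for(const ToyTxn *txn)
{
    return toy_get_xid_if_any(txn) != TOY_INVALID_XID;
}
```

In this model a freshly started ToyTxn is invisible to toy_builder_must_wait_for() until toy_get_xid() is called on it once; after that the next builder waits for it, which is the busy-wait being discussed. Not assigning the xid in the first place avoids that.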

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
