From: | Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Erik Rijkers <er(at)xs4all(dot)nl>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Logical replication - TRAP: FailedAssertion in pgstat.c |
Date: | 2017-05-08 16:26:36 |
Message-ID: | 5f656818-4800-62e7-7c28-b2f1b584e2cb@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 08/05/17 17:52, Masahiko Sawada wrote:
> On Fri, May 5, 2017 at 8:13 PM, Petr Jelinek
> <petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
>> On 03/05/17 13:23, Erik Rijkers wrote:
>>> On 2017-05-03 08:17, Petr Jelinek wrote:
>>>> On 02/05/17 20:43, Robert Haas wrote:
>>>>> On Thu, Apr 20, 2017 at 2:58 PM, Peter Eisentraut
>>>
>>>>>> code path that calls CommitTransactionCommand() should have one, no?
>>>>>
>>>>> Is there anything left to be committed here?
>>>>>
>>>>
>>>> Afaics the fix was not committed. Peter wanted more comprehensive fix
>>>> which didn't happen. I think something like attached should do the job.
>>>
>>> I'm running my pgbench-over-logical-replication test in chunk of 15
>>> minutes, wth different pgbench -c (num clients) and -s (scale) values.
>>>
>>> With this patch (and nothing else) on top of master (8f8b9be51fd7 to be
>>> precise):
>>>
>>>> fix-statistics-reporting-in-logical-replication-work.patch
>>>
>>> logical replication is still often failing (as expected, I suppose; it
>>> seems because of "inital snapshot too large") but indeed I do not see
>>
>> Yes that's different thing that we've been discussing a bit in snapbuild
>> woes thread.
>>
>>> the 'TRAP: FailedAssertion in pgstat.c' anymore.
>>>
>>> (If there is any other configuration of patches worth testing please let
>>> me know)
>>>
>>
>> Thanks, so the patch works.
>>
>
> I think that we should commit the local transaction that did initial
> data copy, and then report stat as well. Currently table sync worker
> doesn't commit the local transaction in LogicalRepSyncTableStart
> (maybe until apply commit record?) if its status is changed to
> SUBREL_STATE_CATCHUP. That's why the table sync worker issues
> assertion failure.
>
That would fix the assert as well yes, but it would also mean that if
the worker crashed between the initial copy and the end of catchup there
would be no way to restart it without manual intervention from user
since the synchronization position would be lost. Hence the fix I
proposed which does it differently and has the whole sync in a single
transaction.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Petr Jelinek | 2017-05-08 16:27:22 | Re: Logical replication - TRAP: FailedAssertion in pgstat.c |
Previous Message | Robert Haas | 2017-05-08 16:26:21 | Re: Adding support for Default partition in partitioning |