| From: | Erik Rijkers <er(at)xs4all(dot)nl> | 
|---|---|
| To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> | 
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions | 
| Date: | 2017-12-24 09:00:00 | 
| Message-ID: | 84b7076830fbedc155670b859926e99e@xs4all.nl | 
| Lists: | pgsql-hackers | 
>>>> 
>>>> logical replication of 2 instances is OK but 3 and up fail with:
>>>> 
>>>> TRAP: FailedAssertion("!(last_lsn < change->lsn)", File: "reorderbuffer.c", Line: 1773)
>>>> 
>>>> I can cobble up a script but I hope you have enough from the assertion
>>>> to see what's going wrong...
>>> 
>>> The assertion says that the iterator produces changes in order that does
>>> not correlate with LSN. But I have a hard time understanding how that
>>> could happen, particularly because according to the line number this
>>> happens in ReorderBufferCommit(), i.e. the current (non-streaming) case.
>>> 
>>> So instructions to reproduce the issue would be very helpful.
>> 
>> Using:
>> 
>> 0001-Introduce-logical_work_mem-to-limit-ReorderBuffer-v2.patch
>> 0002-Issue-XLOG_XACT_ASSIGNMENT-with-wal_level-logical-v2.patch
>> 0003-Issue-individual-invalidations-with-wal_level-log-v2.patch
>> 0004-Extend-the-output-plugin-API-with-stream-methods-v2.patch
>> 0005-Implement-streaming-mode-in-ReorderBuffer-v2.patch
>> 0006-Add-support-for-streaming-to-built-in-replication-v2.patch
>> 
>> As you expected the problem is the same with these new patches.
>> 
>> I have now tested more, and seen that it does not always fail.  I guess
>> that here it fails 3 times out of 4.  But the laptop I'm using at the
>> moment is old and slow -- it may well be a factor as we've seen before [1].
>> 
>> Attached is the bash that I put together.  I tested with
>> NUM_INSTANCES=2, which yields success, and NUM_INSTANCES=3, which fails
>> often.  This same program run with HEAD never seems to fail (I tried a
>> few dozen times).
>> 
> 
> Thanks. Unfortunately I still can't reproduce the issue. I even tried
> running it in valgrind, to see if there are some memory access issues
> (which should also slow it down significantly).

One wonders again if 2ndquadrant shouldn't invest in some old hardware ;)

Another Good Thing would be if there was a provision in the buildfarm to
test patches like these.  But I'm probably not the first one to suggest
that; no doubt it'll be possible someday.  In the meantime I'll try to
repeat this crash on other machines (but that will be after the holidays).

Erik Rijkers