From: | Erik Rijkers <er(at)xs4all(dot)nl> |
---|---|
To: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Cc: | PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: TRAP: FailedAssertion("!(TransactionIdPrecedesOrEquals |
Date: | 2017-12-20 06:33:45 |
Message-ID: | f4bc19a726ac9fce7e47c69ddf018cbc@xs4all.nl |
Lists: | pgsql-hackers |
On 2017-12-20 06:27, Michael Paquier wrote:
> On Wed, Dec 20, 2017 at 7:46 AM, Erik Rijkers <er(at)xs4all(dot)nl> wrote:
TRAP: FailedAssertion("!(TransactionIdPrecedesOrEquals(safeXid,
snap->xmin))", File: "snapbuild.c", Line: 580)
>> Sorry, that was probably too terse, I should explain that a little.
>>
>> After initing 50 instances, I set up and run a pgbench session in the
>> master session; the pgbench lines are:
>>
>> init: pgbench --port=6515 --quiet --initialize --scale=1 postgres
>> run: pgbench -M prepared -c 16 -j 8 -T 1 -P 1 -n postgres -- scale 1
>>
>> the other instances then catch up. The whole takes 5 minutes or so
>>
>> I vary scale, duration, and number of instances. I haven't had it
>> fail in this way yet but I mostly tried with lower number of
>> instances (up to 25 or so).
>
> Hm. Are you saying that it takes at least 50 cascading instances to
> see the problem you are seeing? And that you are not seeing any
> problems with a lower number of cascading instances? Are you enabling
> hot_standby_feedback?
That sounds more definitive than I meant it, but yes, only now that I
tried a higher number of instances did I see this. But it also often
succeeds at up to 100 instances (100 is the highest I have tried).
These 50 instances were a logical replication chain, and
hot_standby_feedback is off.
Overnight I ran the test that failed yesterday 80 times: all 80 runs
succeeded. I am not sure what makes it fail rather than succeed.
(Logical replication does the initial syncing of the instances
sequentially, one by one, so the machine isn't as busy as one might
expect; it just takes a long time.)
I wrote a simple Perl program to test logical replication (attached,
FWIW), which I run as:
./cascade.pl --instances=50 --scale=1 --clients=16 --threads=8 \
    --duration=1 --repeats=3 --waiting=10
This cascade.pl program uses knowledge of my setup, so it probably won't
run elsewhere as is, but it shows how the failing test was done.
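For readers without the attachment, the shape of such a logical replication
chain can be sketched roughly as below. This is not taken from cascade.pl; it
is a minimal illustration under assumptions: instance 0 is the master on port
6515 (the port in the pgbench init line), each further instance listens on the
next consecutive port, every instance already has the pgbench tables created,
and the names cascade_plan, pubN and subN are made up for the example. It only
builds the SQL that would be run on each port; it does not connect to anything.

```python
from typing import List, Tuple

# Port of the master, as in the pgbench init line. That the other instances
# use consecutive ports is an assumption made for this sketch.
BASE_PORT = 6515


def cascade_plan(instances: int, base_port: int = BASE_PORT) -> List[Tuple[int, str]]:
    """Return (port, sql) pairs chaining the instances: node i subscribes to node i-1.

    Hypothetical helper; publication/subscription names are illustrative.
    """
    if instances < 2:
        raise ValueError("a cascade needs at least two instances")
    plan: List[Tuple[int, str]] = []
    for i in range(1, instances):
        upstream = base_port + i - 1
        downstream = base_port + i
        # Publish all tables on the upstream node ...
        plan.append((upstream, f"CREATE PUBLICATION pub{i} FOR ALL TABLES;"))
        # ... and subscribe to that publication from the next node in the chain.
        plan.append((
            downstream,
            f"CREATE SUBSCRIPTION sub{i} "
            f"CONNECTION 'port={upstream} dbname=postgres' "
            f"PUBLICATION pub{i};",
        ))
    return plan


if __name__ == "__main__":
    # Print the statements for a short three-node chain.
    for port, sql in cascade_plan(3):
        print(port, sql)
```

Each statement would then be fed to the instance on the given port (for
example with psql) before starting pgbench on the master.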
Erik
Attachment | Content-Type | Size |
---|---|---|
cascade.pl | text/x-perl | 26.0 KB |