From: | Erik Rijkers <er(at)xs4all(dot)nl> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org |
Subject: | Re: logical replication - still unstable after all these months |
Date: | 2017-05-26 07:27:16 |
Message-ID: | 2248d971c274c30615254594f5c2dbf0@xs4all.nl |
Lists: | pgsql-hackers |
On 2017-05-26 08:58, Simon Riggs wrote:
> On 26 May 2017 at 07:10, Erik Rijkers <er(at)xs4all(dot)nl> wrote:
>
>> - Do you agree this number of failures is far too high?
>> - Am I the only one finding so many failures?
>
> What type of failure are you getting?
The failure is that in the result state the replicated tables differ
from the original tables.
For instance,
-- out_20170525_0944.txt
100 -- pgbench -c 90 -j 8 -T 60 -P 12 -n -- scale 25
93 -- All is well.
7 -- Not good.
These numbers mean: the result state of primary and replica is not the
same, in 7 out of 100 runs.
'not the same state' means: at least one of the 4 md5's of the sorted
content of the 4 pgbench tables on the primary is different from the
corresponding md5 taken from the replica.
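
To make the comparison concrete, here is a minimal sketch of that check. It is not the actual test script: the helper names `table_md5` and `differing_tables` are hypothetical, and it assumes the rows of each table have already been fetched into Python lists of tuples (one list per table, from primary and from replica).

```python
import hashlib

# The four standard pgbench tables.
PGBENCH_TABLES = (
    "pgbench_accounts",
    "pgbench_branches",
    "pgbench_tellers",
    "pgbench_history",
)


def table_md5(rows):
    """Return the md5 hex digest of a table's rows, independent of row order.

    Hypothetical helper: rows are tuples of column values. Each row is
    rendered with repr() and the rendered lines are sorted, so NULLs (None)
    do not break the ordering.
    """
    digest = hashlib.md5()
    for line in sorted(repr(row) for row in rows):
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def differing_tables(primary, replica):
    """Return the names of the pgbench tables whose md5 differs.

    `primary` and `replica` are dicts mapping table name -> list of rows.
    An empty result corresponds to 'All is well'; anything else is 'Not good'.
    """
    return [
        name
        for name in PGBENCH_TABLES
        if table_md5(primary[name]) != table_md5(replica[name])
    ]
```

A run counts as good only when `differing_tables(...)` returns an empty list; a single differing table is enough to count the run as a failure.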
So, 'failure' means: the 4 pgbench tables on primary and replica are not
exactly the same after the (one-minute) pgbench run has finished and
logical replication has 'finished'. Plenty of time is given for the
replica to catch up: the test only calls 'failure' after waiting 20 times
(for 15 seconds each) and finding the same erroneous state all 20 times
(erroneous because it is not the same as on the primary).
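
The waiting logic can be sketched roughly as follows. This is an illustration only, not the actual test harness: `wait_for_replica` and its `states_match` callback are hypothetical names, and the sketch assumes a simple wait-then-check order with the 20 attempts and 15-second pause described above as defaults.

```python
import time


def wait_for_replica(states_match, attempts=20, delay_s=15, sleep=time.sleep):
    """Give the replica time to catch up before declaring failure.

    states_match: hypothetical zero-argument callable that returns True when
                  the primary and replica tables are identical.
    Returns True as soon as a check succeeds, or False once all `attempts`
    checks have found the states still different.
    """
    for _ in range(attempts):
        sleep(delay_s)        # wait before each check
        if states_match():
            return True       # replica caught up: 'All is well'
    return False              # still different after every wait: 'Not good'
```

With the defaults, a run is reported as a failure only after 20 x 15 seconds = 5 minutes of waiting. The `sleep` parameter is there only so the loop can be exercised without real delays.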
I would really like to know if you think that this doesn't amount to
'failure'.