From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: New and interesting replication issues with 9.2.8 sync rep |
Date: | 2014-05-05 17:16:27 |
Message-ID: | 5367C76B.5010504@agliodbs.com |
Lists: | pgsql-hackers |
On 05/03/2014 01:07 AM, Andres Freund wrote:
> On 2014-05-02 18:57:08 -0700, Josh Berkus wrote:
>> Just got a report of a replication issue with 9.2.8 from a community member:
>>
>> Here's the sequence:
>>
>> 1) A --> B (sync rep)
>>
>> 2) Shut down B
>>
>> 3) Shut down A
>>
>> 4) Start up B as a master
>>
>> 5) Start up A as sync replica of B
>>
>> 6) A successfully joins B as a sync replica, even though its transaction
>> log is 1016 bytes *ahead* of B.
>>
>> 7) Transactions written to B all hang
>>
>> 8) Xlog on A is now corrupt, although the database itself is OK
>
> This is fundamentally borked practice.
>
>> Now, the above sequence happened because of the user misunderstanding
>> what sync rep really means. However, A should not have been able to
>> connect with B in replication mode, especially in sync rep mode; that
>> should have failed. Any thoughts on why it didn't?
>
> I'd guess that B, while starting up, has written further WAL records
> bringing it further ahead of A.
Apparently not; from what I've seen pg_stat_replication even *shows*
that the replica is ahead of the master. Further, Postgres should have
recognized that there was a timeline branch point before A's last
record, no?
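
For illustration, here is a minimal Python sketch of the comparison being described: given the master's current WAL location and the location a replica reports, work out whether the replica is ahead. The function names and the sample locations are hypothetical, not taken from the reported system. The sketch also assumes a WAL location can be treated as a flat 64-bit number (high part / low part in hex), which is a simplification and may not be byte-exact on 9.2 when the high parts differ.

```python
def parse_lsn(text):
    """Convert an 'hi/lo' hex WAL location such as '0/3000158' to an integer.

    Assumption: the location is treated as a flat 64-bit value.
    """
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def replica_lead_bytes(master_lsn, replica_lsn):
    """Return how far the replica is ahead of the master, in bytes.

    A negative result means the replica is behind, which is the normal case.
    """
    return parse_lsn(replica_lsn) - parse_lsn(master_lsn)


# Hypothetical locations, chosen only to reproduce a 1016-byte lead.
master = "0/3000000"
replica = "0/30003F8"

lead = replica_lead_bytes(master, replica)
if lead > 0:
    print(f"replica is {lead} bytes ahead of master")
```

A positive result is the situation in step 6 of the report, where one would expect the replication connection to be refused.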
I'm working on getting permission to access the DB files.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com