From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Bob Hatfield <bobhatfield(at)gmail(dot)com> |
Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: pg 8.3 replication causing corruption |
Date: | 2011-10-13 21:20:07 |
Message-ID: | CAHyXU0yBBQyyvF6Vx-9Mci7JkJhtU41Gg4P7LP_RSF=0VBS+UQ@mail.gmail.com |
Lists: | pgsql-general |
On Thu, Oct 13, 2011 at 4:07 PM, Bob Hatfield <bobhatfield(at)gmail(dot)com> wrote:
>> have you had any power events? hard shutdowns, etc? I wonder if the problem is in the clog files, and not the heap itself.
>
> Nothing unusual for as long as I can tell. Reminder that as long as I
> don't restart the primary's pg process, everything works fine
> (secondary's data is intact).
>
> It's as if stopping/starting the primary causes a shipped wal file to
> be corrupt or contain duplicated data then processed by the secondary.
My money is on clog/visibility related issues. It's a bit of a bear,
but can you pull the xmin/xmax/ctid for the two duplicate records on
the standby and the corresponding non-duplicated record on the
master? I'm curious if the heap blocks are identical and if the
standby is incorrectly marking a transaction as valid/invalid.
From there, we need to:
*) figure out the transaction bits in clog on both systems and look
them up there.
*) also, look for differences in clog generally
*) digest the heap block containing the records to see if they are identical
*) double check hint bits?
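
To make the clog lookup concrete, here is a rough sketch of how an xid maps to a pg_clog segment file, a byte offset, and a 2-bit status. It assumes a stock build (8 kB blocks, 2 status bits per transaction, 32 pages per segment); the function names are made up for illustration, and it is untested against your cluster.

```python
# Assumed stock-build constants; adjust BLCKSZ if the server was built differently.
BLCKSZ = 8192                 # bytes per clog page
XACTS_PER_BYTE = 4            # 2 status bits per transaction
PAGES_PER_SEGMENT = 32        # pages per pg_clog segment file
XACTS_PER_SEGMENT = BLCKSZ * XACTS_PER_BYTE * PAGES_PER_SEGMENT

STATUS = {0: "in progress", 1: "committed", 2: "aborted", 3: "sub-committed"}


def clog_location(xid):
    """Return (segment file name, byte offset in that file, bit shift) for xid."""
    segment = xid // XACTS_PER_SEGMENT
    byte_offset = (xid % XACTS_PER_SEGMENT) // XACTS_PER_BYTE
    shift = (xid % XACTS_PER_BYTE) * 2
    return "%04X" % segment, byte_offset, shift


def clog_status(segment_bytes, xid):
    """Decode the status of xid from the raw bytes of its segment file."""
    _, byte_offset, shift = clog_location(xid)
    return STATUS[(segment_bytes[byte_offset] >> shift) & 0x03]
```

Feed it the xmin/xmax values from the duplicate rows, read the named file under pg_clog in binary mode on each box, and compare the decoded status. Keep in mind the most recent clog page may not be on disk until a checkpoint.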
merlin