Re: replication consistency checking

From: Wei Shan <weishan(dot)ang(at)gmail(dot)com>
To: hydra <hydrapolic(at)gmail(dot)com>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: replication consistency checking
Date: 2015-06-10 06:56:47
Message-ID: CAFe9ZTpVf4-1sx4fBPx=O2VPnGyxpNFbv=W3yUxR5ONhOcpJEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

That's called switchover my friend...

On 10 June 2015 at 02:57, hydra <hydrapolic(at)gmail(dot)com> wrote:

>
>
> On Tue, Jun 9, 2015 at 6:06 AM, Wei Shan <weishan(dot)ang(at)gmail(dot)com> wrote:
>
>> Hi,
>>
>> Usually, for failover, we don't do data consistency check. because it's
>> unplanned failover! In my opinion, it's quite safe to trust
>> pg_stat_replication because it's a binary level replication so it's hard to
>> get row missing unless it's a bug. You got higher chance of getting
>> corruption. However, for switchover, we will do the following;
>>
>> 1. Checking pg_stat_replication views
>> 2. Have a script that do a row count on all the data tables for a period
>> of time. We should see that the 2 database have have very similar row count
>> and maybe a slight delay of 1s?
>>
>> I know that this is not perfect, but at least it's something to start
>> with. The only tool out that that's doing this is Oracle VeriData but I'm
>> not too sure how they are doing this.
>>
>> Cheers.
>>
>>
> Hello,there are situation you wish to do a failover and not just because
> it's unplanned. For example when migrating to a better hardware.
>
>
>
>> On 6 June 2015 at 12:43, hydra <hydrapolic(at)gmail(dot)com> wrote:
>>
>>>
>>>
>>> On Fri, Jun 5, 2015 at 4:56 PM, Scott Ribe <scott_ribe(at)elevated-dev(dot)com>
>>> wrote:
>>>
>>>> On Jun 5, 2015, at 8:42 AM, Igor Neyman <ineyman(at)perceptron(dot)com> wrote:
>>>> >
>>>> > The problem I see with “checksum utility” is that for it to work both
>>>> compared servers should be “static”: not transactions while it does its
>>>> job.
>>>>
>>>> Indeed, and that was brought up before and OP seems to be ignoring it.
>>>> What magic does MySQL (supposedly) use to compare databases without
>>>> interfering with updates?
>>>>
>>>> One could imagine a built-in feature in PG which depends on using MVCC
>>>> and having both sides look at the same snapshot. (Which would require
>>>> repeatable reads.)
>>>>
>>>> But for an external utility, that's quite a bit harder. One suggestion
>>>> which would involve minimal interruption to processing: if you have a
>>>> snapshottable file system, shut down master, take snapshot, shut down
>>>> replica, bring master back up, snapshot replica, bring it back up. You
>>>> *still* have the issue of making sure that at the moment you take master
>>>> down all changes have been streamed to replica, and the (easier) issue of
>>>> making sure replica has applied them before taking its snapshot... But if
>>>> you can manage to pull that off, then you can checksum & compare as much as
>>>> you want, *IF* you can actually pull that off correctly ;-)
>>>>
>>>>
>>>
>>> Hello,
>>> I wasn't talking about a static check, indeed I was referring to an
>>> online tool.
>>>
>>> I haven't read the sources for the MySQL tool yet, however it computes
>>> the checksum on the master, writes the operation to the binary log and
>>> while using statement replication, the slave computes the checksum - then
>>> those can be compared. Not all data in the table are checksummed at once,
>>> but smaller chunks are used instead.
>>>
>>> As Igor mentioned before, that tool can also fail and is not 100%. But I
>>> suppose if it would be misbehaving, chances are the checksums will differ
>>> and you will notice it. It will probably not by accident compute the same
>>> crc/md5 whatever.
>>>
>>> I have done some tests with it, comparing data after setting up
>>> replication, all went fine, then I changed some integer on the slave, it
>>> really computed different checkum as expected. I also did a logical dump
>>> compare, it really seems to be doing its job fine.
>>>
>>>
>>>
>>>> --
>>>> Scott Ribe
>>>> scott_ribe(at)elevated-dev(dot)com
>>>> http://www.elevated-dev.com/
>>>> https://www.linkedin.com/in/scottribe/
>>>> (303) 722-0567 voice
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Regards,
>> Ang Wei Shan
>>
>
>

--
Regards,
Ang Wei Shan

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message neeraj tonu 2015-06-10 10:36:05 Re: barman binary not working properly
Previous Message Jan Lentfer 2015-06-09 20:15:01 Re: replication consistency checking