Re: replication consistency checking

From: hydra <hydrapolic(at)gmail(dot)com>
To: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: replication consistency checking
Date: 2015-06-09 18:57:49
Message-ID: CAG6MAzRCA3n3G8f=bAqG6-HweVwFvMpniwfHviOhLDN6jY=hNg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Tue, Jun 9, 2015 at 6:06 AM, Wei Shan <weishan(dot)ang(at)gmail(dot)com> wrote:

> Hi,
>
> Usually, for failover, we don't do data consistency check. because it's
> unplanned failover! In my opinion, it's quite safe to trust
> pg_stat_replication because it's a binary level replication so it's hard to
> get row missing unless it's a bug. You got higher chance of getting
> corruption. However, for switchover, we will do the following;
>
> 1. Checking pg_stat_replication views
> 2. Have a script that do a row count on all the data tables for a period
> of time. We should see that the 2 database have have very similar row count
> and maybe a slight delay of 1s?
>
> I know that this is not perfect, but at least it's something to start
> with. The only tool out that that's doing this is Oracle VeriData but I'm
> not too sure how they are doing this.
>
> Cheers.
>
>
Hello,there are situation you wish to do a failover and not just because
it's unplanned. For example when migrating to a better hardware.

> On 6 June 2015 at 12:43, hydra <hydrapolic(at)gmail(dot)com> wrote:
>
>>
>>
>> On Fri, Jun 5, 2015 at 4:56 PM, Scott Ribe <scott_ribe(at)elevated-dev(dot)com>
>> wrote:
>>
>>> On Jun 5, 2015, at 8:42 AM, Igor Neyman <ineyman(at)perceptron(dot)com> wrote:
>>> >
>>> > The problem I see with “checksum utility” is that for it to work both
>>> compared servers should be “static”: not transactions while it does its
>>> job.
>>>
>>> Indeed, and that was brought up before and OP seems to be ignoring it.
>>> What magic does MySQL (supposedly) use to compare databases without
>>> interfering with updates?
>>>
>>> One could imagine a built-in feature in PG which depends on using MVCC
>>> and having both sides look at the same snapshot. (Which would require
>>> repeatable reads.)
>>>
>>> But for an external utility, that's quite a bit harder. One suggestion
>>> which would involve minimal interruption to processing: if you have a
>>> snapshottable file system, shut down master, take snapshot, shut down
>>> replica, bring master back up, snapshot replica, bring it back up. You
>>> *still* have the issue of making sure that at the moment you take master
>>> down all changes have been streamed to replica, and the (easier) issue of
>>> making sure replica has applied them before taking its snapshot... But if
>>> you can manage to pull that off, then you can checksum & compare as much as
>>> you want, *IF* you can actually pull that off correctly ;-)
>>>
>>>
>>
>> Hello,
>> I wasn't talking about a static check, indeed I was referring to an
>> online tool.
>>
>> I haven't read the sources for the MySQL tool yet, however it computes
>> the checksum on the master, writes the operation to the binary log and
>> while using statement replication, the slave computes the checksum - then
>> those can be compared. Not all data in the table are checksummed at once,
>> but smaller chunks are used instead.
>>
>> As Igor mentioned before, that tool can also fail and is not 100%. But I
>> suppose if it would be misbehaving, chances are the checksums will differ
>> and you will notice it. It will probably not by accident compute the same
>> crc/md5 whatever.
>>
>> I have done some tests with it, comparing data after setting up
>> replication, all went fine, then I changed some integer on the slave, it
>> really computed different checkum as expected. I also did a logical dump
>> compare, it really seems to be doing its job fine.
>>
>>
>>
>>> --
>>> Scott Ribe
>>> scott_ribe(at)elevated-dev(dot)com
>>> http://www.elevated-dev.com/
>>> https://www.linkedin.com/in/scottribe/
>>> (303) 722-0567 voice
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
>
> --
> Regards,
> Ang Wei Shan
>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Jan Lentfer 2015-06-09 19:20:53 Re: replication consistency checking
Previous Message Jan Lentfer 2015-06-09 08:47:16 Re: cache Memory of server