Re: Postgresql Split Brain: Which one is latest

From: "Jehan-Guillaume (ioguix) de Rorthais" <ioguix(at)free(dot)fr>
To: Vikas Sharma <shavikas(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Postgresql Split Brain: Which one is latest
Date: 2018-04-10 19:19:30
Message-ID: 20180410211930.10fa058f@firost
Lists: pgsql-general

On Tue, 10 Apr 2018 17:02:39 +0000
Vikas Sharma <shavikas(at)gmail(dot)com> wrote:

> Max count is one way (vague I agree), before confirming I will ask the
> application owner to have a look at the data in the tables as well.

Maybe you could compare your tables on both sides using a tool like
pg_comparator? See:

https://cri.ensmp.fr/people/coelho/pg_comparator/pg_comparator.html
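
To give an idea of what such a comparison reports, here is a minimal Python
sketch of a checksum-based diff between two copies of one table. It is not how
pg_comparator is implemented, only an illustration of the principle. The names
row_checksum, compare_tables, node_a and node_b are hypothetical, and the rows
are assumed to have already been fetched from each node as a mapping from
primary key to row tuple.

```python
import hashlib


def row_checksum(row):
    """Return a stable checksum for one row (a tuple of column values)."""
    # repr() keeps None distinct from the string 'None'; the unit separator
    # keeps adjacent column values from running together.
    encoded = "\x1f".join(repr(value) for value in row)
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def compare_tables(rows_a, rows_b):
    """Compare two {primary_key: row_tuple} mappings.

    Returns three sorted lists of primary keys: rows present only in A,
    rows present only in B, and rows present in both with different content.
    """
    sums_a = {key: row_checksum(row) for key, row in rows_a.items()}
    sums_b = {key: row_checksum(row) for key, row in rows_b.items()}

    only_a = sorted(sums_a.keys() - sums_b.keys())
    only_b = sorted(sums_b.keys() - sums_a.keys())
    differing = sorted(
        key for key in sums_a.keys() & sums_b.keys()
        if sums_a[key] != sums_b[key]
    )
    return only_a, only_b, differing


# Hypothetical sample data standing in for the same table on each master.
node_a = {1: ("alice", 10), 2: ("bob", 20), 3: ("carol", 30)}
node_b = {1: ("alice", 10), 2: ("bob", 25), 4: ("dave", 40)}

only_a, only_b, differing = compare_tables(node_a, node_b)
print("only on A:", only_a)
print("only on B:", only_b)
print("differing:", differing)
```

Note that all three lists can be non-empty at the same time, which is exactly
why a plain row count cannot tell you which node is "latest".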

By the way, what are you using for your auto-failover? What went wrong for you
to end up in a split-brain situation?

Regards,

> On Tue, Apr 10, 2018, 17:55 Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>
> > On 04/10/2018 09:47 AM, Vikas Sharma wrote:
> > > Thanks Adrian and Edison, I also think so. At the moment I have 2
> > > masters. As soon as the slave is promoted to master it starts its own
> > > timeline, and the application might have added data to either of them
> > > or both. The only way to find the correct master now is to pick the
> > > instance with the max count of data in tables, which could incur data
> > > loss as well. Correct me if wrong please?
> >
> > Not sure max count is necessarily a valid indicator:
> >
> > 1) What if there was a legitimate large delete process?
> >
> > 2) The application/end users were looking at two different views of the
> > data at different points in time. Just because the count is higher does
> > not mean the data is actually valid.
> >
> > >
> > > Thanks and Regards
> > > Vikas
> > >
> > > On Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:
> > >
> > > On 04/10/2018 08:04 AM, Vikas Sharma wrote:
> > > > Hi Adrian,
> > > >
> > > > This can be a good example: Application server e.g. tomcat having two
> > > > entries to connect to databases, one for master and 2nd for Slave
> > > > (ideally used when slave becomes master). If application is not able to
> > > > connect to first, it will try to connect to 2nd.
> > >
> > > So the application server had a way of seeing the new master (old slave),
> > > in spite of the network glitch, that the original master database
> > > did not?
> > >
> > > If so and it was distributing data between the two masters on an unknown
> > > schedule, then as Edison pointed out in another post, you really have a
> > > split brain issue. Each master would have its own view of the data and
> > > latest update would really only be relevant for that master.
> > >
> > > >
> > > > Regards
> > > > Vikas
> > > >
> > > > On 10 April 2018 at 15:26, Adrian Klaver
> > > > <adrian(dot)klaver(at)aklaver(dot)com> wrote:
> > > >
> > > > On 04/10/2018 06:50 AM, Vikas Sharma wrote:
> > > >
> > > > Hi,
> > > >
> > > > We have postgresql 9.5 with streaming replication (Master-slave)
> > > > and automatic failover. Due to network glitch we are in
> > > > master-master situation for quite some time. Please, could you
> > > > advise best way to confirm which node is latest in terms of
> > > > updates to the postgres databases.
> > > >
> > > >
> > > > It might help to know how the two masters received data when they
> > > > were operating independently.
> > > >
> > > >
> > > > Regards
> > > > Vikas Sharma
> > > >
> > > >
> > > >
> > > > --
> > > > Adrian Klaver
> > > > adrian(dot)klaver(at)aklaver(dot)com
> > > >
> > > >
> > >
> > >
> > > --
> > > Adrian Klaver
> > > adrian(dot)klaver(at)aklaver(dot)com
> > >
> >
> >
> > --
> > Adrian Klaver
> > adrian(dot)klaver(at)aklaver(dot)com
> >

--
Jehan-Guillaume de Rorthais
Dalibo
