From: | "Jehan-Guillaume (ioguix) de Rorthais" <ioguix(at)free(dot)fr> |
---|---|
To: | Vikas Sharma <shavikas(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Postgresql Split Brain: Which one is latest |
Date: | 2018-04-10 19:19:30 |
Message-ID: | 20180410211930.10fa058f@firost |
Lists: | pgsql-general |
On Tue, 10 Apr 2018 17:02:39 +0000
Vikas Sharma <shavikas(at)gmail(dot)com> wrote:
> Max count is one way (vague, I agree); before confirming, I will ask the
> application owner to have a look at the data in the tables as well.
Maybe you could compare your tables on both sides using a tool like
pg_comparator? See:
https://cri.ensmp.fr/people/coelho/pg_comparator/pg_comparator.html
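To give a rough idea of what such a comparison involves, here is a minimal Python sketch. It is not pg_comparator's actual implementation, only the basic idea of comparing per-row checksums keyed by primary key. The function names `row_checksum` and `compare_tables` are hypothetical, and the sketch assumes the rows of one table have already been fetched from each node into a dict mapping primary key to row tuple:

```python
import hashlib


def row_checksum(row):
    """Return a stable checksum for one row (a tuple of column values)."""
    # repr() keeps None, '' and 0 distinct; \x1f separates the columns.
    encoded = "\x1f".join(repr(value) for value in row)
    return hashlib.md5(encoded.encode("utf-8")).hexdigest()


def compare_tables(rows_a, rows_b):
    """Compare two {primary_key: row_tuple} mappings, one per node.

    Returns three sorted lists of primary keys:
    keys only on node A, keys only on node B, and keys present on
    both nodes whose row contents differ.
    """
    sums_a = {key: row_checksum(row) for key, row in rows_a.items()}
    sums_b = {key: row_checksum(row) for key, row in rows_b.items()}

    only_a = sorted(sums_a.keys() - sums_b.keys())
    only_b = sorted(sums_b.keys() - sums_a.keys())
    differing = sorted(
        key for key in sums_a.keys() & sums_b.keys()
        if sums_a[key] != sums_b[key]
    )
    return only_a, only_b, differing


# Made-up sample data standing in for the same table on each master.
node_a = {1: ("alice", 10), 2: ("bob", 20), 3: ("carol", 30)}
node_b = {1: ("alice", 10), 2: ("bob", 25), 4: ("dave", 40)}

only_a, only_b, differing = compare_tables(node_a, node_b)
print("only on A:", only_a)
print("only on B:", only_b)
print("differ:", differing)
```

Note that such a diff only tells you where the two sides diverge, not which side is "right": rows written or changed on each node after the split still have to be reviewed by someone who knows the application.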
By the way, what are you using for your auto-failover? What went wrong for
you to end up in a split-brain situation?
Regards,
> On Tue, Apr 10, 2018, 17:55 Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>
> > On 04/10/2018 09:47 AM, Vikas Sharma wrote:
> > > Thanks Adrian and Edison, I also think so. At the moment I have 2
> > > masters. As soon as the slave is promoted to master, it starts its own
> > > timeline, and the application might have added data to either of them
> > > or both. The only way to find out the correct master now is to pick the
> > > instance with the max count of data in tables, which could incur data
> > > loss as well. Correct me if I am wrong, please?
> >
> > Not sure max count is necessarily a valid indicator:
> >
> > 1) What if there was a legitimate large delete process?
> >
> > 2) The application/end users were looking at two different views of the
> > data at different points in time. Just because the count is higher does
> > not mean the data is actually valid.
> >
> > >
> > > Thanks and Regards
> > > Vikas
> > >
> > > On Tue, Apr 10, 2018, 17:29 Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:
> > >
> > > On 04/10/2018 08:04 AM, Vikas Sharma wrote:
> > > > Hi Adrian,
> > > >
> > > > This can be a good example: an application server, e.g. Tomcat, having
> > > > two entries to connect to databases, one for the master and a 2nd for
> > > > the slave (ideally used when the slave becomes master). If the
> > > > application is not able to connect to the first, it will try to
> > > > connect to the 2nd.
> > >
> > > So the application server had a way of seeing the new master (old
> > > slave), in spite of the network glitch, that the original master
> > > database did not?
> > >
> > > If so, and it was distributing data between the two masters on an
> > > unknown schedule, then as Edison pointed out in another post, you
> > > really have a split brain issue. Each master would have its own view
> > > of the data, and the latest update would really only be relevant for
> > > that master.
> > >
> > > >
> > > > Regards
> > > > Vikas
> > > >
> > > > On 10 April 2018 at 15:26, Adrian Klaver
> > > > <adrian(dot)klaver(at)aklaver(dot)com> wrote:
> > > >
> > > > On 04/10/2018 06:50 AM, Vikas Sharma wrote:
> > > >
> > > > Hi,
> > > >
> > > > We have postgresql 9.5 with streaming replication (Master-slave)
> > > > and automatic failover. Due to a network glitch we have been in a
> > > > master-master situation for quite some time. Please, could you
> > > > advise the best way to confirm which node is the latest in terms
> > > > of updates to the postgres databases?
> > > >
> > > >
> > > > It might help to know how the two masters received data when they
> > > > were operating independently.
> > > >
> > > >
> > > > Regards
> > > > Vikas Sharma
> > > >
> > > >
> > > >
> > > > --
> > > > Adrian Klaver
> > > > adrian(dot)klaver(at)aklaver(dot)com
> > > >
> > > >
> > >
> > >
> > > --
> > > Adrian Klaver
> > > adrian(dot)klaver(at)aklaver(dot)com
> > >
> >
> >
> > --
> > Adrian Klaver
> > adrian(dot)klaver(at)aklaver(dot)com
> >
--
Jehan-Guillaume de Rorthais
Dalibo