From: | Hannu Krosing <hannu(at)2ndQuadrant(dot)com> |
---|---|
To: | Florian Pflug <fgp(at)phlo(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, MauMau <maumau307(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Standalone synchronous master |
Date: | 2014-01-13 18:12:55 |
Message-ID: | 52D42CA7.90302@2ndQuadrant.com |
Lists: | pgsql-hackers |
On 01/13/2014 04:12 PM, Florian Pflug wrote:
> On Jan12, 2014, at 04:18 , Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> Thing is, when we talk about auto-degrade, we need to determine things
>> like "Is the replica down or is this just a network blip"? and take
>> action according to the user's desired configuration. This is not
>> something, realistically, that we can do on a single request. Whereas
>> it would be fairly simple for an external monitoring utility to do:
>>
>> 1. decide replica is offline for the duration (several poll attempts
>> have failed)
>>
>> 2. Send ALTER SYSTEM SET to the master and change/disable the
>> synch_replicas.
>>
>> In other words, if we're going to have auto-degrade, the most
>> intelligent place for it is in
>> RepMgr/HandyRep/OmniPITR/pgPoolII/whatever. It's also the *easiest*
>> place. Anything we do *inside* Postgres is going to have a really,
>> really hard time determining when to degrade.
> +1
>
> This is also how 2PC works, btw - the database provides the building
> blocks, i.e. PREPARE and COMMIT, and leaves it to a transaction manager
> to deal with issues that require a whole-cluster perspective.
>
++1
I like Simon's idea of having a pg_xxx function for switching between
replication modes, which should be enough to support a monitoring
daemon doing the switching.
Maybe we could also have a 'syncrep_taking_too_long_command' GUC
which could be used to alert such a monitoring daemon, so it can
immediately check whether to
a) switch the master to async rep or standalone mode (in case the sync
slave has become unavailable)
or
b) fail over to the slave (in the almost equally likely case that it was
the master which became disconnected from the world and the slave is
available)
or
c) do something else depending on circumstances/policy :)
NB! In case b) the 'syncrep_taking_too_long_command' alert will very
likely not reach the monitoring daemon either, so the daemon cannot
rely on it as its main trigger!
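To make the a)/b)/c) choice concrete, here is a rough Python sketch of the
decision a monitoring daemon might make. It assumes the daemon polls both
nodes itself and counts consecutive failed polls, so it does not depend on
the alert command arriving. The names (Action, decide_action, threshold)
are made up for illustration; none of this is an existing PostgreSQL or
repmgr API.

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait"                        # nothing conclusive yet, keep polling
    DEGRADE_TO_ASYNC = "degrade"         # case a): sync slave is gone
    FAILOVER_TO_SLAVE = "failover"       # case b): master is gone, slave is up
    ESCALATE = "escalate"                # case c): leave it to policy/operator


def decide_action(master_failures, slave_failures, threshold=3):
    """Pick an action from consecutive failed poll counts (hypothetical policy).

    A node counts as down only after `threshold` consecutive failed polls,
    so a short network blip does not trigger a mode switch.
    """
    master_down = master_failures >= threshold
    slave_down = slave_failures >= threshold

    if master_down and slave_down:
        # The monitor itself may be the isolated party; do not guess.
        return Action.ESCALATE
    if master_down:
        return Action.FAILOVER_TO_SLAVE
    if slave_down:
        return Action.DEGRADE_TO_ASYNC
    return Action.WAIT


if __name__ == "__main__":
    # Slave missed three polls in a row, master is fine: degrade.
    print(decide_action(master_failures=0, slave_failures=3))
```

The alert from the master would then only be a hint to poll sooner; the
decision itself rests on what the daemon observes from its own vantage
point.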
Cheers
--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ