Re: PostgreSQL HA config recommendations

From: Fabio Ugo Venchiarutti <fabio(at)vuole(dot)me>
To: William Dunn <dunnwjr(at)gmail(dot)com>
Cc: Alex Gregory <alex(at)c2company(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: PostgreSQL HA config recommendations
Date: 2015-05-01 01:03:03
Message-ID: 5542D0C7.2020003@vuole.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Point taken. William is right.

My recommendations were unusually pessimistic as I didn't take enough
time to assess global+instantaneous data changes visibility requirements
on your part.

If the cluster is ONLY HA and you don't need to read fresh data off
secondary nodes (E.G.: HA+read load balancing), asynchronous is good
enough in most cases.

On 01/05/15 06:37, William Dunn wrote:
> Alex,
> Note that you should be weary of suggestions to make your replication
> synchronous. Synchronous replication is rarely used for this kind of use
> case (Cisco Jabber) where the most complete durability of the standby is
> not of the utmost concern (as it would be in a banking application). Not
> only will it decrease performance, but since you expect to have only one
> local standby it could actually decrease your availability because if
> your standby went down no transactions would be able to commit on the
> master. See the Synchronous Replication section of the docs for more
> details (http://www.postgresql.org/docs/devel/static/warm-standby.html)
>
> Also note that the suggestion provided by Fabio that you should not have
> your application commit more than one transaction per user operation is
> only applicable in synchronous replication (though since this is for a
> Cisco Jabber, where you neither have control over nor much information
> regarding the number of commits sent by the transaction per user
> operation, that suggestion is not applicable anyway...). In the case of
> asynchronous master-slave replication the typical issue with streaming
> replication latency is that you have transactions going to the master
> and then the application sends a read only transaction to the slave
> before the slave receives the transaction. So long as you don't have the
> application consider the user operation completed before all the
> transactions are committed I don't think having multiple transactions
> would make your replication latency issue any less.
>
> For example, if you had a calendar application where a user enters
> event details and creates an event for the calendar. The application
> may be set up to execute 2 transactions, 1) Add the event and
> details to the calendar events table and 2) once the event creation
> transaction returns add the current user as an attendee for that
> event. In this case both transactions would be going against the
> master, so how far the slave is behind wouldn't be a factor. Of
> course it would be faster overall to send the inserts as a single
> database procedure, but that all goes against the master database so
> the streaming replication is not a factor in that consideration.
>
>
> *William J. Dunn*
> _willjdunn.com <http://willjdunn.com>_
>
> *William J. Dunn*
> *P* 978-844-4427 | _dunnwjr(at)gmail(dot)com <mailto:dunnwjr(at)gmail(dot)com>_
> _dunnw(at)bu(dot)edu <mailto:dunnw(at)bu(dot)edu>_
>
> On Thu, Apr 30, 2015 at 9:02 AM, Fabio Ugo Venchiarutti <fabio(at)vuole(dot)me
> <mailto:fabio(at)vuole(dot)me>> wrote:
>
> > WAN delays can cause problems for any replication system; you just have
> > to be aware of that and not push things too hard (or try and violate the
> > laws of physics). For example, streaming replication set to be
> > synchronous crossing the planet is something you'd probably be rather
> > unhappy with. :)
>
>
> In my experience streaming replication fits most use cases due to
> inherent its simplicity and robustness, but you might need to adjust
> your software design to get the best out of it.
>
>
> More specifically, latency issues can be heavily mitigated by having
> application software commit no more than one transaction per user
> operation, provided 1 x "master<->sync_slave round trip time" is
> acceptable delay when they submit forms or the like.
>
> It can get much worse if the application server is on a different
> geographical node than the DB master. In such case it is
> realistically beneficial to batch multiple write operations in a
> single STATEMENT instead.
> If the replication synchronous slave is on yet another node, the
> best case (single statement) scenario would be 2 x round trip time.
> This configuration is more common than you might think as some
> setups feature remote app servers reading off synchronous slaves at
> their own physical location but committing against a master that is
> somewhere else.
>
>
> Cheers
>
>
>
>
>
>
>
> On 30/04/15 11:06, Jim Nasby wrote:
>
> On 4/29/15 1:13 PM, Alex Gregory wrote:
>
> I was thinking that I could use Slony but then I read that
> it does not
> like WAN replication. I have also read about streaming
> replication
> native to Postgres but was not sure how that would work over
> the WAN.
> Bucardo seems better for Data Warehousing or multimaster
> situations
> which this is not. That leaves pgpool ii which seems like
> it would
> add an extra layer of complexity.
>
>
> WAN delays can cause problems for any replication system; you
> just have
> to be aware of that and not push things too hard (or try and
> violate the
> laws of physics). For example, streaming replication set to be
> synchronous crossing the planet is something you'd probably be
> rather
> unhappy with. :)
>
> I haven't played with Slony in forever, but when I did it loved
> to lock
> things. That would not play well with high latency.
>
> I have run londiste between sites within the same city, and that
> worked
> well.
>
> Bucardo and pg_pool are both based on the idea of replaying SQL
> statements instead of replicating actual data. They have their
> uses, but
> I personally distrust that idea, especially for DR.
>
> When it comes down to to there are so many choices I am not
> sure if I
> need one or a combination of two. Any help you could
> provide could
> be greatly appreciated.
>
>
> If you want to replicate within a data center then streaming
> replication
> is pretty nice, and as a bonus you might be able to do
> synchronous as
> well. The downside to streaming rep is that it's binary, so if
> you ever
> suffer data corruption you're practically guaranteed that corruption
> will end up on the replica. Logical replication like londiste or
> Slony
> are much more robust against that. You also can't use temporary
> tables
> with streaming rep, and you have to replicate the details of ALL
> activity, including maintenance like VACUUM. In some
> environments that
> might be slower than logical replication.
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org
> <mailto:pgsql-general(at)postgresql(dot)org>)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Mathew Moon 2015-05-01 04:40:23 Removing and readding bdr nodes
Previous Message Adrian Klaver 2015-05-01 00:36:59 Re: psql 8 warm standby strong start, weak finish