Re: Streaming Replication clusters and load balancing

From: Dmitry Vasilyev <d(dot)vasilyev(at)postgrespro(dot)ru>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming Replication clusters and load balancing
Date: 2015-09-21 20:59:50
Message-ID: 1442869190.32457.50.camel@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!
By default, HAproxy configuration can not be changed without breaking a
connection with the client :)

------
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

On Fri, 2015-09-18 at 12:27 +1000, James Sewell wrote:
> Hello all,
>
> I have recently been working with PostgreSQL and HAProxy to provide
> seamless load balancing to a group of database servers. This on it's
> own isn't a hard thing: I have an implementation finished and am now
> thinking about the best way to bring it to a production ready state
> which could be used by others, and used in load-balancers other than
> HAProxy with minimal config changes.
>
> My initial requirements were:
>
> Given a group of PostgreSQL servers check each x seconds and:
> Allow read/write access only to the master server (via IPA / portA)
> Disallow access if there are multiple master servers
> Allow read access to all servers (via IPB / portB) as long as the
> following holds:
> They are attached to the current master server via streaming
> replication (or they are the current master server)
> They can currently contact the master server (safest option, disallow
> all access when master-less)
> They are in the same timeline as the master server (do I need this
> check?)
> The master server reports that they have less than x bytes lag
> HAProxy can talk to PostgreSQL for a health check via TCP or PSQL
> (connection check only). Neither of these allow the logic above -
> therefore this logic has to be hosted outside of HAProxy. This might
> change in the future if HAProxy gets the ability to send SQL
> statements (like an F5 can).
>
> Today the best way to provide this information to HAProxy (and many
> other load balancers, application frameworks, proxies, monitoring
> solutions) is via HTTP, where HTTP 200 is pass the check and HTTP 503
> is fail the check (and don't load balance to this node). In my case I
> have a script which accepts HTTP requests to /read to check if this
> node is available for read only and /write which checks if this node
> is available for read/writes.
>
> The options as I see them are:
> Implement a script / small app which connects to PostgreSQL and
> executes these checks
> Implemented and proven today at many independent sites
> Should it run on HAProxy server or PSQL server?
> Integrated HTTP server or x.inetd script?
> Platform independence?
> What if it dies?
> Implement a custom PostgreSQL BGworker which provides this
> information over HTTP
> No outside of PostgreSQL config needed
> No reliance on another daemon / interface being up
> libmicrohttpd or similar should help with platform independence
> Security / acceptance by community?
> Only newer versions of PostgreSQL
> Spend the time working on getting SQL checks into HAProxy
> What about other platforms which only support HTTP?
> I think all of the options would benefit from a PSQL extension which
> does the following:
> Encapsulates the check logic (easier to upgrade, manipulate)
> Stores historic check data for a number of hours / days / months
> Stores defaults (override via HTTP could be possible for things like
> lag)
> Does anyone else have any thoughts on this topic?
>
> Eventually many cool features could flow out of this kind of work:
> Integration with High Availability products - I have this working
> with EnterpriseDB EFM now.
> Locate the current master using the HA product
> more than one master doesn't cause loss of service as long as HA
> state is sane
> Locate all clustered standby servers using the HA product
> if a standby is removed from the HA cluster, it is removed from load
> balancing
> if a standby is not part of the cluster, it is removed from load
> balancing (even if it is part of streaming replication)
> HTTP replication status requests which facilitate dynamically
> managing HAProxy (or other) PostgreSQL server pools
> Add a node to streaming replication, it automatically shows up in the
> pool and starts being checked to see if it can service reads
> Great for cloud scale out
> Allocation of additional load balancer groups based on some criteria
> (?), for example
> read/write (as above)
> read only (as above)
> data warehouse (reporting reads only)
> DR (replica with no reads or writes - until it becomes a master)
> Keen to hear comments.
>
> Cheers,
>
> James Sewell,
> Solutions Architect
> ______________________________________
>
>
> Level 2, 50 Queen St, Melbourne VIC 3000
>
> P (+61) 3 8370 8000 W www.lisasoft.com ; F (+61) 3 8370 8099
>
>
> The contents of this email are confidential and may be subject to
> legal or professional privilege and copyright. No representation is
> made that this email is free of viruses or other defects. If you have
> received this communication in error, you may not copy or distribute
> any part of it or otherwise disclose its contents to anyone. Please
> advise the sender of your incorrect receipt of this correspondence.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2015-09-21 21:21:57 Re: jsonb_set array append hack?
Previous Message Alvaro Herrera 2015-09-21 20:19:03 Re: Obsolete use of volatile in walsender.c, walreceiver.c, walreceiverfuncs.c?