Re: Proposal: Implement failover on libpq connect level.

From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Implement failover on libpq connect level.
Date: 2015-09-02 16:45:26
Message-ID: CAFNqd5Xu=DZ3b7CJ68Wf0m+QWfUWcwkvPbypj2kMfbb1oUT5tg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-jdbc

On 2 September 2015 at 04:52, Shulgin, Oleksandr <
oleksandr(dot)shulgin(at)zalando(dot)de> wrote:
>
> On Tue, Sep 1, 2015 at 8:12 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>> On 2015-09-01 14:07:19 -0400, Robert Haas wrote:
>> > But I think it's quite wrong to assume that the infrastructure for
>> > this is available and usable everywhere, because in my experience,
>> > that's far from the case.
>>
>> Especially when the alternative is a rather short patch implementing an
>> otherwise widely available feature.
>
>
> But that won't actually help in the case described by Robert: if the
master server A failed, the client has no idea if B or C would become the
new master.
>
> Unless it actually tries to connect them in turn and check for the result
of pg_is_in_recovery(). I think that brings enough complexity for keeping
this outside of libpq. Also think about all the extra flexibility people
will likely want to have: number of retries, delay between retries, delay
backoff, etc., to the point we'll have to support some sort of client code
retry_policy_callback.
>
> For read-only clients you might want to include a number of slave
hostnames, and let the connector choose one, but then again you can't
achieve load-balancing on the client side, you're better off using
round-robin DNS. To add or remove a slave you only need to update DNS, and
not configuration on all the clients.
>
> For the master failover I think a common technique is to just move the
floating IP address from the old master to the new one. This doesn't
require touching the DNS record.

It seems to me that any time the answer involves touching the DNS record,
then having this be libpq configuration is terribly inappropriate.

I'll note that OpenLDAP tends to accept multiple database URIs, and is
nonspecific as to the relative semantics. That points me at two things
that may be worth considering for libpq:

1. It seems to me that it would be ridiculous to try to have semantics
trying to interpret multiple values passed to
PGPORT/PGHOST/PGDATABASE/-p/-h/-d/, but rather, that if there are to be
multiple values, they should be presented as URIs.

I'm finding, by the way, that it is very useful in simplifying application
configuration to use the URIs that have been usable since about 9.2.

I <3
postgresql://postgres(at)localhost:7093/myfavedatabase

2. I'd be inclined to go with pretty simple functionality in anything that
would be built-in to libpq. No great amount of logic, rather

a) Try the first URI. If that works (where "connects" = "works"), we're
done.
b) Try the next URI. If that works, we're done.
c) Is there another one? Try it... Works... Done. Repeat as needful.

There's all kinds of functionality that you could add, between...
- Random selection of URI.
- Running some query against each URI tried to see if it's a
"master/slave", changing client functionality based on that.
- Picking URI based on connecting (somewhere) and deciding what URI to
prefer.

I'd suggest implementing the simplest thing (e.g. - try URI 1, then 2, then
...), as that's most likely the most useful.
And more sophisticated functionality probably ought to be in the
application anyways.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-09-02 16:45:56 Re: Allow a per-tablespace effective_io_concurrency setting
Previous Message Andres Freund 2015-09-02 16:44:40 Re: [PATCH] Microvacuum for gist.

Browse pgsql-jdbc by date

  From Date Subject
Next Message Robert Haas 2015-09-02 19:00:45 Re: Proposal: Implement failover on libpq connect level.
Previous Message Shulgin, Oleksandr 2015-09-02 08:52:45 Re: Proposal: Implement failover on libpq connect level.