Re: 9.6 -> 10.0

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Devrim Gündüz <devrim(at)gunduz(dot)org>, pgsql-advocacy <pgsql-advocacy(at)postgresql(dot)org>
Subject: Re: 9.6 -> 10.0
Date: 2016-05-12 15:36:55
Message-ID: 20160512153655.GA15098@momjian.us
Lists: pgsql-advocacy

On Tue, May 10, 2016 at 05:21:24PM -0400, Bruce Momjian wrote:
> If we are going to focus on update method _change_ rather than just
> upgrade _breakage_, the inclusion of pg_logical in Postgres core would
> be a reason to go to 10.0 because it allows zero-downtime upgrades. I
> think this would be larger upgrade-wise than anything in 9.6.
>
> Currently users are using high-overhead trigger-based replication to
> achieve zero-downtime upgrades, and using streaming replication with
> pg_logical would be a game-changer.

I would like to clarify something I said above.

In a master/slave setup with pg_logical, a major upgrade is _near_
zero-downtime, because you have to switch over all write transactions at
a single point in time when you promote the slave to master. So you
have to either prevent new write transactions from going to the slave
while you wait for the master transactions to finish, or (more likely)
you have to terminate the write transactions on the master and then
promote the slave to master and allow everything to reconnect.

(In practice, you can't change a read/write server to read-only without
a restart, so effectively all old-master transactions have to be drained
at some point.)
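
To make the two master/slave choices above concrete, here is a rough
sketch of the window during which no server accepts new writes. It is
only a toy model, not anything pg_logical provides: the function name,
the "drain"/"terminate" labels, and all the timings are made up for
illustration.

```python
def switchover_cost(remaining_txn_seconds, promote_seconds, strategy):
    """Return (write_outage_seconds, aborted_transactions).

    remaining_txn_seconds: seconds each in-flight write transaction on
        the old master still needs (hypothetical numbers).
    promote_seconds: assumed time to promote the slave to master.
    strategy: "drain" blocks new writes and waits for the old master's
        transactions to finish; "terminate" kills them right away.
    """
    if strategy == "drain":
        # New writes are blocked until the slowest transaction completes.
        wait = max(remaining_txn_seconds, default=0.0)
        aborted = 0
    elif strategy == "terminate":
        # Nothing to wait for, but every open transaction is lost.
        wait = 0.0
        aborted = len(remaining_txn_seconds)
    else:
        raise ValueError("strategy must be 'drain' or 'terminate'")
    return wait + promote_seconds, aborted


if __name__ == "__main__":
    in_flight = [2.0, 30.0, 5.0]  # made-up transaction durations
    print(switchover_cost(in_flight, 1.0, "drain"))
    print(switchover_cost(in_flight, 1.0, "terminate"))
```

Either way the outage is short but not zero, which is why I say _near_
zero-downtime.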

Now, when using multi-master, you can cause new write transactions to
start on the new-major-upgraded master while you wait for the
old-major-version master to finish its transactions.
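
The multi-master hand-off can be sketched the same way. This is a toy
connection router, assuming something in front of the two masters
decides where each new write transaction starts; the UpgradeRouter
class and its methods are hypothetical and do not correspond to any BDR
or pg_logical interface.

```python
class UpgradeRouter:
    """Toy model of routing writes during a multi-master major upgrade."""

    def __init__(self):
        self.cutover_started = False
        self.open_on_old = set()  # transactions still running on old master

    def begin(self, txn_id):
        """Start a write transaction and report which master gets it."""
        if self.cutover_started:
            return "new"  # new work goes to the upgraded master
        self.open_on_old.add(txn_id)
        return "old"

    def commit(self, txn_id):
        """Finish a transaction; ids that ran on the new master are ignored."""
        self.open_on_old.discard(txn_id)

    def start_cutover(self):
        """From now on, send new transactions to the new-major-version master."""
        self.cutover_started = True

    def old_master_idle(self):
        """True once the old master has drained and can be retired."""
        return self.cutover_started and not self.open_on_old


if __name__ == "__main__":
    router = UpgradeRouter()
    print(router.begin(1))           # starts on the old master
    router.start_cutover()
    print(router.begin(2))           # starts on the new master
    print(router.old_master_idle())  # transaction 1 is still open
    router.commit(1)
    print(router.old_master_idle())  # old master has drained
```

The point is that begin() never has to refuse or delay a transaction,
so in this model there is no write outage while the old master drains.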

And, this doesn't require two servers. You can run the master/slave or
multi-master on the same server, though this causes double the write
volume and disk space while it is set up. If you set it up only for the
upgrade and during a slow activity period, that might be OK.

All this might have been obvious to people, but I only just now figured
it out. I always thought multi-master was only useful for geo-locating
databases closer to users, but this graceful switch-over is another
advantage of multi-master.

Another issue is that I think BDR might have tarnished the community's
reputation for reliability. I know everyone who downloaded BDR knows it
was an external project, but support is happening on the community
email lists. The volume of problems reported there, and the private
reliability complaints I have heard, make me concerned that people will
think BDR problems show that community Postgres also has
reliability/usability problems, which is bad. I would hate to go
through the same thing with pg_logical.

I know BDR is complex, and I know that people got good support for it on
the community email lists, but we go through a lot of work to have the
server releases be rock solid, and I don't want to have that reputation
tarnished.

I know this all might be academic as the 9.6 beta1 release has been made,
but I wanted to point out the value of multi-master, and explain some of
my concerns more concretely.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
