Re: synchronous_commit = remote_flush

From:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To:	Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc:	Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: synchronous_commit = remote_flush
Date:	2016-08-21 22:08:52
Message-ID:	CAEepm=1EbM7P4YUg0rPB6h1qS8gC2pT2+WpUO-L7MUg5w+gWCw@mail.gmail.com
Lists:	pgsql-hackers

On Fri, Aug 19, 2016 at 6:30 AM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> I'm wondering if we've hit the point where trying to put all of this in a
> single GUC is a bad idea... changing that probably means a config
> compatibility break, but I don't think that's necessarily a bad thing at
> this point...

Aside from the (IMHO) slightly confusing way that "on" works, which is
the smaller issue I was raising in this thread, I agree that we might
eventually want to escape from the assumption that "local apply" (=
off), local flush, remote write, remote flush, remote apply happen in
that order and therefore a single linear control knob can describe
which of those to wait for.
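
A minimal sketch of that "single linear knob" assumption, for illustration only. The setting names are the existing synchronous_commit values; the event names and the helper function are hypothetical, and "on" is assumed to be used with a synchronous standby configured, so that it waits for remote flush.

```python
# Events in the order the message lists them (after the local apply that
# "off" already gives you). The linear model assumes they always happen
# in exactly this sequence.
EVENTS = ["local_flush", "remote_write", "remote_flush", "remote_apply"]

# The last event each synchronous_commit value waits for.
# None means the commit returns without waiting for any of them.
WAITS_UP_TO = {
    "off": None,
    "local": "local_flush",
    "remote_write": "remote_write",
    "on": "remote_flush",  # assumes a synchronous standby is configured
    "remote_apply": "remote_apply",
}


def events_waited_for(level):
    """Return every event implied by a level under the linear model."""
    last = WAITS_UP_TO[level]
    if last is None:
        return []
    # Waiting for one event implies waiting for all earlier ones.
    return EVENTS[: EVENTS.index(last) + 1]
```

Because each level is a prefix of the same list, there is no way in this model to ask for remote_write without local_flush, or remote_apply without remote_flush.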

Some pie-in-the-sky thoughts: we currently can't reach
"group-safe"[1], where you wait only for N servers to have the WAL in
memory (let's say that for us that means write but not flush): the
closest we can get is "1-safe and group-safe", using remote_write to
wait for the standbys to write (= "group-safe"), which implies local
flush (= "1-safe"). Group-safe alone would be a terrible level to use
unless your recovery procedure included cluster-wide communication to
straighten things out. Without any such clusterware it makes a lot of
sense to have the master flush before sending. I'm not actually
proposing we change that; I'm just speculating that someone might
eventually want it. We also can't have standbys apply before they
flush; as far as I know there is no theoretical reason why that
shouldn't be allowed, except maybe for some special synchronisation
steps around checkpoint records so that recovery doesn't get too far
ahead. That'd mirror what happens on the master more closely.
Imagine if you wanted to wait for your transaction to become visible
on certain other servers, but didn't want to wait for any disks:
that'd be the distributed equivalent of today's "off", but today's
"remote_apply" implies local flush and remote flush. Or more likely
you'd want some combination: 2-safe or group-safe on some subset of
servers to satisfy your durability requirements, and applied on some
other perhaps larger subset of servers for consistency. But this is
just water cooler handwaving.
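
To make the "some combination" idea concrete, here is a hedged sketch of a non-linear policy: a list of independent requirements, each naming an event, a set of servers, and how many of them must have reached it. None of this exists in PostgreSQL; the class, function, and server names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    event: str          # "write", "flush", or "apply"
    servers: frozenset  # servers that count towards this requirement
    quorum: int         # how many of them must have reached the event


def satisfied(req, progress):
    """progress maps a server name to the set of events it has reached
    for the commit in question."""
    reached = sum(
        1 for server in req.servers
        if req.event in progress.get(server, set())
    )
    return reached >= req.quorum


def commit_can_return(requirements, progress):
    """The commit may be acknowledged once every requirement holds."""
    return all(satisfied(req, progress) for req in requirements)


# Group-safe on any two of three servers for durability, plus visibility
# (apply) on a reporting server that is not asked to flush at all.
policy = [
    Requirement("write", frozenset({"s1", "s2", "s3"}), 2),
    Requirement("apply", frozenset({"reporting"}), 1),
]
```

Unlike a single ordered knob, nothing here forces "apply" on one server to imply "flush" on that server or on any other.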

[1] https://infoscience.epfl.ch/record/49936/files/WS03

--
Thomas Munro
http://www.enterprisedb.com

In response to

Re: synchronous_commit = remote_flush at 2016-08-18 18:30:57 from Jim Nasby

Responses

Re: synchronous_commit = remote_flush at 2016-08-22 02:05:43 from Robert Haas
