Re: Sync Rep: First Thoughts on Code

From: "Robert Haas" <robertmhaas(at)gmail(dot)com>
To: "Jeff Davis" <pgsql(at)j-davis(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Markus Wanner" <markus(at)bluegap(dot)ch>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Fujii Masao" <masao(dot)fujii(at)gmail(dot)com>, aidan(at)highrise(dot)ca, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sync Rep: First Thoughts on Code
Date: 2008-12-14 05:42:12
Message-ID: 603c8f070812132142n5408e7ddk899e83cddd4cb0b2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> The point here is that synchronous replication, at least to some
> people, is going to imply that the user-visible states of the two
> copies are consistent. To other people, it is going to imply that
> committed transactions will never be lost even in the event of a
> catastropic loss of the primary 1 picosecond after the commit is
> acknowledged. We need to choose some word that implies that we are
> guaranteeing the latter of these two things but not the former.
> Otherwise, we will have confused users, and terminological confusion
> when and if we ever implement the former as well.

With apologies for replying to my own post:

It's also important to understand that these two invariants are
completely separate and it is possible to guarantee either without the
other. If you want (1), the standby needs to apply the WAL before
sending an acknowledgment to the primary but does not necessarily need
to write it to disk (of course, it will have to be written to disk
before the modified buffers are written to disk, but that's a separate
issue). If you want (2), the standby needs to write the WAL to disk
before sending the acknowledgment but does not necessarily need to
apply it. If you want both, then, you need to wait for both (and it's
worth noting that your performance will probably be nothing to write
home about).

I also did some research on terminology that has been used in the
literature. As Jim Gray describes it:

1-safe replication. Transaction is committed when it has been locally
WAL-logged to durable storage.
Group-safe replication. Transaction is committed when WAL has been
received by all remote servers, but not necessarily written to durable
storage.
Group-safe & 1-safe replication. Transaction is committed when it has
been locally WAL-logged to durable storage and WAL has been received
by all remote servers.
2-safe replication. Transaction is committed when it has been written
to durable storage on both local and remote servers.
Very safe replication. As 2-safe, but fails any read-write
transaction if the secondary is down.

(Actually, it appears that "Transaction Processing" Jim Gray and
Andreas Reuter, 1993 uses 2-safe to refer to either 2-safe or
group-safe; the distinction between the two is a subsequent
development. See e.g. Advances in Database Technology-EDBT 2004
by Elisa Bertino)

The term of art for making sure that transactions committed on the
primary are visible on the secondary seems to be "one-copy
serializability" (see, for example, a Google Books search on that
term).

...Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-12-14 05:55:59 Re: WIP: default values for function parameters
Previous Message Tatsuo Ishii 2008-12-14 04:31:12 Re: Sync Rep: First Thoughts on Code