Re: terms for database replication: synchronous vs eager

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: postgres-r-general(at)pgfoundry(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: terms for database replication: synchronous vs eager
Date: 2007-09-14 09:38:23
Message-ID: 46EA568F.6080901@bluegap.ch
Lists: pgsql-hackers

Hello Jan,

thank you for your feedback.

Jan Wieck wrote:
> On 9/7/2007 11:01 AM, Markus Schiltknecht wrote:
>> This violates the common understanding of synchrony, because you can't
>> commit on a node A and then query another node B and expect it to be
>> coherent immediately.
>
> That's right. And there is no guarantee about the lag at all. So you can
> find "old" data on node B long after you committed a change to node A.

I have my doubts about the "long after". In practice you'll mostly have
nodes which perform about equally fast. And as the origin node has to do
more processing than a node which solely replays a transaction, it's
trivial to balance the load.

Additionally, a node which lags behind is unable to commit any
(conflicting) local transactions before having caught up (due to the GCS
total ordering). So this is even somewhat self-regulating.
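To illustrate the self-regulating effect, here is a toy Python sketch. It
is not Postgres-R code: the names Writeset, Node, can_decide and deliver
are made up for this example, and the conflict check (a key was written
at a position later than the transaction's snapshot) is a simplifying
assumption. The point is only that a node can decide on the writeset at
position n of the total order once it has processed positions 1 to n-1,
so a lagging node cannot commit its own conflicting transaction early.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Writeset:
    origin: str        # node that executed the transaction (hypothetical field)
    keys: frozenset    # keys written by the transaction
    snapshot: int      # last total-order position the transaction had seen


@dataclass
class Node:
    name: str
    applied: int = 0                                # positions processed so far
    last_write: dict = field(default_factory=dict)  # key -> position of last commit

    def can_decide(self, seq):
        # A writeset at position seq can only be decided once every
        # earlier position has been processed on this node.
        return self.applied == seq - 1

    def deliver(self, seq, ws):
        """Process the writeset at position seq; return True if it commits."""
        if not self.can_decide(seq):
            raise RuntimeError(f"{self.name} has not caught up to {seq - 1}")
        # Assumed conflict rule: abort if any key was committed after the
        # snapshot this transaction ran against.
        conflict = any(self.last_write.get(k, 0) > ws.snapshot for k in ws.keys)
        if not conflict:
            for k in ws.keys:
                self.last_write[k] = seq
        self.applied = seq
        return not conflict


# Two concurrent transactions on the same key, in the order the GCS chose.
total_order = [
    Writeset("A", frozenset({"x"}), snapshot=0),
    Writeset("B", frozenset({"x"}), snapshot=0),
]

node_b = Node("B")
blocked = node_b.can_decide(2)   # B lags: its own writeset is at position 2
decisions = [node_b.deliver(i, ws) for i, ws in enumerate(total_order, start=1)]
```

Every node replaying the same total order reaches the same decisions, so
node B aborts its own transaction only after it has caught up with the
writeset from node A.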

> Postgres-R is an asynchronous replication system by all means. It only
> makes sure that the workset data (that's what Postgres-R calls the
> replication log for one transaction)

It's most often referred to as the "writeset".

> has been received by a group
> communication system supporting total order and that the group
> communication system decided it to be the transaction that (logically)
> happened before any possibly conflicting concurrent transaction.

Correct. That's as far as the Postgres-R algorithm goes.

I should have been more precise about what I'm talking about, as I'm
continuing to develop Postgres-R (the software). That might be another
area where a new name should be introduced, to differentiate between
Postgres-R, the original algorithm, and my ongoing work on the software
implementing the algorithm.

> This is the wonderful idea how Postgres-R will have a failsafe conflict
> resolution mechanism in an asynchronous system.
>
> I don't know what you associate with the word "eager".

I'm speaking of the property that a transaction is replicated before
commit, so as to avoid later conflicts. IMO, this is the only real
requirement people have when requesting synchronous replication: most
people don't need synchrony, but they need reliable commit guarantees.

I've noticed that you are simply speaking of a "failsafe conflict
resolution mechanism". I dislike that description, because it does not
say anything about *when* the conflict resolution happens WRT commit.
And there may well be lazy failsafe conflict resolution mechanisms
(e.g. for a counter), which reconcile after commit.

I'd like to have a simple term, so that we could say: you probably don't
need fully synchronous replication, but eager replication may already
serve you well.

> All I see is that
> Postgres-R makes sure that some other process, which might still reside
> on the same hardware as the DB, is now in charge of delivery.

...and Postgres-R waits until that other process confirms the delivery,
whatever exactly that means. See below.

This delay before commit is important. It is what makes Postgres-R
eager, according to my definition of it. I'm open to better terms.
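The difference in *when* conflicts are resolved relative to commit can be
shown with a small Python sketch. This is a toy model under stated
assumptions, not the Postgres-R implementation: run_eager and run_lazy
are hypothetical names, each transaction is reduced to the set of keys it
writes, all transactions are treated as concurrent, and the list order
stands in for the total order chosen by the GCS.

```python
def run_eager(transactions):
    """Order and check each transaction *before* answering the client."""
    committed, answers = [], []
    for keys in transactions:          # list order stands in for the total order
        keys = frozenset(keys)
        if any(keys & earlier for earlier in committed):
            answers.append("aborted")  # conflict found before commit is reported
        else:
            committed.append(keys)
            answers.append("committed")
    return answers, []                 # nothing left to reconcile afterwards


def run_lazy(transactions):
    """Answer the client first; detect conflicts when reconciling later."""
    answers = ["committed" for _ in transactions]
    seen, late_conflicts = [], []
    for keys in transactions:          # reconciliation pass, after commit
        keys = frozenset(keys)
        if any(keys & earlier for earlier in seen):
            late_conflicts.append(keys)
        seen.append(keys)
    return answers, late_conflicts


txns = [{"x"}, {"x"}, {"y"}]
eager_answers, eager_late = run_eager(txns)
lazy_answers, lazy_late = run_lazy(txns)
```

In the eager variant the second writer of "x" learns about the conflict
before commit returns; in the lazy variant all three clients are told
"committed" and the conflict on "x" only surfaces afterwards.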

> Nobody
> said that the GC implementation cannot have made the decision about the
> total order of two workset messages and already reported that to the
> local client application before those messages ever got transmitted over
> the wire.

While this is certainly true in theory, it does not make sense in
practice. It would mean letting the GCS decide on a message ordering
without having delivered the messages to be ordered. That would be
troublesome for the GCS, because it could lose an already ordered
message. Most GCSs start their ordering algorithm by sending out the
message to be ordered.

Anyway, as I've described on -hackers before, I'm intending to decouple
replication from log writing, so that the GCS is not required to provide
any delivery guarantees at all (GCSs are complicated enough already!).
That would allow the user to separate transaction processing nodes from
log writing nodes. Those tasks have different I/O requirements anyway.
And what would more than two or three replicas of the transaction logs
be good for? Think of them as an efficient backup - you won't need them
until your complete cluster goes down.

Regards

Markus
