Re: High Availability, Load Balancing, and Replication Feature Matrix

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: PostgreSQL-documentation <pgsql-docs(at)postgresql(dot)org>
Subject: Re: High Availability, Load Balancing, and Replication Feature Matrix
Date: 2007-11-12 10:57:04
Message-ID: 47383180.3000203@bluegap.ch
Lists: pgsql-docs pgsql-hackers

Hello Bruce,

Bruce Momjian wrote:
> Sorry, I forgot who was involved in that discussion.

Well, at least that means I didn't annoy you to death last time ;-)

>> With the other two I'm unsure.. I see it's very hard to find helpful
>> positive formulations...
>
> Yea, that's where I got stuck --- that the positives were harder to
> understand.

Okay, understood.

> Sorry, I meant that a master that is modifying data is slowed down by
> other masters to an extent that doesn't happen in other cases (e.g. with
> slaves). Is the current "No inter-server locking delay" OK?

Yes, sort of. I know what you meant, but I find it hard to understand.
And with regard to anything except lazy or eager replication, it does
not make any sense. It's pretty moot saying anything about "inter-server
locking delays" for "statement-based replication middleware": you don't
know if it's lazy or eager. And all other solutions you are mentioning
are single-master or no replication at all. When there's only one
master, it's pretty obvious that there can't be any inter-(master)-server
locking delay. (Well, it's also very obvious that a single master never
'conflicts' with itself...)

Given that you want to cover existing solutions, one could say that (AFAIK)
all statement-based replication solutions are eager. But in that case,
the dot would be wrong, because the middleware would need to wait for at
least an absolute majority to confirm the commit. That leads to
excessive locking as well, just as you say for "synchronous multi-master
replication", because it's a property inherent to eager multi-master
replication, as we correctly explain above the feature matrix.
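
To illustrate why waiting for an absolute majority implies an
inter-server delay, here is a minimal sketch. It is not taken from any
of the products discussed in this thread; the function names and the
latency figures are hypothetical, and it assumes the commit can return
as soon as the majority-th acknowledgement arrives.

```python
from typing import List


def majority_needed(n_nodes: int) -> int:
    """Smallest number of nodes that forms an absolute majority."""
    return n_nodes // 2 + 1


def commit_latency_ms(ack_latencies_ms: List[float]) -> float:
    """Time until an absolute majority has confirmed the commit.

    ack_latencies_ms holds one (hypothetical) acknowledgement latency per
    node, including the local one. The commit has to wait for the k-th
    fastest acknowledgement, where k is the majority size.
    """
    k = majority_needed(len(ack_latencies_ms))
    return sorted(ack_latencies_ms)[k - 1]


# Local node answers in 2 ms, two remote masters in 40 ms and 45 ms.
latency = commit_latency_ms([2.0, 40.0, 45.0])
```

In this example the writing master is held up for 40 ms although its
local commit took 2 ms, regardless of whether statements or tuples are
shipped to the other masters.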

>> multi-master replication" as well as "statement-based replication
>> middleware" should not have a dot, because those as well slow down other
>> masters. In the async case at different points in time, yes, but all
>> master have to write the data, which slows them down.
>
> Yea, that is why I have the new text about locking.

To me this makes it sound like "statement-based replication" could be
faster than "synchronous multi-master replication". That's absolute
nonsense, since those two don't compare. Or, to put it another way: most
"statement-based replication" solutions are "synchronous
multi-master replication" as well.

[ In that sense, stating that "PostgreSQL does not offer this kind of
replication" is wrong, under "Synchronous Multi-Master Replication". As
is the assumption that all of those send "data changes". Probably you
should clarify that to say: "tuple-based, eager multi-master
replication", because that's what you are talking about. ]

If you are comparing an eager, statement-based, multi-master replication
(like PgCluster) with an eager, tuple-based, multi-master replication
(like Postgres-R), the former can't possibly be faster than the latter.
I.e. it certainly doesn't have fewer (locking?) delays.

>>> which is the reason we don't support it yet.
>> Uhm.. PgCluster *is* a synchronous multi-master replication solution. It
>> also is a middleware and it does statement based replication. Which dots
>> of the matrix do you think apply for it?
>
> I don't consider PgCluster middleware because the servers have to
> cooperate with the middleware.

Okay, then take Sequoia: statement-based, middleware, synchronous (thus
eager) multi-master replication solution.

( I've never liked the term "middleware" in that chapter. It's solely a
question of implementation and does not have much to do with other
concepts of replication. )

> And I am told it is much slower for
> writes than a single server which supports my "locking" item, though it
> is more "waiting for other masters" that is the delay, I think.

Uh.. with the dot there, you are saying that "statement based
middleware" does *not* have any inter-server locking delay.

What's the difference between "waiting for other masters" and "locking
delay"? What exactly do you consider a lock? Why should it be locking
when using binary-tuple replication, but not when using statement based
replication?

> I don't assume the disk failover has mirrored disks. It can just like a
> single server can, but it isn't part of the backend process, and I
> assume a RAID card that has RAM that can cache writes.

In that case, you'd lose the "master failure will never lose data"
property, no? Or do you trust the writeback cache and the connection to
the NAS so much that you assume they never fail?

>>> I don't think
>>> the network is an issue considering many use NAS anyway.
>> I think you are comparing an enterprise NAS to a low-cost, commodity
>> hardware clustered filesystem. Take the same amount of money and the
>> same number of mirrors and you'll get comparable performance.
>
> Agreed. In the one case you are relying on another server, and in the
> NAS case you are relying on a black box server. I think the big
> difference is that the other server is a separate entity, while the NAS
> is a shared item.

Correct, thus the former is a kind of single-master replication, while
the latter cannot be considered replication (lacking a replica). It's
rather a variant of how to enhance the reliability of your single-master
database server system.

>>> There is no dot there so I am saying "statement based replication
>>> solution" requires conflict resolution. Agreed you could do it without
>>> conflict resolution and it is kind of independent. How should we deal
>>> with this?
>> Maybe a third state: 'n/a'?
>
> Good idea, or "~". How would middleware avoid conflicts, i.e. how would
> it know that two incoming queries were in conflict?

A majority of servers rejecting or blocking the query? If only a
minority blocks, the majority would win and apply the
transaction, while the minority would have to replay the transaction? I
don't know, probably most solutions do something simpler, like aborting
a transaction even if only one server fails. Much simpler, and
sufficient for most cases.
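
The two policies speculated about above can be sketched as follows.
This is only an illustration under assumptions: the names are
hypothetical, each server's answer is reduced to a single boolean
(True = applied the statement, False = rejected or blocked), and no
existing middleware is claimed to work this way.

```python
from enum import Enum
from typing import List, Tuple


class Outcome(Enum):
    COMMIT = "commit"
    ABORT = "abort"


def majority_wins(votes: List[bool]) -> Tuple[Outcome, List[int]]:
    """Commit if an absolute majority applied the statement.

    Returns the outcome plus the indexes of the minority servers that
    would have to replay the transaction afterwards.
    """
    if sum(votes) > len(votes) / 2:
        to_replay = [i for i, ok in enumerate(votes) if not ok]
        return Outcome.COMMIT, to_replay
    return Outcome.ABORT, []


def abort_on_any_failure(votes: List[bool]) -> Outcome:
    """Simpler policy: abort as soon as a single server fails."""
    return Outcome.COMMIT if all(votes) else Outcome.ABORT


# Three servers, the third one blocks on a conflicting transaction.
votes = [True, True, False]
decision_majority = majority_wins(votes)
decision_simple = abort_on_any_failure(votes)
```

With these votes, the majority policy commits and marks server 2 for
replay, while the simpler policy aborts the transaction everywhere.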

(Why do you ask me? I'm advocating internal, tuple-level replication
with Postgres-R, not a statement-based one :-) )

> I did move it below and removed it from the chart because as you say how
> to replicate to the slaves is an independent issue.

Okay, I like that better, thanks.

>> With regard to replication, there's another feature I think would be
>> worth mentioning: dynamic addition or removal of nodes (masters or
>> slaves). But that's solely implementation dependent, so it probably
>> doesn't fit into the matrix.
>
> Yea, I had that but found you could add/remove slaves easily in most
> cases.

Hm.. you're right.

>> Another interesting property I'm missing is the existence of single
>> points of failures.
>
> Ah, yea, but then you get into power and fire issues.

Which high-availability is all about, no?

But well, again, all kinds of replication (which excludes the NAS) can
theoretically be spread across the continent. So it might be pretty
useless to add dots for that.

Regards

Markus
