From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: PostgreSQL-documentation <pgsql-docs(at)postgresql(dot)org>
Subject: Re: High Availability, Load Balancing, and Replication Feature Matrix
Date: 2007-11-10 19:20:15
Message-ID: 200711101920.lAAJKFf23421@momjian.us
Lists: pgsql-docs pgsql-hackers
Markus Schiltknecht wrote:
> Hello Bruce,
>
> Bruce Momjian wrote:
> > I have added a High Availability, Load Balancing, and Replication
> > Feature Matrix table to the docs:
>
> Nice work. I appreciate your efforts in clearing up the uncertainty that
> surrounds this topic.
>
> As you might have guessed, I have some complaints regarding the Feature
> Matrix. I hope this won't discourage you, but I'd rather like to
> contribute to an improved variant.
Not sure if you were around when we wrote this chapter, but there was a
lot of good discussion to get it to where it is now.
> First of all, I don't quite like the negated formulations. I can see
> that you want a dot to mark a positive feature, but I find it hard to
> understand.
Well, the idea is to say "what things do I want, and what offers them?"
If you mix positive and negative entries, it makes that harder. I
realize it is confusing in a different way. We could split the
negatives out into a separate table, but that seems worse.
> What I'm especially puzzled about is "master never locks others". All
> of the first four, namely "shared disk failover", "file system
> replication", "warm standby" and "master slave replication", block the
> others (the slaves) completely, which is about the worst kind of lock.
That item assumes you have slaves that are trying to do work. The point
is that multi-master slows down the other slaves in a way no other
option does, which is the reason we don't support it yet. I have
updated the wording to "No inter-server locking delay".
> Comparing "File System Replication" and "Shared Disk Failover",
> you state that the former has "master server overhead", while the latter
> doesn't. Seen solely from the single server node, this might be true.
> But summed over the cluster, you have quite similar network load in
> both cases. I wouldn't say one has less overhead than the other
> by definition.
The point is that file system replication has to wait for the standby
server to write the blocks, while disk failover does not. I don't think
the network is an issue considering many use NAS anyway.
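The waiting Bruce describes can be sketched in a toy model. This is not PostgreSQL or DRBD code; every name here (write_shared_disk, write_replicated_fs, STANDBY_ACK_LATENCY) is invented for illustration, under the assumption that file-system replication is synchronous:

```python
# Toy sketch: why synchronous file-system replication adds master-side
# overhead that shared-disk failover does not. All names are invented.
import time

STANDBY_ACK_LATENCY = 0.01  # assumed round trip to the standby, in seconds

def write_shared_disk(block):
    """Shared disk failover: one write, no standby to wait for."""
    return f"wrote {block}"

def write_replicated_fs(block):
    """File-system replication: the master's write completes only
    after the standby has also written the block."""
    result = f"wrote {block}"
    time.sleep(STANDBY_ACK_LATENCY)  # wait for the standby's acknowledgement
    return result

start = time.monotonic()
write_shared_disk("blk-1")
t_disk = time.monotonic() - start

start = time.monotonic()
write_replicated_fs("blk-1")
t_fs = time.monotonic() - start

# the replication wait is the "master server overhead" in the matrix
assert t_fs > t_disk
```

The sketch only models latency; it says nothing about total cluster network load, which is Markus's point above.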
> Then, you are mixing apples and oranges. Why should a "statement based
> replication solution" not require conflict resolution? You can build
> eager as well as lazy statement based replication solutions; one does
> not have anything to do with the other, does it?
There is no dot there so I am saying "statement based replication
solution" requires conflict resolution. Agreed you could do it without
conflict resolution and it is kind of independent. How should we deal
with this?
> Same applies to "master slave replication" and "per table granularity".
I tried to mark them based on existing or typical solutions, but you are
right, especially if the master/slave is not PITR based. Some can't do
per-table, like disk failover.
> And in the special case of (async, but eager) Postgres-R also to "async
> multi-master replication" and "no conflict resolution necessary".
> Although I can understand that that's a pretty nifty difference.
Yea, the table isn't going to be 100% accurate, but it tries to
summarize what is in the section above.
> Given the matrix focuses on practically available solutions, I can see
> some value in it. But from a more theoretical viewpoint, I find it
> pretty confusing. Now, if you want a practically usable feature
> comparison table, I'd strongly vote for clearly mentioning the products
> you have in mind - otherwise the table pretends to be something it is not.
I considered that, and I can add something that says you have to
consult the text above for more details. Some entries require
mentioning a specific solution, like Slony, while others, like disk
failover, do not.
> If it should be theoretically correct without mentioning available
> solutions, I'd rather vote for explaining the terms and concepts.
>
> To clarify my viewpoint, I'll quickly go over the features you're
> mentioning and associate them with the concepts, as I understand them.
>
> - special hardware: always nice, not much theoretical effect, a
> network is a network, storage is storage.
>
> - multiple masters: that's what single- vs multi masters is about:
> writing transactions. Can be mixed with
> eager/lazy, every combination makes
> sense for certain applications.
>
> - overhead: replication by definition generates overhead;
> the question is: how much, and where.
>
> - locking of others: again, question of how much and how fine grained
> the locking is. In a single master repl. sol., the
> slaves are locked completely. In lazy repl. sol.,
> the locking is deferred until after the commit,
> during conflict resolution. In eager repl. sol.,
> the locking needs to take place before the commit.
> But all replication systems need some kind of
> locks!
>
> - data loss on fail: solely dependent on eager/lazy. (Given a real
> replication, with a replica, which shared storage
> does not provide, IMO)
>
> - slaves read only: theoretically possible with all replication
> systems, whether they are lazy/eager or
> single-/multi-master. That we are unable to read
> from slave nodes is an implementation annoyance
> of Postgres, if you want.
>
> - per table gran.: again, independent of lazy/eager, single-/multi.
> Depends solely on the level where data is
> replicated: block device, file system, statement,
> WAL or other internal format.
>
> - conflict resol.: in multi master systems, that depends on the
> lazy/eager property. Single master systems
> obviously never need to resolve conflicts.
Right, but the point of the chart is to give people guidance, not to
give them details; those are in the part above.
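The lazy/eager distinction behind the conflict-resolution column can be illustrated with a toy last-writer-wins merge. This is a sketch only; lazy_merge, the (timestamp, value) row shape, and the lww resolver are all invented for illustration and come from no real replication product:

```python
# Toy model of why lazy multi-master needs conflict resolution:
# two masters both committed a write to the same row, and the
# divergence must be merged after the fact.

def lazy_merge(row_a, row_b, resolver):
    """Lazy replication: both masters already committed, so a
    conflict between differing rows is resolved after commit."""
    if row_a != row_b:
        return resolver(row_a, row_b)
    return row_a

# last-writer-wins resolver; each row is (timestamp, value)
def lww(a, b):
    return a if a[0] > b[0] else b

merged = lazy_merge((2, "paid"), (1, "pending"), lww)
# with last-writer-wins, the later update (2, "paid") survives
```

An eager system would instead take locks (or reach agreement) before either commit, so no such merge step exists; that is the trade-off the matrix is trying to compress into one dot.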
> IMO, "data partitioning" is entirely perpendicular to replication. It
> can be combined, in various ways. There's horizontal and vertical
> partitioning, eager/lazy and single-/multi-master replication. I guess
> we could find a use case for most of the combinations thereof. (Kudos
> for finding a combination which definitely has no use case).
Really? Are you saying the office example is useless? What is a good
use case for this?
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +