From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: PostgreSQL-documentation <pgsql-docs(at)postgresql(dot)org>
Subject: Re: High Availability, Load Balancing, and Replication Feature Matrix
Date: 2007-11-10 19:20:15
Message-ID: 200711101920.lAAJKFf23421@momjian.us
Lists: pgsql-docs pgsql-hackers
Markus Schiltknecht wrote:
> Hello Bruce,
>
> Bruce Momjian wrote:
> > I have added a High Availability, Load Balancing, and Replication
> > Feature Matrix table to the docs:
>
> Nice work. I appreciate your efforts in clearing up the uncertainty that
> surrounds this topic.
>
> As you might have guessed, I have some complaints regarding the Feature
> Matrix. I hope this won't discourage you, but I'd rather like to
> contribute to an improved variant.
Not sure if you were around when we wrote this chapter, but there was a
lot of good discussion to get it to where it is now.
> First of all, I don't quite like the negated formulations. I can see
> that you want a dot to mark a positive feature, but I find it hard to
> understand.
Well, the idea is to say "what things do I want, and what offers them?"
If you mix positive and negative entries, it makes that harder. I
realize it is confusing in a different way. We could split the
negatives out into a separate table, but that seems worse.
> What I'm especially puzzled about is "master never locks others". All
> of the first four, namely "shared disk failover", "file system
> replication", "warm standby" and "master slave replication", block the
> others (the slaves) completely, which is about the worst kind of lock.
That item assumes you have slaves that are trying to do work. The point
is that multi-master slows down the other slaves in a way no other
option does, which is the reason we don't support it yet. I have
updated the wording to "No inter-server locking delay".
> Comparing "File System Replication" and "Shared Disk Failover",
> you state that the former has "master server overhead", while the latter
> doesn't. Seen solely from the single server node, this might be true.
> But summed over the cluster, you have quite similar network load in
> both cases. I wouldn't say one has less overhead than the other
> by definition.
The point is that file system replication has to wait for the standby
server to write the blocks, while disk failover does not. I don't think
the network is an issue considering many use NAS anyway.
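The waiting Bruce describes can be sketched in a toy model. This is not PostgreSQL or DRBD code; every name here (write_shared_disk, write_replicated_fs, STANDBY_ACK_LATENCY) is invented for illustration, under the assumption that file-system replication is synchronous:

```python
# Toy sketch: why synchronous file-system replication adds master-side
# overhead that shared-disk failover does not. All names are invented.
import time

STANDBY_ACK_LATENCY = 0.01  # assumed round trip to the standby, in seconds

def write_shared_disk(block):
    """Shared disk failover: one write, no standby to wait for."""
    return f"wrote {block}"

def write_replicated_fs(block):
    """File-system replication: the master's write completes only
    after the standby has also written the block."""
    result = f"wrote {block}"
    time.sleep(STANDBY_ACK_LATENCY)  # wait for the standby's acknowledgement
    return result

start = time.monotonic()
write_shared_disk("blk-1")
t_disk = time.monotonic() - start

start = time.monotonic()
write_replicated_fs("blk-1")
t_fs = time.monotonic() - start

# the replication wait is the "master server overhead" in the matrix
assert t_fs > t_disk
```

The sketch only models latency; it says nothing about total cluster network load, which is Markus's point above.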
> Then, you are mixing apples and oranges. Why should a "statement based
> replication solution" not require conflict resolution? You can build
> eager as well as lazy statement based replication solutions; one does
> not have anything to do with the other, does it?
There is no dot there so I am saying "statement based replication
solution" requires conflict resolution. Agreed you could do it without
conflict resolution and it is kind of independent. How should we deal
with this?
> Same applies to "master slave replication" and "per table granularity".
I tried to mark them based on existing or typical solutions, but you are
right, especially if the master/slave is not PITR based. Some can't do
per-table, like disk failover.
> And in the special case of (async, but eager) Postgres-R also to "async
> multi-master replication" and "no conflict resolution necessary".
> Although I can understand that that's a pretty nifty difference.
Yea, the table isn't going to be 100% accurate, but it tries to
summarize what is in the section above.
> Given the matrix focuses on practically available solutions, I can see
> some value in it. But from a more theoretical viewpoint, I find it
> pretty confusing. Now, if you want a practically usable feature
> comparison table, I'd strongly vote for clearly mentioning the products
> you have in mind - otherwise the table pretends to be something it is not.
I considered that, and I can add something that says you have to
consult the text above for more details. Some entries require
mentioning a specific solution, like Slony, while others, like disk
failover, do not.
> If it should be theoretically correct without mentioning available
> solutions, I'd rather vote for explaining the terms and concepts.
>
> To clarify my viewpoint, I'll quickly go over the features you're
> mentioning and associate them with the concepts, as I understand them.
>
> - special hardware: always nice, not much theoretical effect, a
> network is a network, storage is storage.
>
> - multiple masters: that's what single- vs multi masters is about:
> writing transactions. Can be mixed with
> eager/lazy, every combination makes
> sense for certain applications.
>
> - overhead: replication by definition generates overhead;
> the question is: how much, and where.
>
> - locking of others: again, question of how much and how fine grained
> the locking is. In a single master repl. sol., the
> slaves are locked completely. In lazy repl. sol.,
> the locking is deferred until after the commit,
> during conflict resolution. In eager repl. sol.,
> the locking needs to take place before the commit.
> But all replication systems need some kind of
> locks!
>
> - data loss on fail: solely dependent on eager/lazy. (Given a real
> replication, with a replica, which shared storage
> does not provide, IMO)
>
> - slaves read only: theoretically possible with all replication
> systems, whether they are lazy/eager or
> single-/multi-master. That we are unable to read
> from slave nodes is an implementation annoyance
> of Postgres, if you want.
>
> - per table gran.: again, independent of lazy/eager, single-/multi.
> Depends solely on the level where data is
> replicated: block device, file system, statement,
> WAL or other internal format.
>
> - conflict resol.: in multi master systems, that depends on the
> lazy/eager property. Single master systems
> obviously never need to resolve conflicts.
Right, but the point of the chart is to give people guidance, not to
give them details; those are in the part above.
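The lazy/eager distinction behind the conflict-resolution column can be illustrated with a toy last-writer-wins merge. This is a sketch only; lazy_merge, the (timestamp, value) row shape, and the lww resolver are all invented for illustration and come from no real replication product:

```python
# Toy model of why lazy multi-master needs conflict resolution:
# two masters both committed a write to the same row, and the
# divergence must be merged after the fact.

def lazy_merge(row_a, row_b, resolver):
    """Lazy replication: both masters already committed, so a
    conflict between differing rows is resolved after commit."""
    if row_a != row_b:
        return resolver(row_a, row_b)
    return row_a

# last-writer-wins resolver; each row is (timestamp, value)
def lww(a, b):
    return a if a[0] > b[0] else b

merged = lazy_merge((2, "paid"), (1, "pending"), lww)
# with last-writer-wins, the later update (2, "paid") survives
```

An eager system would instead take locks (or reach agreement) before either commit, so no such merge step exists; that is the trade-off the matrix is trying to compress into one dot.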
> IMO, "data partitioning" is entirely perpendicular to replication. It
> can be combined, in various ways. There's horizontal and vertical
> partitioning, eager/lazy and single-/multi-master replication. I guess
> we could find a use case for most of the combinations thereof. (Kudos
> for finding a combination which definitely has no use case).
Really? Are you saying the office example is useless? What is a good
use case for this?
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +