Re: Big 7.4 items

From: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Big 7.4 items
Date: 2002-12-14 05:39:30
Message-ID: 200212141109.30203.shridhar_daithankar@persistent.co.in
Lists: pgsql-hackers

On Friday 13 December 2002 11:01 pm, you wrote:
> Good. This is the discussion we need. Let me quote the TODO list
> replication section first:
>
> * Add replication of distributed databases [replication]
> o automatic failover

Very good. We need that for HA.

> o load balancing

> o master/slave replication
> o multi-master replication
> o partition data across servers

I am interested in this for a multitude of reasons. Scalability is obviously
one of them. But I was wondering about a few things (after going through all
the messages on this).

Once we have partitioning and replication, the database cache can effectively
span multiple machines and is no longer limited by one machine's shared
memory. So will it work on Mosix now? Just a thought.

> OK, the first thing is that there isn't any one replication solution
> that will behave optimally in all situations.
>
> Now, let me describe Postgres-R and then the other replication
> solutions. Postgres-R is multi-master, meaning you can send SELECT and
> UPDATE/DELETE queries to any of the servers in the cluster, and get the
> same result. It is also synchronous, meaning it doesn't update the
> local copy until it is sure the other nodes agree to the change. It
> allows failover, because if one node goes down, the others keep going.
>
> Now, let me contrast:
>
> rserv and dbmirror do master/slave. There is no mechanism to allow you
> to do updates on the slave, and have them propagate to the master. You
> can, however, send SELECT queries to the slave, and in fact that's how
> usogres does load balancing.

This looks like mirroring vs. striping to me. Can we combine the two in either
order, just as RAID does?

Will it be possible to resize the cluster on the fly? Are we looking at
network-wide management of databases, like Oracle offers? (OK, the tools are
unwarranted in many situations, but the option has to be there.)

Most importantly, I would like this to be easy to set up from a single point
of administration.

Something like: declare a cluster of n1 machines as database partitions, and
have another n2 machines do a slave sync with them to handle load. If
command-line options of this kind exist, adding easy tools on top of them
should be simple.
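
To make the "n1 partitions plus n2 slaves" idea concrete, here is a minimal
Python sketch of how a client-side router could combine striping and
mirroring. Everything in it is hypothetical: the Cluster class, its methods,
and the host names are made up for illustration and are not part of
PostgreSQL or of any replication project mentioned in this thread.

```python
# Hypothetical sketch: none of these names exist in PostgreSQL.
import itertools
import zlib


class Cluster:
    """n1 partition masters, each mirrored by zero or more read-only slaves."""

    def __init__(self, partitions):
        # partitions: list of (master, [slaves]) pairs of host names
        if not partitions:
            raise ValueError("need at least one partition")
        self.partitions = partitions
        # One round-robin iterator per partition for spreading reads.
        # A partition without slaves serves reads from its master.
        self._readers = [
            itertools.cycle(slaves or [master]) for master, slaves in partitions
        ]

    def _index(self, key):
        # Striping: a stable hash of the key picks the owning partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def node_for_write(self, key):
        # Writes always go to the master of the owning partition.
        return self.partitions[self._index(key)][0]

    def node_for_read(self, key):
        # Mirroring: reads rotate over the slaves of the owning partition.
        return next(self._readers[self._index(key)])


# Example declaration: two partitions (n1 = 2) and three slaves (n2 = 3).
cluster = Cluster([
    ("db1", ["db1-slave-a", "db1-slave-b"]),
    ("db2", ["db2-slave-a"]),
])
```

Note that a plain modulo hash like this would move most keys whenever a
partition is added, so resizing on the fly would need a smarter placement
scheme than the one sketched here.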

And please, in-place database upgrades. Otherwise it will be a huge pain to
maintain such a cluster over a long period of time.

> Another replication need is for asynchronous replication, most
> traditionally for traveling salesmen who need to update their databases
> periodically. The only solution I know for that is PeerDirect's
> PostgreSQL commercial offering at http://www.peerdirect.com. It is
> possible PITR may help with this, but we need to handle propagating
> changes made by the salesmen back up into the server, and to do that, we
> will need a mechanism to handle conflicts that occur when two people
> update the same records. This is always a problem for asynchronous
> replication.

We need not offer full asynchronous replication all at once. We can have
levels of asynchronous replication, such as read-only (essentially syncing)
and read-write. Even if we get only slave sync at first, that will be a huge
plus.
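
The conflict problem from the quoted text only shows up at the read-write
level. A minimal Python sketch of one common way to detect it, a per-row
version counter, might look like the following. The Row type, the Conflict
exception, and apply_offline_update are hypothetical names for illustration;
this is not how PeerDirect or PITR work, only an assumption about what such a
check could look like.

```python
# Hypothetical sketch of conflict detection for read-write asynchronous sync.
from dataclasses import dataclass


@dataclass
class Row:
    value: str
    version: int = 0  # incremented on every accepted update


class Conflict(Exception):
    """Raised when an offline update is based on a stale row version."""


def apply_offline_update(server, key, new_value, base_version):
    """Apply an update that a client made offline against base_version."""
    row = server[key]
    if row.version != base_version:
        # Someone else changed the row since this client last synced.
        raise Conflict(
            f"{key}: expected version {base_version}, found {row.version}"
        )
    server[key] = Row(new_value, row.version + 1)
```

A read-only slave never calls anything like apply_offline_update, which is
why that level is so much easier to offer first.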

> > 2 If we are going to have replication, can we have built in load
> > balancing? Is it a good idea to have it in postgresql or a
> > separate application would be way to go?
>
> Well, because Postgres-R is multi-master, it has automatic load
> balancing. You can have your users point to whatever node you want.
> You can implement this "pointing" by using dns IP address cycling, or
> have a router that auto-load balances, though you would need to keep a
> db session on the same node, of course.

Hmm. Regarding the point above, i.e. combining data partitioning and slave
sync: will it treat a partitioned cluster as a single server, or can that
cluster take care of itself in such situations?

>
> So, in summary, I think we will eventually have two directions for
> replication. One is Postgres-R for multi-master, synchronous
> replication, and PITR, for asynchronous replication. I don't think

I would put that as two options rather than directions. We need to be able to
deploy them both if required.

Imagine PostgreSQL running over a 500-machine cluster.. ;-)

Shridhar
