Re: Horizontal scalability/sharding

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Horizontal scalability/sharding
Date: 2015-09-03 03:01:31
Message-ID: 20150903030131.GA3789@momjian.us
Lists: pgsql-hackers

On Wed, Sep 2, 2015 at 09:03:25PM -0400, Robert Haas wrote:
> > Can you explain why logical replication is better than binary
> > replication for this use-case?
>
> Uh, well, for the same reasons it is better in many other cases.
> Particularly, you probably don't want to replicate all the data on
> machine A to machine B, just some of it.
>
> Typically, sharding solutions store multiple copies of each piece of
> data. So let's say you have 4 machines. You divide the data into 12
> chunks. Each machine is the write-master for 2 of those chunks, but
> has secondary copies of 3 others. So maybe things start out like
> this:
>
> machine #1: master for chunks 1, 2, 3; also has copies of chunks 4, 7, 10
> machine #2: master for chunks 4, 5, 6; also has copies of chunks 1, 8, 11
> machine #3: master for chunks 7, 8, 9; also has copies of chunks 2, 5, 12
> machine #4: master for chunks 10, 11, 12; also has copies of chunks 3, 6, 9
>
> If machine #1 is run over by a rabid triceratops, you can make machine
> #2 the master for chunk 1, machine #3 the master for chunk 2, and
> machine #4 the master for chunk 3. The write load therefore remains
> evenly divided. If you can only copy entire machines, you can't
> achieve that in this situation.

I see the advantage of this now. My original idea was that each shard
would have its own standby for disaster recovery, but your approach
above, which I know is typical, lets the shards back each other up.
You could make shard 2 the backup for shard 1, but then if shard 1
goes bad, the entire workload of shard 1 moves to shard 2. With the
above approach, the load of shard 1 is spread across all the remaining
shards.
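
To make the redistribution concrete, here is a minimal sketch in Python of
the 4-machine / 12-chunk layout above and of how write-mastership moves when
a machine fails. The placement map and the fail_over() helper are purely
illustrative assumptions for this sketch; nothing here corresponds to an
actual PostgreSQL API.

# Sketch of the 4-machine / 12-chunk layout described above.
# chunk -> (write-master machine, machines holding secondary copies)
placement = {
    1:  (1, [2]),  2:  (1, [3]),  3:  (1, [4]),
    4:  (2, [1]),  5:  (2, [3]),  6:  (2, [4]),
    7:  (3, [1]),  8:  (3, [2]),  9:  (3, [4]),
    10: (4, [1]), 11: (4, [2]), 12: (4, [3]),
}

def fail_over(placement, dead_machine):
    """For every chunk mastered by the dead machine, promote one of its
    surviving secondaries to write-master. Because those secondaries live
    on different machines, the extra write load is spread across the
    survivors instead of landing on a single standby."""
    new_placement = {}
    for chunk, (master, copies) in placement.items():
        copies = [m for m in copies if m != dead_machine]
        if master == dead_machine:
            master = copies.pop(0)   # promote a secondary; re-replicate later
        new_placement[chunk] = (master, copies)
    return new_placement

after = fail_over(placement, dead_machine=1)
for chunk in (1, 2, 3):
    print("chunk %d: new master is machine %d" % (chunk, after[chunk][0]))
# chunk 1 -> machine 2, chunk 2 -> machine 3, chunk 3 -> machine 4

With whole-machine (binary) replication you could only promote one standby
for all of machine 1's chunks, which is exactly the uneven-load case
described above.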

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
