From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Horizontal scalability/sharding
Date: 2015-09-01 18:36:03
Message-ID: 55E5F013.1020803@2ndquadrant.com
Lists: pgsql-hackers
Hi,
On 08/31/2015 10:16 PM, Josh Berkus wrote:
> It's also important to recognize that there are three major use-cases
> for write-scalable clustering:
>
> * OLTP: small-medium cluster, absolute ACID consistency,
>   bottlenecked on small writes per second
> * DW: small-large cluster, ACID optional,
> bottlenecked on bulk reads/writes
> * Web: medium to very large cluster, ACID optional,
> bottlenecked on # of connections
>
> We cannot possibly solve all of the above at once, but to the extent
> that we recognize all 3 use cases, we can build core features which
> can be adapted to all of them.
It would be good to have a discussion about use cases first - each of us
is mostly concerned with the use cases they're dealing with, and with
bottlenecks specific to their environment. These three basic use cases
seem like a good start, but some of the details certainly don't match my
experience ...
For example, I can't see how ACID can be optional for the DWH use case,
but maybe there's a good explanation - I can imagine sacrificing various
ACID properties at the node level, but I can't really imagine
sacrificing any of the ACID properties for the cluster as a whole. So
this would deserve some explanation.
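To make the node-level vs. cluster-level distinction concrete: keeping atomicity for the cluster as a whole means a distributed transaction must commit on every shard or on none of them, which is exactly what a two-phase commit protocol provides. The following is a minimal illustrative sketch (not PostgreSQL internals; the Shard class and its methods are hypothetical) of what giving that up would mean:

```python
# Hypothetical sketch of two-phase commit across shards: cluster-level
# atomicity means either every shard commits or none does. Dropping this
# is what "sacrificing ACID for the cluster as a whole" would imply.

class Shard:
    def __init__(self, name):
        self.name = name
        self.prepared = False
        self.committed = False

    def prepare(self, ok=True):
        # Phase 1: the shard durably promises it will be able to commit.
        self.prepared = ok
        return ok

    def commit(self):
        # Phase 2: reached only after every shard prepared successfully.
        assert self.prepared
        self.committed = True

    def abort(self):
        self.prepared = False


def two_phase_commit(shards, failures=()):
    # If any shard fails to prepare, abort everywhere; otherwise commit
    # everywhere. No shard ends up committed while another aborted.
    if all(s.prepare(ok=s.name not in failures) for s in shards):
        for s in shards:
            s.commit()
        return True
    for s in shards:
        s.abort()
    return False
```

Sacrificing atomicity only at the node level (e.g. relaxed durability on a replica) leaves this cluster-wide protocol intact; sacrificing it at the cluster level would allow a transaction to be half-committed.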
I also don't share the view that write scalability is the only (or even
the main) issue that we should aim to solve. For the business-intelligence
use cases I've been working on recently, handling complex read-only
ad-hoc queries is often much more important, and in those cases the
bottleneck is usually CPU and/or RAM.
>
> I'm also going to pontificate that, for a future solution, we should
> not focus on write *IO*, but rather on CPU and RAM. The reason for
> this thinking is that, with the latest improvements in hardware and
> 9.5 improvements, it's increasingly rare for machines to be
> bottlenecked on writes to the transaction log (or the heap). This has
> some implications for system design. For example, solutions which
> require all connections to go through a single master node do not
> scale sufficiently to be worth bothering with.
+1
> On some other questions from Mason:
>
>> Do we want multiple copies of shards, like the pg_shard approach? Or
>> keep things simpler and leave it up to the DBA to add standbys?
>
> We want multiple copies of shards created by the sharding system itself.
> Having a separate, and completely orthogonal, redundancy system to the
> sharding system is overly burdensome on the DBA and makes low-data-loss
> HA impossible.
IMHO it'd be quite unfortunate if the design made it impossible to
combine those two features (e.g. creating standbys for shards and
failing over to them).

It's true that solving HA at the sharding level (by keeping multiple
copies of each shard) may be simpler than combining sharding and
standbys, but I don't see why the latter makes low-data-loss HA impossible.
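What I have in mind is something like the following sketch (names and structure entirely hypothetical): the sharding layer keeps one copy per shard, and HA comes from per-shard standbys (e.g. streaming replication), with failover just updating the shard map:

```python
# Hypothetical shard map: each shard has a single primary plus standbys
# maintained by ordinary replication, not by the sharding layer itself.
# Failover promotes a standby and repoints the map entry.

shard_map = {
    0: {"primary": "node1:5432", "standbys": ["node2:5432"]},
    1: {"primary": "node3:5432", "standbys": ["node4:5432"]},
}

def fail_over(shard_map, shard_id):
    """Promote the first standby of a shard to be its new primary."""
    entry = shard_map[shard_id]
    if not entry["standbys"]:
        raise RuntimeError("no standby available for shard %d" % shard_id)
    entry["primary"] = entry["standbys"].pop(0)
    return entry["primary"]
```

With synchronous replication to the standby, losing the primary of a shard need not lose any committed data - which is why I don't see standby-based HA as inherently lossy.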
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services