Re: I'd like to discuss scaleout at PGCon

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, MauMau <maumau307(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: I'd like to discuss scaleout at PGCon
Date: 2018-06-22 18:28:58
Message-ID: CAHyXU0xzrVfSVJTrU6g6dBhV25oM7k1m+tT75TJOK_hb4Mu2sg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 22, 2018 at 12:34 PM Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> On Fri, Jun 1, 2018 at 11:29:43AM -0500, Merlin Moncure wrote:
> > FWIW, Distributed analytical queries is the right market to be in.
> > This is the field in which I work, and this is where the action is at.
> > I am very, very, sure about this. My view is that many of the
> > existing solutions to this problem (in particular hadoop class
> > solutions) have major architectural downsides that make them
> > inappropriate in use cases that postgres really shines at; direct
> > hookups to low latency applications for example. postgres is
> > fundamentally a more capable 'node' with its multiple man-millennia of
> > engineering behind it. Unlimited vertical scaling (RAC etc) is
> > interesting too, but this is not the way the market is moving as
> > hardware advancements have reduced or eliminated the need for that in
> > many spheres.
> >
> > The direction of the project is sound and we are on the cusp of the
> > point where multiple independent coalescing features (FDW, logical
> > replication, parallel query, executor enhancements) will open new
> > scaling avenues that will not require trading off the many other
> > benefits of SQL that competing contemporary solutions might. The
> > broader development market is starting to realize this and that is a
> > major driver of the recent upswing in popularity. This is benefiting
> > me tremendously personally due to having gone 'all-in' with postgres
> > almost 20 years ago :-D. (Time sure flies) These are truly
> > wonderful times for the community.
>
> While I am glad people know a lot about how other projects handle
> sharding, these can be only guides to how Postgres will handle such
> workloads. I think we need to get to a point where we have all of the
> minimal sharding-specific code features done, at least as
> proof-of-concept, and then test Postgres with various workloads like
> OLTP/OLAP and read-write/read-only. This will tell us where
> sharding-specific code will have the greatest impact.
>
> What we don't want to do is to add a bunch of sharding-specific code
> without knowing which workloads it benefits, and how many of our users
> will actually use sharding. Some projects have done that, and it
> didn't end well since they then had a lot of product complexity with
> little user value.

Key features from my perspective:
*) FDW in parallel. How do I do it today? Crude, hand-rolled parallel
queries with asynchronous dblink

*) column store

*) automatic partition management through shards
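
For readers unfamiliar with the asynchronous dblink workaround named in the first item, here is a minimal sketch of the general scatter/gather pattern. It is an illustration only, not the code referred to in this message. It generates the SQL a coordinator session would run, using the dblink functions dblink_connect, dblink_send_query, dblink_get_result and dblink_disconnect. The helper names, shard names, connection strings and the remote query are all hypothetical.

```python
"""Sketch: build the SQL for a scatter/gather query over asynchronous dblink."""


def quote_literal(value):
    """Quote a string as a SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def scatter_gather_sql(shards, remote_query, result_columns):
    """Return the SQL statements, in execution order, for one fan-out query.

    shards:         dict of connection name -> libpq connection string
                    (hypothetical values; adjust for a real cluster)
    remote_query:   SQL text that every shard runs
    result_columns: column definition list for the returned records,
                    for example "region text, total bigint"
    """
    statements = []

    # 1. Open one named connection per shard.
    for name, connstr in shards.items():
        statements.append(
            f"SELECT dblink_connect({quote_literal(name)}, {quote_literal(connstr)});"
        )

    # 2. Dispatch the query to every shard. dblink_send_query returns
    #    without waiting for the result, so the shards work concurrently.
    for name in shards:
        statements.append(
            f"SELECT dblink_send_query({quote_literal(name)}, {quote_literal(remote_query)});"
        )

    # 3. Gather: each dblink_get_result call waits for its own shard.
    gather = "\nUNION ALL\n".join(
        f"SELECT * FROM dblink_get_result({quote_literal(name)}) AS t({result_columns})"
        for name in shards
    )
    statements.append(gather + ";")

    # 4. Close the connections. (Reusing a connection instead would
    #    first require draining it with further dblink_get_result calls.)
    for name in shards:
        statements.append(f"SELECT dblink_disconnect({quote_literal(name)});")

    return statements


if __name__ == "__main__":
    # Hypothetical two-shard setup.
    demo_shards = {
        "shard1": "host=node1 dbname=sales",
        "shard2": "host=node2 dbname=sales",
    }
    for stmt in scatter_gather_sql(
        demo_shards,
        "SELECT region, sum(amount) FROM orders GROUP BY region",
        "region text, total numeric",
    ):
        print(stmt)
```

The total wait is roughly that of the slowest shard rather than the sum of all shards, which is the point of the workaround. The coordinator still has to manage connections and merge partial results by hand.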

probably some more, gotta run :-)

merlin
