Re: Horizontal scalability/sharding

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Mason S <masonlists(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Horizontal scalability/sharding
Date: 2015-09-01 22:15:08
Message-ID: CAM3SWZQ_4Emn5MQG3vAAtfGWCG2CZx3ROuokxaHV7zSDewxMbA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 1, 2015 at 10:17 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Sep 1, 2015 at 1:06 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> You're assuming that our primary bottleneck for writes is IO. It's not
>> at present for most users, and it certainly won't be in the future. You
>> need to move your thinking on systems resources into the 21st century,
>> instead of solving the resource problems from 15 years ago.
>
> Your experience doesn't match mine. I find that it's frequently
> impossible to get the system to use all of the available CPU capacity,
> either because you're bottlenecked on locks or because you are
> bottlenecked on the I/O subsystem, and with the locking improvements
> in newer versions, the former is becoming less and less common.

I think you're both right. I think that we need to fix the buffer
manager, to make its caching algorithm smarter. Since we rely so
heavily on the filesystem cache, this is particularly important for
PostgreSQL. We need to remember usage information for evicted blocks
for some period of time afterwards. I suspect that this problem is
largely specific to Postgres.
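
A minimal sketch of "remembering usage information for evicted
blocks", in Python rather than C for brevity. Everything in it is a
hypothetical illustration (the GhostAwareCache class, the MAX_USAGE
cap, the lowest-usage-count eviction rule); it is not how the
PostgreSQL buffer manager is implemented. The only point is that a
block which was evicted recently resumes from its old usage count when
it is read back in, instead of starting over as if it had never been
seen.

```python
from collections import OrderedDict


class GhostAwareCache:
    """Toy buffer cache that remembers usage counts of evicted blocks.

    Hypothetical illustration only: names and policy are assumptions
    and do not mirror PostgreSQL's buffer manager.
    """

    MAX_USAGE = 5  # assumed cap on the per-block usage counter

    def __init__(self, capacity, ghost_capacity):
        self.capacity = capacity
        self.ghost_capacity = ghost_capacity
        self.resident = OrderedDict()  # block_id -> usage count
        self.ghosts = OrderedDict()    # evicted block_id -> remembered count

    def access(self, block_id):
        """Touch a block; return True on a cache hit, False on a miss."""
        if block_id in self.resident:
            bumped = self.resident[block_id] + 1
            self.resident[block_id] = min(bumped, self.MAX_USAGE)
            return True
        if len(self.resident) >= self.capacity:
            self._evict()
        # A recently evicted block resumes from its remembered count;
        # a block with no history starts from zero.
        remembered = self.ghosts.pop(block_id, 0)
        self.resident[block_id] = min(remembered + 1, self.MAX_USAGE)
        return False

    def _evict(self):
        """Evict the lowest-usage block (oldest entry wins ties)."""
        victim = min(self.resident, key=self.resident.get)
        self.ghosts[victim] = self.resident.pop(victim)
        if len(self.ghosts) > self.ghost_capacity:
            self.ghosts.popitem(last=False)  # forget the oldest ghost


cache = GhostAwareCache(capacity=2, ghost_capacity=4)
for block in ["a", "a", "a", "b", "c", "b"]:
    cache.access(block)
print(dict(cache.resident))
```

In this run "b" is evicted to make room for "c" and then read again,
so it comes back with a usage count of 2 rather than 1. A real design
would also need to decay the counts over time and bound the memory
spent on history, which this sketch only does crudely through
ghost_capacity.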

At the same time, I agree with Josh's assessment that long-term, we
are going to have the biggest problem with memory latency and memory
bandwidth, which are usually considered facets of CPU performance, and
with internal lock contention. Addressing the memory access bottleneck
dovetails with parallelism, in that the two must be considered
together. Josh's "21st century" remark seems quite justified to me.
For a further example of this, check out my latest progress with
external sorting, which I plan to post later in the week. I/O isn't
the big problem there at all, and I now think we can make external
sorts close to internal sorts in performance across the board.

I imagine that Josh's experience is based on workloads that mostly fit
in shared_buffers, so I can see why you'd disagree if that was
something you've seen less of. I'll quote Hellerstein and Stonebraker
in 2007 [1]:

"Copying data in memory can be a serious bottleneck. Copies contribute
latency, consume CPU cycles, and can flood the CPU data cache. This fact is
often a surprise to people who have not operated or implemented a
database system, and assume that main-memory operations are “free”
compared to disk I/O. But in practice, throughput in a well-tuned
transaction processing DBMS is typically not I/O-bound. This is
achieved in high-end installations by purchasing sufficient disks and
RAM so that repeated page requests are absorbed by the buffer pool,
and disk I/Os are shared across the disk arms at a rate that can feed
the data appetite of all the processors in the system."

>> Our real future bottlenecks are:
>>
>> * ability to handle more than a few hundred connections
>> * locking limits on the scalability of writes
>> * ability to manage large RAM and data caches
>
> I do agree that all of those things are problems, FWIW.

These seem like our long-term problems, to me.

[1] http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf, page 213
--
Peter Geoghegan
