Re: modeling parallel contention (was: Parallel Append implementation)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: modeling parallel contention (was: Parallel Append implementation)
Date: 2017-05-05 21:42:32
Message-ID: CAH2-Wzkhso9LTHKHW+KxZt=CEV1=T0wHpptkSz29oJcpDs02UQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 5, 2017 at 12:40 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> One idea that crossed my mind is to just have workers write all of
> their output tuples to a temp file and have the leader read them back
> in. At some cost in I/O, this would completely eliminate the overhead
> of workers waiting for the leader. In some cases, it might be worth
> it. At the least, it could be interesting to try a prototype
> implementation of this with different queries (TPC-H, maybe) and see
> what happens. It would give us some idea how much of a problem
> stalling on the leader is in practice. Wait event monitoring could
> possibly also be used to figure out an answer to that question.

The use of temp files in all cases was effective in my parallel
external sort patch, relative to what I imagine an approach built on a
Gather node would get you, but not because Gather is inherently slow.
I'm not so sure that it actually is, given the interface it supports.

While incremental, retail processing of each tuple is flexible and
composable, it will tend to be slow compared to an approach based on
batch processing (for tasks where you happen to be able to get away
with batch processing). This is true for all the usual reasons --
better locality of access, better branch prediction properties, lower
"effective instruction count" due to having very tight inner loops,
and so on.

I agree with Andres that we shouldn't put too much effort into
modelling concurrency ahead of optimizing serial performance. The
machine's *aggregate* memory bandwidth should be used as efficiently
as possible, and parallelism is just one (very important) tool for
making that happen.

--
Peter Geoghegan

