Re: Allow a per-tablespace effective_io_concurrency setting

From: Greg Stark <stark(at)mit(dot)edu>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
Subject: Re: Allow a per-tablespace effective_io_concurrency setting
Date: 2015-09-03 01:44:04
Message-ID: CAM-w4HPoEmnop203thQKpFvJqCYF4HNMsdvRSzx=5Bv=T_ngAQ@mail.gmail.com
Lists: pgsql-hackers pgsql-performance

That doesn't match any of the empirical tests I did at the time. I posted
graphs of the throughput for varying numbers of spindles with varying
amounts of prefetch. In every case more prefetching increased throughput, up
to N times the single-platter throughput where N was the number of spindles.

There can be a small speedup from overlapping CPU with I/O, but that's a
really small effect. At most it can hide a single request, and only rarely
would the amount of CPU time be even a moderate fraction of the I/O
latency. The only case where they're comparable is when you're reading
sequentially, and then hopefully you wouldn't be using Postgres prefetching
at all, since it's only really intended to help random I/O.
On 2 Sep 2015 23:38, "Andres Freund" <andres(at)anarazel(dot)de> wrote:

> On 2015-09-02 19:49:13 +0100, Greg Stark wrote:
> > I can take the blame for this formula.
> >
> > It's called the "Coupon Collector Problem". If you hit get a random
> > coupon from a set of n possible coupons, how many random coupons would
> > you have to collect before you expect to have at least one of each.
>
> My point is that that's just the entirely wrong way to model
> prefetching. Prefetching can be massively beneficial even if you only
> have a single platter! Even if there were no queues on the hardware or
> OS level! Concurrency isn't the right way to look at prefetching.
>
> You need to prefetch so far ahead that you'll never block on reading
> heap pages - and that's only the case if processing the next N heap
> blocks takes longer than the prefetch of the N+1 th page. That doesn't
> mean there continously have to be N+1 prefetches in progress - in fact
> that actually often will only be the case for the first few, after that
> you hopefully are bottlenecked on CPU.
>
> If you additionally take into account hardware realities where you have
> multiple platters, multiple spindles, command queueing etc, that's even
> more true. A single rotation of a single platter with command queuing
> can often read several non-consecutive blocks if they're on a similar
>
> - Andres
>
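For readers unfamiliar with the coupon collector problem mentioned above: the expected number of random draws needed to see all n distinct coupons at least once is n * H_n, where H_n is the nth harmonic number. A small sketch checking the closed form against a simulation (illustrative only; this is not the code PostgreSQL itself uses, and the choice of n = 4 spindles is an arbitrary example):

```python
import random

def expected_draws(n):
    """Closed form for the coupon collector problem: n * (1 + 1/2 + ... + 1/n)."""
    return n * sum(1.0 / i for i in range(1, n + 1))

def simulate(n, trials=20000, seed=42):
    """Average, over many trials, the draws needed to hit all n coupons once."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen = set()
        draws = 0
        while len(seen) < n:
            seen.add(rng.randrange(n))  # draw one random coupon
            draws += 1
        total += draws
    return total / trials

if __name__ == "__main__":
    n = 4  # e.g. 4 spindles, each block landing on a uniformly random one
    print(expected_draws(n))  # 4 * (1 + 1/2 + 1/3 + 1/4) = 8.333...
    print(simulate(n))        # should land close to the closed form
```

So to have touched each of 4 spindles at least once, you expect to need roughly 8–9 outstanding random requests, which is the intuition behind scaling the prefetch depth superlinearly in the drive count.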
