From: | Rosser Schwarz <rosser(dot)schwarz(at)gmail(dot)com> |
---|---|
To: | Glyn Astill <glynastill(at)yahoo(dot)co(dot)uk> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Linux I/O schedulers - CFQ & random seeks |
Date: | 2011-03-04 19:09:34 |
Message-ID: | AANLkTimCtLH12VOc4Tdo+zOqLARCQsXoVLx3oR-XOT2e@mail.gmail.com |
Lists: | pgsql-performance |
On Fri, Mar 4, 2011 at 10:34 AM, Glyn Astill <glynastill(at)yahoo(dot)co(dot)uk> wrote:
> I'm wondering (and this may be a can of worms) what people's opinions are on these schedulers? I'm going to have to do some real world testing myself with postgresql too, but initially was thinking of switching from our current CFQ back to deadline.
It was a few years ago now, but I went through a similar round of
testing, and thought CFQ was fine, until I deployed the box. It fell
on its face, hard. I can't find a reference offhand, but I remember
reading somewhere that CFQ is optimized more for desktop-type
workloads, and that in its efforts to ensure fair IO access for all
processes, it can actively interfere with high-concurrency workloads
like you'd expect to see on a DB server -- especially one as big as
your specs indicate. Then again, it's been a few years, so the
scheduler may have improved significantly in that span.
My standard approach since has just been to use noop. We've shelled
out enough money for a RAID controller, if not a SAN, so it seems
silly to me not to defer to the hardware, and let it do its job. With
big caches, command queueing, and direct knowledge of how the data is
laid out on the spindles, I'm hard-pressed to imagine a scenario where
the kernel is going to be able to do a better job of IO prioritization
than the controller.
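
For anyone wanting to check what a box is currently using: Linux exposes the scheduler for each block device through sysfs at /sys/block/<dev>/queue/scheduler, with the active one shown in square brackets. Below is a minimal Python sketch for reading it. The helper names and the device name "sda" are illustrative assumptions, not anything from the setup discussed in this thread.

```python
from pathlib import Path


def parse_schedulers(text):
    """Parse sysfs scheduler text such as 'noop deadline [cfq]'.

    Returns (active, available); active is None if nothing is bracketed.
    """
    active = None
    available = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return active, available


def current_scheduler(device):
    """Read the scheduler line for a block device, e.g. 'sda' (assumed name)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_schedulers(path.read_text())


if __name__ == "__main__":
    # Sample string in the sysfs format, so this runs without touching real hardware.
    print(parse_schedulers("noop deadline [cfq]"))
```

Switching is done by writing the scheduler name to the same file as root (for example, echo noop > /sys/block/sda/queue/scheduler). A change made that way does not persist across reboots.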
I'd absolutely recommend testing with pg, so you can get a feel for
how it behaves under real-world workloads. The critical thing there
is that your testing needs to create workloads that are in the
neighborhood of what you'll see in production. In my case, the final
round of testing included something like 15-20% of the user-base for
the app the db served, and everything seemed fine. Once we opened the
flood-gates, and all the users were hitting the new db, though,
nothing worked for anyone. Minute-plus page-loads across the board,
when people weren't simply timing out.
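
As a rough illustration of testing at several concurrency levels rather than just one, here is a small Python sketch that runs a number of simulated clients in parallel and reports latency figures. It is not the harness used in the testing described above. run_query is a hypothetical placeholder that just sleeps; in practice it would issue a representative query through whatever database driver is in use, and the client counts are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_query():
    """Placeholder for one real query against the test database (hypothetical)."""
    time.sleep(0.001)


def client_session(requests_per_client, query=run_query):
    """Act as one simulated user; return per-query latencies in seconds."""
    latencies = []
    for _ in range(requests_per_client):
        start = time.perf_counter()
        query()
        latencies.append(time.perf_counter() - start)
    return latencies


def load_test(clients, requests_per_client, query=run_query):
    """Run `clients` simulated users concurrently; return all latencies, sorted."""
    with ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [
            pool.submit(client_session, requests_per_client, query)
            for _ in range(clients)
        ]
        latencies = [lat for f in futures for lat in f.result()]
    return sorted(latencies)


if __name__ == "__main__":
    # Step the client count up toward the expected production level.
    for clients in (5, 25, 50):
        lat = load_test(clients, requests_per_client=10)
        median = lat[len(lat) // 2]
        p95 = lat[int(0.95 * (len(lat) - 1))]
        print(f"{clients} clients: median={median:.4f}s "
              f"p95={p95:.4f}s max={lat[-1]:.4f}s")
```

If latency climbs sharply between two steps, that is the range worth examining before production traffic arrives.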
As always, YMMV, the plural of anecdote isn't data, &c.
rls
--
:wq