Re: random_page_cost vs seq_page_cost

From: Benedikt Grundmann <bgrundmann(at)janestreet(dot)com>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: random_page_cost vs seq_page_cost
Date: 2012-01-11 08:26:52
Message-ID: 20120111082651.GV6419@ldn-qws-004.delacy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(replying just to you)
On 10/01/12 15:22, Greg Smith wrote:
> On 1/5/12 5:04 AM, Benedikt Grundmann wrote:
> That sort of thing is one reason why all attempts so far to set
> random_page_cost based on physical characteristics haven't gone
> anywhere useful. The setting is sort of overloaded right now, it's a
> fuzzy mix of true random seek cost blended with some notion of cache
> percentage. Trying to bring some measurements to bear on it is a less
> effective approach than what people actually do here. Monitor the
> profile of query execution, change the value, see what happens. Use
> that as feedback for what direction to keep going; repeat until
> you're just spinning with no improvements.
>
Thank you very much for the reply it is very interesting. I'm
excited to hear that documentation in that area will improve in
9.2. It's interesting postgres has remarkable good documentation
but it is a sufficiently complex system that to actually sensible
tune the knobs provided you have to understand quite a lot about
what is going on. A colleague of mine likes to say
"all abstractions leak", which seems very appropriate in this case.

> >We are not sure if the database used to choose differently
> >before the move to the new hardware and the hardware is
> >performing worse for random seeks. Or if the planner is
> >now making different choices.
>
> I don't recommend ever deploying new hardware without first doing
> some low-level benchmarks to validate its performance. Once stuff
> goes into production, you can't do that anymore. See
> http://www.2ndquadrant.com/en/talks/ for my hardware benchmarking
> talks if you'd like some ideas on what to collect.
>
We had actually done lots of tests on the sequential read performance.
But you are right we could have done better (and I'll definitely read
through your talks).

Did you see my follow up? Based on the feedback we did further tests
and It is now clear that neither the hardware nor the database version
are at fault. A different plan is chosen by both new and old database
version if spun up on the database as it is right now.

Our best guess is that the clusters we run after we had moved to the
hardware (it having more diskspace and faster sequential I/O making
it possible) changed the planners perception of how the joins will
perform in relation to each other.

Cheers,

Bene

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Benedikt Grundmann 2012-01-11 08:29:02 Re: random_page_cost vs seq_page_cost
Previous Message Pavel Stehule 2012-01-11 06:18:28 Re: JSON for PG 9.2