From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Matthew <matthew(at)flymine(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: RAID arrays and performance
Date: 2007-12-04 16:11:28
Message-ID: 47557C30.10809@mark.mielke.cc
Lists: pgsql-performance
Matthew wrote:
> On Tue, 4 Dec 2007, Mark Mielke wrote:
>
>>> The larger the set of requests, the closer the performance will scale to
>>> the number of discs
>>>
>> This assumes that you can know which pages to fetch ahead of time -
>> which you do not except for sequential read of a single table.
>>
> There are circumstances where it may be hard to locate all the pages ahead
> of time - that's probably when you're doing a nested loop join. However,
> if you're looking up in an index and get a thousand row hits in the index,
> then there you go. Page locations to load.
>
Sure.
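(To make that concrete: when PostgreSQL expects a lot of hits it will
usually choose a bitmap scan, which collects all of the matching TIDs
from the index and sorts them by heap page before it touches the heap
at all - so the page locations really are known up front. A rough
illustration, with made-up table and column names:

    -- hypothetical table; the plan shape is the point, not the names
    EXPLAIN ANALYZE
    SELECT * FROM orders WHERE customer_id = 4242;

With enough matching rows this typically comes out as a Bitmap Heap
Scan fed by a Bitmap Index Scan, i.e. the complete list of heap pages
to read exists before any heap I/O is issued.)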
>> Please show one of your query plans and how you as a person would design
>> which pages to request reads for.
>>
> How about the query that "cluster <skrald(at)amossen(dot)dk>" was trying to get
> to run faster a few days ago? Tom Lane wrote about it:
>
> | Wouldn't help, because the accesses to "questions" are not the problem.
> | The query's spending nearly all its time in the scan of "posts", and
> | I'm wondering why --- doesn't seem like it should take 6400msec to fetch
> | 646 rows, unless perhaps the data is just horribly misordered relative
> | to the index.
>
> Which is exactly what's going on. The disc is having to seek 646 times
> fetching a single row each time, and that takes 6400ms. He obviously has a
> standard 5,400 or 7,200 rpm drive with a seek time around 10ms.
>
Your proposal would not necessarily improve his case unless he also
purchased additional disks, at which point his execution time might
well be different. More speculation. :-)
The seek explanation seems reasonable - but it is still a guess.
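(The arithmetic does line up, at least: 646 random reads at roughly
10 ms per seek is about 6.5 seconds, which matches the reported 6400 ms.)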
> Or on a similar vein, fill a table with completely random values, say ten
> million rows with a column containing integer values ranging from zero to
> ten thousand. Create an index on that column, analyse it. Then pick a
> number between zero and ten thousand, and
>
> "SELECT * FROM table WHERE that_column = the_number_you_picked
This isn't a real use case. Optimizing for the worst-case scenario is
not always valuable.
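That said, if anyone wants to watch it happen, the setup is quick to
sketch out (rough SQL, with made-up table and column names to match
the description above):

    CREATE TABLE random_table (that_column integer);
    INSERT INTO random_table
        SELECT (random() * 10000)::integer
        FROM generate_series(1, 10000000);
    CREATE INDEX random_table_idx ON random_table (that_column);
    ANALYZE random_table;
    -- roughly a thousand matching rows, scattered over about as many heap pages
    SELECT * FROM random_table WHERE that_column = 4567;

Each matching row lands on a more or less random heap page, so an
index scan degenerates into one seek per row - exactly the worst case
being described.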
Cheers,
mark
--
Mark Mielke <mark(at)mielke(dot)cc>