From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Matthew <matthew(at)flymine(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: RAID arrays and performance
Date: 2007-12-04 16:11:28
Message-ID: 47557C30.10809@mark.mielke.cc
Lists: pgsql-performance
Matthew wrote:
> On Tue, 4 Dec 2007, Mark Mielke wrote:
>
>>> The larger the set of requests, the closer the performance will scale to
>>> the number of discs
>>>
>> This assumes that you can know which pages to fetch ahead of time -
>> which you do not except for sequential read of a single table.
>>
> There are circumstances where it may be hard to locate all the pages ahead
> of time - that's probably when you're doing a nested loop join. However,
> if you're looking up in an index and get a thousand row hits in the index,
> then there you go. Page locations to load.
>
Sure.
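(To make that concrete: when PostgreSQL expects a lot of hits it will
usually choose a bitmap scan, which collects all of the matching TIDs
from the index and sorts them by heap page before it touches the heap
at all - so the page locations really are known up front. A rough
illustration, with made-up table and column names:

    -- hypothetical table; the plan shape is the point, not the names
    EXPLAIN ANALYZE
    SELECT * FROM orders WHERE customer_id = 4242;

With enough matching rows this typically comes out as a Bitmap Heap
Scan fed by a Bitmap Index Scan, i.e. the complete list of heap pages
to read exists before any heap I/O is issued.)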
>> Please show one of your query plans and how you as a person would design
>> which pages to request reads for.
>>
> How about the query that "cluster <skrald(at)amossen(dot)dk>" was trying to get
> to run faster a few days ago? Tom Lane wrote about it:
>
> | Wouldn't help, because the accesses to "questions" are not the problem.
> | The query's spending nearly all its time in the scan of "posts", and
> | I'm wondering why --- doesn't seem like it should take 6400msec to fetch
> | 646 rows, unless perhaps the data is just horribly misordered relative
> | to the index.
>
> Which is exactly what's going on. The disc is having to seek 646 times
> fetching a single row each time, and that takes 6400ms. He obviously has a
> standard 5,400 or 7,200 rpm drive with a seek time around 10ms.
>
Your proposal would not necessarily improve his case unless he also
purchased additional disks, at which point his execution time might
well be different. More speculation. :-)
The seek explanation seems reasonable - but it is still a guess.
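(The arithmetic does line up, at least: 646 random reads at roughly
10 ms per seek is about 6.5 seconds, which matches the reported 6400 ms.)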
> Or on a similar vein, fill a table with completely random values, say ten
> million rows with a column containing integer values ranging from zero to
> ten thousand. Create an index on that column, analyse it. Then pick a
> number between zero and ten thousand, and
>
> "SELECT * FROM table WHERE that_column = the_number_you_picked
This isn't a real use case. Optimizing for the worst-case scenario is
not always valuable.
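That said, if anyone wants to watch it happen, the setup is quick to
sketch out (rough SQL, with made-up table and column names to match
the description above):

    CREATE TABLE random_table (that_column integer);
    INSERT INTO random_table
        SELECT (random() * 10000)::integer
        FROM generate_series(1, 10000000);
    CREATE INDEX random_table_idx ON random_table (that_column);
    ANALYZE random_table;
    -- roughly a thousand matching rows, scattered over about as many heap pages
    SELECT * FROM random_table WHERE that_column = 4567;

Each matching row lands on a more or less random heap page, so an
index scan degenerates into one seek per row - exactly the worst case
being described.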
Cheers,
mark
--
Mark Mielke <mark(at)mielke(dot)cc>