| From: | Greg Stark <gsstark(at)mit(dot)edu> | 
|---|---|
| To: | Hannu Krosing <hannu(at)skype(dot)net> | 
| Cc: | Greg Stark <gsstark(at)mit(dot)edu>, josh(at)agliodbs(dot)com, David Fetter <david(at)fetter(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: More thoughts about planner's cost estimates | 
| Date: | 2006-06-03 21:38:35 | 
| Message-ID: | 874pz1din8.fsf@stark.xeocode.com | 
| Lists: | pgsql-hackers | 
Hannu Krosing <hannu(at)skype(dot)net> writes:
> Disks can read at full rotation speed, so skipping (not reading) some
> blocks will not make reading the remaining blocks from the same track
> faster. And if there are more than 20 8k pages per track, you still have
> a very high probability you need to read all tracks.
Well, if there are exactly 20 you would expect a 50% chance of having to read
a track so you would expect double the effective bandwidth. It would have to
be substantially bigger to not have any noticeable effect.
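
As a rough check on the figure above, here is a minimal sketch that computes the chance a track contains at least one sampled block. It assumes each block is sampled independently with the same probability, which is a simplification not stated in the thread; the function name and parameters are illustrative only.

```python
def track_read_probability(sample_fraction, blocks_per_track):
    """Chance that a track holds at least one sampled block.

    Assumes every block is sampled independently with probability
    `sample_fraction` (an assumption made for this sketch).
    """
    # The track can be skipped only if none of its blocks were sampled.
    p_skip = (1.0 - sample_fraction) ** blocks_per_track
    return 1.0 - p_skip


# 5% sample with 20 blocks per track, the case discussed above.
p = track_read_probability(0.05, 20)
print(f"P(track must be read) = {p:.3f}")
print(f"Implied bandwidth gain from skipped tracks = {1.0 / p:.2f}x")
```

Under that independence assumption the chance comes out somewhat above 50% for exactly 20 blocks per track, so the gain would be a bit under double, but the qualitative point holds: the track would need to hold many more blocks before skipping stops helping.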
> You may be able to move to the next track a little earlier compared to
> reading all blocks, but then you are likely to miss the block from next
> track and have to wait a full rotation.
No, I don't think that works out. You always have a chance of missing the
block from the next track and having to wait a full rotation and your chance
isn't increased or decreased by seeking earlier in the rotation. So you would
expect each track to take <20 block reads> less time on average.
> Your test program could have got a little better results, if you had
> somehow managed to tell the system all the block numbers to read in one
> go, not each time the next one after getting the previous one.
I was trying to simulate the kind of read pattern that postgres generates
which I believe looks like that.
> The fact that 5% was not slower than seqscan seems to indicate that
> actually all track reads were cached inside the disk or controller.
I dunno, your first explanation seemed pretty convincing and doesn't depend on
specific assumptions about the caching. Moreover this doesn't explain why you
*do* get a speedup when reading less than 5%.
Perhaps what this indicates is that the real meat is in track sampling, not
block sampling.
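
To make the distinction concrete, here is a hypothetical sketch of track sampling: pick whole tracks at random and read every block in each one. It assumes a fixed number of blocks per track and that block numbers map directly onto tracks, neither of which holds on real drives (zoned recording, remapping, and filesystem layout all interfere); the function and its parameters are invented for illustration.

```python
import random


def sample_by_track(total_blocks, blocks_per_track, fraction, seed=0):
    """Return block numbers covering a random subset of whole tracks.

    Assumes a fixed `blocks_per_track` and a direct block-to-track
    mapping (both are simplifications for this sketch).
    """
    rng = random.Random(seed)
    n_tracks = total_blocks // blocks_per_track  # ignores a partial last track
    n_pick = max(1, round(n_tracks * fraction))
    tracks = sorted(rng.sample(range(n_tracks), n_pick))
    # Read every block of each chosen track, in on-disk order.
    return [t * blocks_per_track + b
            for t in tracks
            for b in range(blocks_per_track)]


blocks = sample_by_track(total_blocks=2000, blocks_per_track=20, fraction=0.05)
print(len(blocks), "blocks read from", len(blocks) // 20, "tracks")
```

The same number of blocks is read as in a 5% block sample, but they fall on 5% of the tracks instead of being spread over most of them.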
-- 
greg