From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Curt Sampson <cjs(at)cynic(dot)net> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Sequential Scan Read-Ahead |
Date: | 2002-04-24 14:08:49 |
Message-ID: | 200204241408.g3OE8nI10801@candle.pha.pa.us |
Lists: | pgsql-hackers |
Curt Sampson wrote:
> At 12:41 PM 4/23/02 -0400, Bruce Momjian wrote:
>
> >This is an interesting point, that an index scan may fit in the cache
> >while a sequential scan may not.
>
> If so, I would expect that the number of pages read is significantly
> smaller than it was with a sequential scan. If that's the case,
> doesn't that mean that the optimizer made the wrong choice anyway?
>
> BTW, I just did a quick walk down this chain of code to see what happens
> during a sequential scan:
>
> access/heap/heapam.c
> storage/buffer/bufmgr.c
> storage/smgr/smgr.c
> storage/smgr/md.c
>
> and it looks to me like individual reads are being done in BLKSIZE
> chunks, whether we're scanning or not.
>
> During a sequential scan, I've heard that it's more efficient to
> read in multiples of your blocksize, say, 64K chunks rather than
> 8K chunks, for each read operation you pass to the OS. Does anybody
> have any experience to know if this is indeed the case? Has anybody
> ever added this to postgresql and benchmarked it?
>
> Certainly if there's a transaction based limit on disk I/O, as well
> as a throughput limit, it would be better to read in larger chunks.
We expect the file system to do read-aheads during a sequential scan.
This will not happen if someone else is also reading buffers from that
table in another place.
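
For anyone who wants to try the 8K versus 64K question from the quoted
message, here is a minimal sketch in C. It is not PostgreSQL code:
scan_file() is a hypothetical helper that reads a file front to back
with one read() per chunk and reports how many read() calls returned
data. BLCKSZ is defined locally as 8192, which is the default PostgreSQL
block size and an assumption about your build. The sketch only counts
system calls; whether fewer, larger reads are actually faster depends on
the kernel's read-ahead and has to be timed on the system in question.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define BLCKSZ 8192             /* assumed: PostgreSQL's default block size */

/*
 * Hypothetical helper, not part of PostgreSQL.
 *
 * Read the file at "path" sequentially, passing "chunk" bytes to each
 * read() call.  Returns the number of read() calls that returned data,
 * or -1 on error.  If total_bytes is not NULL, it receives the number
 * of bytes read.
 */
long
scan_file(const char *path, size_t chunk, long long *total_bytes)
{
    char       *buf;
    int         fd;
    long        calls = 0;
    long long   total = 0;
    ssize_t     n;

    if (chunk == 0)
        return -1;

    buf = malloc(chunk);
    if (buf == NULL)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
    {
        free(buf);
        return -1;
    }

    /* One read() per chunk until end of file (0) or error (-1). */
    while ((n = read(fd, buf, chunk)) > 0)
    {
        calls++;
        total += n;
    }

    close(fd);
    free(buf);

    if (n < 0)
        return -1;
    if (total_bytes != NULL)
        *total_bytes = total;
    return calls;
}
```

Calling scan_file(path, BLCKSZ, &bytes) and then
scan_file(path, 8 * BLCKSZ, &bytes) on the same large file, each wrapped
in a timer and each starting from a cold cache, gives a rough comparison
of the two read sizes. If the file system is already reading ahead, the
difference should mostly be per-call overhead.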
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026