From: | "Zeugswetter Andreas ADI SD" <ZeugswetterA(at)spardat(dot)at> |
---|---|
To: | "CK Tan" <cktan(at)greenplum(dot)com>, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> |
Cc: | "Luke Lonergan" <LLonergan(at)greenplum(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>, "Jeff Davis" <pgsql(at)j-davis(dot)com>, "Simon Riggs" <simon(at)enterprisedb(dot)com> |
Subject: | Re: Seq scans roadmap |
Date: | 2007-05-10 10:14:11 |
Message-ID: | E1539E0ED7043848906A8FF995BDA57901FD995A@m0143.s-mxs.net |
Lists: | pgsql-hackers |
> In reference to the seq scans roadmap, I have just submitted
> a patch that addresses some of the concerns.
>
> The patch does this:
>
> 1. for small relations (smaller than 60% of bufferpool), use
> the current logic
> 2. for big relations:
> - use a ring buffer in heap scan
> - pin first 12 pages when scan starts
> - on consumption of every 4 pages, read and pin the next 4 pages
> - invalidate used pages in the scan so they do not
> force out other useful pages
A few comments regarding the effects:
I do not see how this speedup could be caused by readahead, so what are
the effects?
(It should make no difference to do the CPU work for count(*) in between
reading each block when the pages are not dirtied.)
Is the improvement solely reduced CPU because no search for a free
buffer is needed, and/or L2 cache locality?
What effect does the advance pinning have? Is it to avoid vacuum?
A 16 x 8k page ring (=128k) is too small to allow the needed IO blocksize of
256k.
The readahead is done 4 x one page at a time (=32k).
What is the reasoning behind 1/4 ring for readahead (why not 1/2)? Is
3/4 the trail for followers and bgwriter?
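To make the sizes above concrete, here is a small arithmetic sketch in Python. It assumes the default 8k page size and takes the 16-page ring, the 4-page readahead step and the 256k IO blocksize from the text above; the function names are made up for illustration and are not part of the patch.

```python
BLOCK_SIZE = 8 * 1024  # assumption: default 8k page size


def kib(pages, block_size=BLOCK_SIZE):
    """Size in KiB covered by `pages` pages."""
    return pages * block_size // 1024


def pages_needed(io_kib, block_size=BLOCK_SIZE):
    """Whole pages needed to cover a single IO of `io_kib` KiB."""
    return -(-io_kib * 1024 // block_size)  # ceiling division


ring_pages = 16       # ring size discussed above
readahead_pages = 4   # pages read per readahead step
target_io_kib = 256   # IO blocksize asked for above

print("ring size (k):      ", kib(ring_pages))
print("readahead step (k): ", kib(readahead_pages))
print("pages per 256k IO:  ", pages_needed(target_io_kib))
```

Under these assumptions the whole ring covers 128k, one readahead step covers 32k, and a single 256k IO would span 32 pages, which is twice the current ring.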
I think in anticipation of doing a single IO call for more than one
page, the KillAndReadBuffer function should be split into two parts: one
that does the killing for n pages, and one that does the reading for n
pages.
Killing n before reading n would also have the positive effect of
grouping perhaps needed writes (not interleaving them with the reads).
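A toy sketch of that split, in Python rather than C, only to show the ordering of operations. The names `Slot`, `kill_buffers` and `read_buffers` are hypothetical and do not reflect the actual signature of KillAndReadBuffer in the patch; a "write" or "read" here is just an entry appended to a log.

```python
class Slot:
    """One ring slot: the block it holds and whether it is dirty."""

    def __init__(self, block, dirty=False):
        self.block = block
        self.dirty = dirty


def kill_buffers(ring, start, n, log):
    """Hypothetical 'kill' half: write out dirty slots, then invalidate n slots."""
    for i in range(start, start + n):
        slot = ring[i % len(ring)]
        if slot.dirty:
            log.append(("write", slot.block))
            slot.dirty = False
        slot.block = -1  # slot is now free for reuse


def read_buffers(ring, start, n, first_block, log):
    """Hypothetical 'read' half: fill n slots with consecutive blocks."""
    log.append(("read", first_block, n))  # stands in for one n-page IO call
    for k in range(n):
        ring[(start + k) % len(ring)].block = first_block + k


# 16-slot ring holding blocks 0..15; even blocks are marked dirty for the demo.
ring = [Slot(b, dirty=(b % 2 == 0)) for b in range(16)]
log = []
kill_buffers(ring, 0, 4, log)       # all writes for the 4 slots happen first
read_buffers(ring, 0, 4, 16, log)   # then one read covering blocks 16..19
print(log)
```

In this toy run the log shows the two writes grouped ahead of a single 4-page read, instead of write/read pairs alternating per page.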
I think the 60% Nbuffers is a very good starting point. I would only
introduce a GUC when we see evidence that it is needed (I agree with
Simon's partitioning comments, but I'd still wait and see).
Andreas