From: | Heikki Linnakangas <heikki(at)enterprisedb(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Jeff Davis <pgsql(at)j-davis(dot)com>, Jim Nasby <decibel(at)decibel(dot)org>, Luke Lonergan <LLonergan(at)greenplum(dot)com>, Grzegorz Jaskiewicz <gj(at)pointblue(dot)com(dot)pl>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Doug Rady <drady(at)greenplum(dot)com>, Sherry Moore <sherry(dot)moore(at)sun(dot)com> |
Subject: | Re: Bug: Buffer cache is not scan resistant |
Date: | 2007-03-06 18:47:35 |
Message-ID: | 45EDB747.90003@enterprisedb.com |
Lists: | pgsql-hackers |
Tom Lane wrote:
> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
>> If I were to implement this idea, I think Heikki's bitmap of pages
>> already read is the way to go.
>
> I think that's a good way to guarantee that you'll not finish in time
> for 8.3. Heikki's idea is just at the handwaving stage at this point,
> and I'm not even convinced that it will offer any win. (Pages in
> cache will be picked up by a seqscan already.)
The scenario that I'm worried about is that you have a table that's
slightly larger than RAM. If you issue many seqscans on that table, one
at a time, every seqscan will have to read the whole table from disk,
even though, say, 90% of it is in cache when the scan starts.
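
A minimal sketch of that scenario, assuming a toy strict-LRU cache (PostgreSQL's actual buffer replacement policy differs, so this only illustrates the access pattern). The names `seqscan_hits`, `TABLE_PAGES` and `CACHE_PAGES` are made up for the example: a 100-page table is scanned repeatedly, front to back, through a 90-page cache.

```python
from collections import OrderedDict

def seqscan_hits(cache, capacity, n_pages, start=0):
    """Read every page of the table once, starting at `start` and wrapping
    around; return how many reads were served from the cache."""
    hits = 0
    for i in range(n_pages):
        page = (start + i) % n_pages
        if page in cache:
            hits += 1
            cache.move_to_end(page)        # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return hits

TABLE_PAGES = 100   # hypothetical table size, slightly larger than the cache
CACHE_PAGES = 90    # hypothetical cache size ("RAM")

cache = OrderedDict()
# Five seqscans, one at a time, each starting at page 0.
hits_per_scan = [seqscan_hits(cache, CACHE_PAGES, TABLE_PAGES) for _ in range(5)]
print(hits_per_scan)  # [0, 0, 0, 0, 0]
```

In this model every scan starts with 90 of the 100 pages cached, yet each page is evicted shortly before the scan reaches it, so no read is ever a hit.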
This can be alleviated by using a large enough sync_scan_offset, but a
single setting like that is tricky to tune, especially if your workload
is not completely constant. Tune it too low and you don't get much
benefit; tune it too high and your scans diverge and you lose all benefit.
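
To make the tuning sensitivity concrete, here is a rough sketch under loudly stated assumptions: a toy strict-LRU cache, scans run one at a time, and a hypothetical `offset` meaning "start the next scan this many pages behind where the previous scan finished, then wrap around". That is only an illustrative stand-in, not the actual definition of sync_scan_offset, and it does not model concurrent scans diverging; `run_scans` and the sizes are invented for the example.

```python
from collections import OrderedDict

TABLE_PAGES = 100   # hypothetical table size
CACHE_PAGES = 90    # hypothetical cache size

def seqscan_hits(cache, start):
    """One full circular scan starting at `start`; returns cache hits (LRU)."""
    hits = 0
    for i in range(TABLE_PAGES):
        page = (start + i) % TABLE_PAGES
        if page in cache:
            hits += 1
            cache.move_to_end(page)        # mark as most recently used
        else:
            if len(cache) >= CACHE_PAGES:
                cache.popitem(last=False)  # evict least recently used page
            cache[page] = True
    return hits

def run_scans(offset, scans=5):
    """Run `scans` sequential scans; each one starts `offset` pages behind
    the last page read by the previous scan (assumed semantics)."""
    cache = OrderedDict()
    start = 0
    results = []
    for _ in range(scans):
        results.append(seqscan_hits(cache, start))
        last_page = (start + TABLE_PAGES - 1) % TABLE_PAGES
        start = (last_page - offset) % TABLE_PAGES
    return results

for offset in (10, 89, 99):
    print(offset, run_scans(offset))
# 10 [0, 11, 11, 11, 11]
# 89 [0, 90, 90, 90, 90]
# 99 [0, 0, 0, 0, 0]
```

In this toy model a small offset recovers only a few cached pages, an offset just under the cache size recovers nearly all of them, and anything larger falls back to zero hits, so the useful setting depends on how much of the table happens to be cached.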
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com