| From: | Jeff Davis <pgsql(at)j-davis(dot)com> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, Hannu Krosing <hannu(at)skype(dot)net>, Luke Lonergan <llonergan(at)greenplum(dot)com>, pgsql-hackers(at)postgresql(dot)org, Eng <eng(at)intranet(dot)greenplum(dot)com> | 
| Subject: | Re: old synchronized scan patch | 
| Date: | 2006-12-07 20:18:44 | 
| Message-ID: | 1165522724.2048.199.camel@dogma.v10.wvs | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Thu, 2006-12-07 at 00:46 -0500, Tom Lane wrote:
> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> > Even if there are 50 in the pack, and 2 trailing, at any point in time
> > it's more likely that the last block number reported was reported by a
> > trailing scan. The pack might all report at once when they finally get
> > the block, but will be promptly overwritten by the continuous stream of
> > reports from trailing scans.
> 
> > However, if my analysis was really true, one might wonder how those
> > scans got that far behind in the first place.
> 
> Yah.  Something I was idly wondering about: suppose we teach ReadBuffer
> to provide an indication whether it had to issue an actual read() or
> found the block in cache?  Could it be useful to not report the block
> location to the hint area if we had to actually read()?  That would
> eliminate the immediate "pack leader" from the equation.  The problem
> is that it seems to break things for the case of the first follower
> joining a seqscan, because the original leader would never report.
> Anyone see the extra idea needed to make this work?
> 
My initial thought is that eliminating the immediate pack leader won't
do much good, because there's a good chance at least one scan is
following very closely (we would hope).
Also, there's a lot of focus here on not starting where the pack leader
is. The assumption is that the follower will be closer to the end of a
cache trail. Let's attack the problem more directly by taking the hint
and subtracting a number before choosing it as the start location.
Let's call the number M, which would be a fraction of the
effective_cache_size. Each scan would issue no reports at all until the
current page minus the starting page number for that scan was greater
than M. If we choose M conservatively, there's a very high chance that
there will exist a continuous stream of cached pages between where the
new scan starts (the hint - M) and the head of the pack.
This keeps the code very simple. I could modify my patch to do this in a
few minutes.
However, I do agree that it's always worth thinking about ways to use
the information that we do have about what's in cache at higher layers;
and also useful if higher layers can tell the caching layers what to
cache or not.
Regards,
	Jeff Davis
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jeff Davis | 2006-12-07 22:00:40 | Re: old synchronized scan patch | 
| Previous Message | Heikki Linnakangas | 2006-12-07 19:32:48 | Dead code in _bt_split? |