From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clock sweep not caching enough B-Tree leaf pages?
Date: 2014-04-16 09:18:52
Message-ID: 20140416091852.GA16358@awork2.anarazel.de
Lists: pgsql-hackers
On 2014-04-16 01:58:23 -0700, Peter Geoghegan wrote:
> On Wed, Apr 16, 2014 at 12:53 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > I think this is unfortunately completely out of the question. For one, a
> > gettimeofday() for every buffer pin will become a significant performance
> > problem. Even the computation of the xact/stmt start/stop timestamps
> > shows up pretty heavily in profiles today - and they are far less
> > frequent than buffer pins. And that's on x86 linux, where gettimeofday()
> > is implemented as something more lightweight than a full syscall.
>
> Come on, Andres. Of course exactly what I've done here is completely
> out of the question as a patch that we can go and commit right now.
> I have numerous caveats about bloating the buffer descriptors, and about
> it being a proof of concept. I'm pretty sure we can come up with a
> scheme to significantly cut down on the number of gettimeofday() calls
> if it comes down to it. In any case, I'm interested in advancing our
> understanding of the problem right now. Let's leave the minutiae to
> one side for the time being.
*I* don't think any scheme that involves measuring the time around
buffer pins is going to be acceptable. It's better that I say that now
rather than when you've invested significant time into the approach, no?
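
For illustration, a trivial standalone microbenchmark along these lines
gives a feel for the per-call cost (nothing from the patch; the numbers
vary by platform and clocksource):

    /* Measure the per-call cost of gettimeofday().  On Linux/x86 the
     * vDSO avoids a full syscall, but at buffer-pin frequency even a
     * few tens of nanoseconds per call adds up quickly. */
    #include <stdio.h>
    #include <sys/time.h>

    int
    main(void)
    {
        struct timeval start, stop, tmp;
        long        iters = 10 * 1000 * 1000;
        double      elapsed;

        gettimeofday(&start, NULL);
        for (long i = 0; i < iters; i++)
            gettimeofday(&tmp, NULL);
        gettimeofday(&stop, NULL);

        elapsed = (stop.tv_sec - start.tv_sec) +
                  (stop.tv_usec - start.tv_usec) / 1e6;
        printf("%.1f ns per call\n", elapsed / iters * 1e9);
        return 0;
    }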
> > The other significant problem I see with this is that it's not adaptive
> > to the actual throughput of buffers in s_b. In many cases there are
> > hundreds of clock cycles through shared buffers in 3 seconds. By only
> > increasing the usagecount that often you've destroyed the little
> > semblance of a working LRU there is right now.
>
> If a usage_count can get to BM_MAX_USAGE_COUNT from its initial
> allocation within an instant, that's bad. It's that simple. Consider
> all the ways in which that can happen almost by accident.
Yes, I agree that that's a problem. The count immediately going down to
zero is a problem as well, though. And that's what will happen in many
scenarios, because you have a time limit on increasing the usagecount,
but none on decreasing it.
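
Schematically, my reading of the approach looks like this standalone
sketch (hypothetical names throughout - ToyBufferDesc, last_increment,
USAGE_INTERVAL_SEC - this is not the patch's actual code). With more
than BM_MAX_USAGE_COUNT sweep passes per interval, the count can only
move toward zero:

    #include <time.h>

    #define BM_MAX_USAGE_COUNT  5   /* as in buf_internals.h */
    #define USAGE_INTERVAL_SEC  3   /* hypothetical rate limit */

    typedef struct
    {
        int         usage_count;
        time_t      last_increment;
    } ToyBufferDesc;

    static void
    toy_pin_buffer(ToyBufferDesc *buf)
    {
        time_t      now = time(NULL);

        /* increments are rate-limited by a timestamp check... */
        if (now - buf->last_increment >= USAGE_INTERVAL_SEC &&
            buf->usage_count < BM_MAX_USAGE_COUNT)
        {
            buf->usage_count++;
            buf->last_increment = now;
        }
    }

    static void
    toy_sweep_visit(ToyBufferDesc *buf)
    {
        /* ...but decrements are not: every sweep pass takes one off */
        if (buf->usage_count > 0)
            buf->usage_count--;
    }

    int
    main(void)
    {
        ToyBufferDesc buf = {0, 0};

        toy_pin_buffer(&buf);   /* first pin: usage_count goes to 1 */
        toy_sweep_visit(&buf);  /* one fast sweep pass takes it back to 0 */
        toy_pin_buffer(&buf);   /* still inside the interval: no increment */
        return buf.usage_count; /* 0 - the buffer is evictable again */
    }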
> > It also wouldn't work well for situations with a fast changing
> > workload >> s_b. If you have frequent queries that take a second or so
> > and access some data repeatedly (index nodes or whatnot), only increasing
> > the usagecount once will mean they'll continually fall back to disk access.
>
> No, it shouldn't, because there is a notion of buffers getting a fair
> chance to prove themselves.
If you have a workload with > (BM_MAX_USAGE_COUNT + 1) clock
cycles/second, how does *any* buffer have a chance to prove itself?
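
(Concretely: with BM_MAX_USAGE_COUNT = 5, six full sweep passes inside
one 3-second window take even a maximally hot buffer from 5 down to 0,
while the time limit allows it at most one increment back in that same
window.)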
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services