From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Scaling shared buffer eviction |
Date: | 2014-09-11 14:15:57 |
Message-ID: | CA+Tgmoatoh6c2vWdNgEDOK7vn4iaCdmLuMjw8sebdL29Wu06CQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Sep 11, 2014 at 10:03 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-09-11 09:48:10 -0400, Robert Haas wrote:
>> On Thu, Sep 11, 2014 at 9:22 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> > I wonder if we should recheck the number of freelist items before
>> > sleeping. As the latch currently is reset before sleeping (IIRC) we
>> > might miss being woken up soon. It very well might be that bgreclaim
>> > needs to run for more than one cycle in a row to keep up...
>>
>> The outer loop in BgMoveBuffersToFreelist() was added to address
>> precisely this point, which I raised in a previous review.
>
> Hm, right. But then let's move BgWriterStats.m_buf_alloc +=,
> ... pgstat_send_bgwriter(); into that loop. Otherwise it'd possibly end
> up being continuously busy without being visible.
Good idea.
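To make the point about visibility concrete, here is a toy model of such an outer loop. It is not the patch's code: run_reclaim_cycle, drains and report are hypothetical names, and the "freelist" is just a counter. The only thing it illustrates is that stats are reported on every pass, so a reclaimer that never gets to sleep still shows up in the counters.

```python
def run_reclaim_cycle(freelist_len, low_mark, high_mark, drains, report):
    """Toy model of a reclaimer's outer loop (hypothetical, not PostgreSQL code).

    freelist_len: number of buffers on the freelist when the reclaimer wakes.
    drains: buffers consumed by backends while each pass runs (0 once exhausted).
    report: callback invoked once per pass with the number of buffers moved.
    Returns the number of passes made before the reclaimer may sleep.
    Assumes high_mark >= low_mark, otherwise the loop would never exit.
    """
    drains = iter(drains)
    passes = 0
    while True:
        # Refill the freelist up to the high water mark.
        moved = high_mark - freelist_len
        freelist_len = high_mark
        passes += 1

        # Report inside the loop, so a long run of back-to-back passes
        # is visible instead of being reported once at the very end.
        report(moved)

        # Backends kept allocating while the pass was running.
        freelist_len = max(0, freelist_len - next(drains, 0))

        # Recheck before sleeping: only stop if the freelist is healthy.
        if freelist_len >= low_mark:
            return passes


if __name__ == "__main__":
    reports = []
    passes = run_reclaim_cycle(5, 20, 100, [90, 90, 10], reports.append)
    print(passes, reports)
```

With heavy draining during the first two passes the loop runs three times before it may sleep, and each pass produces its own report.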
>> I'm not
>> blind to the possibility that the current logic is inadequate, but
>> testing proves that it works well enough to produce a massive
>> performance boost over where we are now.
>
> But, to be honest, the testing so far was pretty "narrow" in the kind of
> workloads that were run if I crossread things accurately. Don't get me
> wrong, I'm *really* happy about having this patch, that just doesn't
> mean every detail is right ;)
Oh, sure. Totally agreed. And, to the extent that we're improving
things based on actual testing, I'm A-OK with that. I just don't want
to start speculating, or we'll never get this thing off the ground.
Some possibly-interesting test cases would be:
(1) A read-only pgbench workload that is just a tiny bit larger than
shared_buffers, say size of shared_buffers plus 0.01%. Such workloads
tend to stress buffer eviction heavily.
(2) A workload that maximizes the rate of concurrent buffer eviction
relative to other tasks. Read-only pgbench is not bad for this, but
maybe somebody's got a better idea.
As I sort of mentioned in what I was writing for the bufmgr README,
there are, more or less, three ways this can fall down, at least that
I can see: (1) if the high water mark is too high, then we'll start
finding buffers in the freelist that have already been touched since
we added them; (2) if the low water mark is too low, the freelist will
run dry; and (3) if the low and high water marks are too close
together, the bgreclaimer will be constantly getting woken up and
going to sleep again. I can't personally think of a workload that
will enable us to get a better handle on those cases than
high-concurrency pgbench, but you're known to be ingenious at coming
up with destruction workloads, so if you have an idea, by all means
fire away.
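As a rough way to reason about those three cases, here is a small simulation sketch. It is purely illustrative and makes several assumptions that are not in the patch: time advances in discrete ticks, backends allocate a fixed number of buffers per tick, the reclaimer refills at a fixed rate, and "how long a buffer sat on the freelist" stands in as a proxy for the chance that it was touched again. All names (simulate, allocs_per_tick, refill_per_tick) are hypothetical.

```python
from collections import deque


def simulate(low_mark, high_mark, allocs_per_tick, refill_per_tick, ticks):
    """Toy freelist model with low/high water marks (not PostgreSQL code)."""
    # Each entry records the tick at which that buffer was put on the freelist.
    freelist = deque([0] * high_mark)
    reclaiming = False
    wakeups = 0          # case (3): how often the reclaimer is woken
    dry_allocations = 0  # case (2): allocations that found the freelist empty
    max_age = 0          # case (1) proxy: longest stay on the freelist

    for tick in range(1, ticks + 1):
        # Backends take buffers from the head of the freelist.
        for _ in range(allocs_per_tick):
            if freelist:
                max_age = max(max_age, tick - freelist.popleft())
            else:
                dry_allocations += 1

        # Wake the reclaimer once we drop below the low water mark.
        if not reclaiming and len(freelist) < low_mark:
            reclaiming = True
            wakeups += 1

        # While awake, refill at a bounded rate until the high water mark.
        if reclaiming:
            for _ in range(refill_per_tick):
                if len(freelist) >= high_mark:
                    break
                freelist.append(tick)
            if len(freelist) >= high_mark:
                reclaiming = False

    return {
        "wakeups": wakeups,
        "dry_allocations": dry_allocations,
        "max_freelist_age": max_age,
    }


if __name__ == "__main__":
    settings = {
        "reasonable": (20, 100),
        "low mark too low": (5, 100),
        "marks too close": (90, 100),
        "high mark too high": (20, 1000),
    }
    for name, (low, high) in settings.items():
        print(name, simulate(low, high, allocs_per_tick=15,
                             refill_per_tick=50, ticks=80))
```

In this model a low mark that is too low shows up as dry allocations, marks that are too close together show up as a wakeup on nearly every tick, and a very high high-water mark shows up as buffers sitting on the freelist much longer before reuse. A real workload would of course be far less regular than this.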
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company