From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Clock sweep not caching enough B-Tree leaf pages?
Date: 2014-04-16 15:28:04
Message-ID: CA+TgmobZ2aoQsuQ4C1au4DacA-fHRg436HAWrgTTB1Xtd_gPkQ@mail.gmail.com
Lists: pgsql-hackers
On Wed, Apr 16, 2014 at 10:49 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> 1. Improving the rate at which we can evict buffers, which is what
>> you're talking about here.
>>
>> 2. Improving the choice of which buffers we evict, which is what
>> Peter's talking about, or at least what I think he's talking about.
>>
>> Those things are both important, but they're different, and I'm not
>> sure that working on one precludes working on the other. There's
>> certainly the potential for overlap, but not necessarily.
>
> I don't think that they necessarily preclude each other
> either. But my gut feeling tells me that it'll be hard to have
> interesting algorithmic improvements on the buffer eviction choice
> because any additional complexity around that will have prohibitively
> high scalability impacts due to the coarse locking.
Doesn't that amount to giving up? I mean, I'm not optimistic about
the particular approach Peter's chosen here being practical for the
reasons that you and I already articulated. But I don't think that
means there *isn't* a viable approach; and I think Peter's test
results demonstrate that the additional complexity of a better
algorithm can more than pay for itself. That's a pretty important
point to keep in mind.
Also, I think the scalability problems around buffer eviction are
eminently solvable, and in particular I'm hopeful that Amit is going
to succeed in solving them. Suppose we have a background process
(whether the background writer or some other) that runs the clock
sweep, identifies good candidates for eviction, and pushes them on a
set of, say, 16 free-lists protected by spinlocks. (The optimal
number of free-lists probably depends on the size of shared_buffers.)
Backends try to reclaim by popping buffers off of one of these
free-lists and double-checking whether the page is still a good
candidate for eviction (i.e. it's still clean and unpinned). If the
free list is running low, they kick the background process via a latch
to make sure it's awake and working to free up more stuff, and if
necessary, advance the clock sweep themselves. This can even be done
by multiple processes at once, if we adopt your idea of advancing the
clock sweep hand by N buffers at a time and then scanning them
afterwards without any global lock. In such a world, it's still not
permissible for reclaim calculations to be super-complex, but you hope
that most of the activity is happening in the background process, so
cycle-shaving becomes less critical.
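The scheme above can be sketched in C. This is a minimal, single-process illustration under assumed names (FreeList, push_candidate, try_reclaim, NUM_FREELISTS, the bgwriter_latch_set flag standing in for SetLatch) that are not PostgreSQL's actual API: a background process pushes eviction candidates onto one of 16 spinlock-protected free-lists, and backends pop a buffer, double-check it is still clean and unpinned, and set the latch when the list runs low.

```c
#include <stdio.h>
#include <stdbool.h>
#include <stdatomic.h>
#include <assert.h>

#define NUM_FREELISTS 16   /* optimal count likely depends on shared_buffers */
#define LIST_CAPACITY 64
#define LOW_WATERMARK 4

typedef struct {
    bool dirty;
    int  pincount;
} Buffer;

typedef struct {
    atomic_flag lock;            /* per-list spinlock */
    int items[LIST_CAPACITY];    /* buffer ids pushed by the sweeper */
    int count;
} FreeList;

static FreeList freelists[NUM_FREELISTS];
static bool bgwriter_latch_set = false;  /* stand-in for SetLatch() */

static void fl_lock(FreeList *fl)   { while (atomic_flag_test_and_set(&fl->lock)) /* spin */; }
static void fl_unlock(FreeList *fl) { atomic_flag_clear(&fl->lock); }

static void freelists_init(void)
{
    for (int i = 0; i < NUM_FREELISTS; i++) {
        atomic_flag_clear(&freelists[i].lock);
        freelists[i].count = 0;
    }
}

/* Background process: after the clock sweep identifies a candidate,
 * push it onto one of the lists (e.g. round-robin by list_no). */
static bool push_candidate(int list_no, int buf_id)
{
    FreeList *fl = &freelists[list_no % NUM_FREELISTS];
    bool ok = false;
    fl_lock(fl);
    if (fl->count < LIST_CAPACITY) {
        fl->items[fl->count++] = buf_id;
        ok = true;
    }
    fl_unlock(fl);
    return ok;
}

/* Backend: pop a candidate and re-check that it is still a good
 * eviction target (clean and unpinned); kick the background process
 * if the list is running low.  Returns -1 if the list is empty, in
 * which case the caller would advance the clock sweep itself. */
static int try_reclaim(int list_no, Buffer *buffers)
{
    FreeList *fl = &freelists[list_no % NUM_FREELISTS];
    for (;;) {
        int buf_id = -1;
        fl_lock(fl);
        if (fl->count > 0)
            buf_id = fl->items[--fl->count];
        if (fl->count < LOW_WATERMARK)
            bgwriter_latch_set = true;   /* "SetLatch": wake the sweeper */
        fl_unlock(fl);
        if (buf_id < 0)
            return -1;
        if (!buffers[buf_id].dirty && buffers[buf_id].pincount == 0)
            return buf_id;               /* still a valid candidate */
        /* buffer was re-dirtied or pinned since the sweep saw it: skip it */
    }
}
```

Because the re-check under the per-list spinlock is just two field tests, backends stay cheap even if the sweep-side candidate selection in the background process grows more sophisticated.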
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company