From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Jim Nasby <jim(at)nasby(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Daniel Farina <daniel(at)heroku(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Should I implement DROP INDEX CONCURRENTLY? |
Date: | 2012-01-04 01:34:34 |
Message-ID: | CA+TgmoYtQ2NpE3d7iavsNn-=SxwZJmF-DjSpkhdcu7i5kBViQg@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Jan 3, 2012 at 7:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jim Nasby <jim(at)nasby(dot)net> writes:
>> Yeah, but the problem we run into is that with every backend trying to run the clock on its own we end up with high contention again... it's just in a different place than when we had a true LRU. The clock sweep might be cheaper than the linked list was, but it's still awfully expensive. I believe our best bet is to have a free list that is actually useful in normal operations, and then optimize the cost of pulling buffers out of that list as much as possible (and let the bgwriter deal with keeping enough pages in that list to satisfy demand).
>
> Well, maybe, but I think the historical evidence suggests that that
> approach will be a loser, simply because no matter how cheap, the
> freelist will remain a centralized and heavily contended data structure.
> IMO we need to be looking for a mechanism that has no single point of
> contention, and modifying the clock sweep rules looks like the best way
> to get there.
>
> Still, he who wants to do the work can try whatever approach he feels
> like.
It might be possible to partition the free list. So imagine, for
example, 8 free lists. The background writer runs the clock sweep and
finds some buffers that are about ready to be reallocated and
distributes one-eighth of them to each free list. Then, when a
backend wants to allocate a buffer, it picks a free list somehow
(round robin?) and pulls a buffer off it. If the buffer turns out to
have been used since it was added to the free list, we give up on it
and try again. This hopefully shouldn't happen too often, though, as
long as we only add enough buffers to the free list to satisfy the
requests that we expect to get over the next
small-fraction-of-a-second.
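To make the scheme concrete, here is a minimal single-process sketch of it. It is an illustration only, not PostgreSQL code: the names `Buffer`, `PartitionedFreeList`, `distribute`, and `allocate` are hypothetical, Python stands in for the backend's C, locking is left out entirely, and a per-buffer `usage_count` snapshot is just one assumed way of detecting that a buffer was used after it was listed. An empty list simply returns `None` here.

```python
from collections import deque
from dataclasses import dataclass

NUM_FREE_LISTS = 8  # the "8 free lists" from the example


@dataclass
class Buffer:
    buf_id: int
    usage_count: int = 0  # assumed to be bumped whenever a backend uses the buffer


class PartitionedFreeList:
    def __init__(self, num_lists=NUM_FREE_LISTS):
        self.lists = [deque() for _ in range(num_lists)]
        self._next_fill = 0  # next list the background writer appends to
        self._next_pick = 0  # next list a backend pulls from (round robin)

    def distribute(self, candidates):
        """Background-writer side: spread candidate buffers evenly over the lists."""
        for buf in candidates:
            # Snapshot the usage count so a later use can be detected at allocation.
            self.lists[self._next_fill].append((buf, buf.usage_count))
            self._next_fill = (self._next_fill + 1) % len(self.lists)

    def allocate(self):
        """Backend side: pick a list round robin and pull a buffer off it.

        A buffer that was used since it was listed is dropped and we try again.
        Returns None if the chosen list is empty.
        """
        while True:
            free_list = self.lists[self._next_pick]
            self._next_pick = (self._next_pick + 1) % len(self.lists)
            if not free_list:
                return None
            buf, listed_count = free_list.popleft()
            if buf.usage_count == listed_count:
                return buf
            # Used since it was listed: give up on this one and try again.
```

With 16 candidates, each of the 8 lists receives 2 buffers. If buffer 0 is touched after being listed, the first `allocate()` call discards it and returns buffer 1 from the next list.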
Of course you have to think about what happens if you find that your
chosen free list is empty. In that case you probably want to try a
different one and, in the worst case where they're all empty, run
the clock sweep in the foreground. You probably also want to kick the
background writer in the pants if even a single one is empty (or
nearly empty?) and tell it to hurry up and add some more buffers to
the free lists.
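The fallback path might look roughly like the sketch below. Again this is a hedged, single-process illustration with hypothetical names (`BufferPool`, `wake_bgwriter`, `LOW_WATER`); the "nearly empty" threshold of one buffer is an assumption, the wakeup is modelled as a counter rather than a real signal, and pinned buffers, locking, and the used-since-listed check are omitted to keep it short.

```python
from collections import deque

LOW_WATER = 1  # assumed threshold for "nearly empty"


class BufferPool:
    def __init__(self, num_buffers, num_lists=8):
        self.usage = [0] * num_buffers  # per-buffer usage counts
        self.free_lists = [deque() for _ in range(num_lists)]
        self.clock_hand = 0
        self.next_pick = 0
        self.bgwriter_wakeups = 0  # stand-in for signalling the background writer

    def wake_bgwriter(self):
        self.bgwriter_wakeups += 1

    def clock_sweep(self):
        """Foreground fallback: advance the hand, decrementing usage counts,
        until a buffer with a zero count is found."""
        while True:
            victim = self.clock_hand
            self.clock_hand = (self.clock_hand + 1) % len(self.usage)
            if self.usage[victim] == 0:
                return victim
            self.usage[victim] -= 1

    def allocate(self):
        n = len(self.free_lists)
        start = self.next_pick
        self.next_pick = (start + 1) % n
        woke = False
        for offset in range(n):
            free_list = self.free_lists[(start + offset) % n]
            # Empty or nearly empty: ask the background writer for a refill,
            # at most once per allocation.
            if len(free_list) <= LOW_WATER and not woke:
                self.wake_bgwriter()
                woke = True
            if free_list:
                return free_list.popleft()
        # Every free list was empty: run the clock sweep ourselves.
        return self.clock_sweep()
```

The foreground sweep is only reached after every list has been tried, so in the common case a backend touches one list and never moves the clock hand itself.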
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company