Re: Removing freelist (was Re: Should I implement DROP INDEX CONCURRENTLY?)

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <jim(at)nasby(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Daniel Farina <daniel(at)heroku(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Removing freelist (was Re: Should I implement DROP INDEX CONCURRENTLY?)
Date: 2012-01-24 15:04:41
Message-ID: CAMkU=1zrHLMUrdr7E9UT1nFyR_eXannJPz3T3vnN+6UBx+CEJw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 23, 2012 at 5:06 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Jan 23, 2012 at 12:12 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> On Sat, Jan 21, 2012 at 5:29 PM, Jim Nasby <jim(at)nasby(dot)net> wrote:
>>>> We should also look at having the freelist do something useful, instead of just dropping it completely. Unfortunately that's probably more work...
>>
>>> That's kinda my feeling as well.  The free list in its current form is
>>> pretty much useless, but I don't think we'll save much by getting rid
>>> of it, because that's just a single test.  The expensive part of what
>>> we do while holding BufFreelistLock is, I think, iterating through
>>> buffers taking and releasing a spinlock on each one (!).
>>
>> Yeah ... spinlocks that, by definition, will be uncontested.
>
> What makes you think that they are uncontested?

It is automatically partitioned over 131,072 spinlocks (one per buffer)
if shared_buffers=1GB, so the chances of contention seem pretty low.
Even if a few very hot index root blocks are heavily contested, that
would only be encountered a few times out of the 131,072. I guess we
could count them, similar to your LWLOCK_STATS changes.
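
For reference, the 131,072 figure is just shared_buffers divided by the block size. A minimal sketch of that arithmetic, assuming the default 8 kB block size and one spinlock per buffer header; the function names are made up for illustration:

```python
# Assumption: default PostgreSQL block size of 8 kB (BLCKSZ = 8192).
BLCKSZ = 8 * 1024


def buffer_count(shared_buffers_bytes: int, block_size: int = BLCKSZ) -> int:
    """Number of shared buffers, and so of per-buffer header spinlocks."""
    return shared_buffers_bytes // block_size


def hot_fraction(hot_buffers: int, total_buffers: int) -> float:
    """Fraction of sweep visits that land on a 'hot' (contested) buffer,
    assuming the sweep visits every buffer equally often."""
    return hot_buffers / total_buffers


n = buffer_count(1024 ** 3)   # shared_buffers = 1GB
print(n)                      # number of buffers and spinlocks
print(hot_fraction(3, n))     # e.g. three hot index root blocks (hypothetical)
```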

> Or for that matter,
> that even an uncontested spinlock operation is cheap enough to do
> while holding a badly contended LWLock?

Is the concern that getting an uncontested spinlock takes lots of
instructions, or that the associated fences are expensive?

>
>> So I think
>> it would be advisable to prove rather than just assume that that's a
>> problem.
>
> It's pretty trivial to prove that there is a very serious problem with
> BufFreelistLock.  I'll admit I can't prove what the right fix is just
> yet, and certainly measurement is warranted.

From my own analysis and experiments, the mere act of getting the
BufFreelistLock when it is contended is far more important than
anything actually done while holding that lock. When the clock sweep
runs often enough that the lock is contended, it also runs often
enough to keep the average usagecount low, so in my tests the average
number of buffers visited during a clock sweep before finding a usable
one was around 2.0. YMMV.
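
To make "buffers visited per sweep" concrete, here is a toy model of a clock sweep. It is only a sketch under simplifying assumptions: no pinned buffers, no locking, and a plain list of usagecounts. It is not the actual PostgreSQL code, and the names are made up.

```python
from typing import List, Tuple


def clock_sweep(usage: List[int], hand: int) -> Tuple[int, int, int]:
    """Run one sweep and return (victim, visited, new_hand).

    Toy model: each visited buffer with usagecount > 0 is decremented and
    skipped; the first buffer found with usagecount == 0 is the victim.
    """
    if not usage:
        raise ValueError("need at least one buffer")
    n = len(usage)
    visited = 0
    while True:
        idx = hand
        hand = (hand + 1) % n   # advance the clock hand
        visited += 1
        if usage[idx] == 0:
            return idx, visited, hand
        usage[idx] -= 1         # give the buffer one fewer "life"


usage = [2, 1, 0, 3]            # made-up usagecounts
victim, visited, hand = clock_sweep(usage, hand=0)
print(victim, visited, usage)
```

In this model, the lower the usagecounts are on average, the fewer buffers a sweep has to step over before it finds a victim.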

Can we get some people who run busy high-CPU servers that churn the
cache, and who think they may be getting hit with this problem, to
post the results of this query:

select usagecount, count(*) from pg_buffercache group by usagecount;
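
If it helps when comparing results, the rows from that query can be boiled down to a single average usagecount. A rough sketch, assuming the rows arrive as (usagecount, count) pairs and that unused buffers show up with a NULL usagecount; the helper name is made up:

```python
from typing import Iterable, Optional, Tuple


def mean_usagecount(rows: Iterable[Tuple[Optional[int], int]]) -> float:
    """Weighted mean usagecount over buffers that are in use.

    rows: (usagecount, count) pairs as returned by the query above.
    Rows with usagecount None (unused buffers) are skipped.
    """
    total = 0
    weighted = 0
    for usagecount, count in rows:
        if usagecount is None:
            continue
        total += count
        weighted += usagecount * count
    if total == 0:
        raise ValueError("no buffers in use")
    return weighted / total
```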

Cheers,

Jeff
