Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, ik(at)postgresql-consulting(dot)com, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
Date: 2014-10-07 14:30:54
Message-ID: CA+TgmoaCntiPsVyUJNhB4kMmWz8Q-JJuKkK5OW1yobxNagiO_g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 7, 2014 at 10:12 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Have you tried/considered putting the counters into a per-backend array
> somewhere in shared memory? That way they don't blow up the size of
> frequently ping-ponged cachelines. Then you can summarize those values
> whenever querying the results.

The problem with that is that you need O(N*M) memory instead of O(N),
where N is the number of lwlocks and M is the number of backends.
That gets painful in a hurry. We just got rid of something like that
with your patch to get rid of all the backend-local buffer pin arrays;
I'm not keen to add another such thing right back. It might be viable
if we excluded the buffer locks, but that also detracts from the
value. Plus, no matter how you slice it, you're now touching cache
lines completely unrelated to the ones you need for the foreground
work. That's got a distributed overhead that's hard to measure, but
it is certainly going to knock other stuff out of the CPU caches to
some degree.
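
To make the trade-off concrete, here is a minimal sketch (my own
illustration, not code from either patch; all names are hypothetical)
of the layout Andres is suggesting: one counter struct per
(backend, lwlock) pair in shared memory, summed across backends at
query time. The size function is where the O(N*M) term shows up:

/*
 * Hypothetical per-backend lwlock counters kept in shared memory,
 * written only by the owning backend and summed by a monitoring view.
 */
#include <stddef.h>
#include <stdint.h>

typedef struct LWLockCounters
{
    uint64_t    sh_acquire;     /* shared acquisitions */
    uint64_t    ex_acquire;     /* exclusive acquisitions */
    uint64_t    block_count;    /* times the backend had to sleep */
} LWLockCounters;

/* One flat shared-memory array, indexed as [backend][lock]. */
static LWLockCounters *counters;    /* num_backends * num_lwlocks entries */

static inline LWLockCounters *
my_counters(int backend_id, int num_lwlocks, int lock_id)
{
    return &counters[(size_t) backend_id * num_lwlocks + lock_id];
}

/* Shared-memory size: the O(N*M) term objected to above. */
static size_t
counters_shmem_size(int num_lwlocks, int num_backends)
{
    return (size_t) num_lwlocks * num_backends * sizeof(LWLockCounters);
}

/*
 * A view would sum over the backend dimension at query time:
 * total[lock] = sum over b of counters[b][lock], trading that summation
 * cost for keeping the hot path on backend-private cache lines.
 */

Back-of-the-envelope (my arithmetic, not a figure from the thread):
with a few gigabytes of shared_buffers, the buffer lwlocks alone are on
the order of a million entries, and at 24 bytes per entry times a few
hundred backends the array runs to several gigabytes, which is why
excluding the buffer locks comes up at all.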

>> As a further point, when I study the LWLOCK_STATS output, that stuff
>> is typically what I'm looking for anyway. The first few times I ran
>> with that enabled, I was kind of interested by the total lock counts
>> ... but that quickly got uninteresting. The blocking and spindelays
>> show you where the problems are, so that's the interesting part.
>
> I don't really agree with this. Especially with shared locks (even more
> so if/when the LW_SHARED stuff gets in), there's simply no relevant
> blocking and spindelay.

If your patch to implement lwlocks using atomics goes in, then we may
have to reassess what instrumentation is actually useful here. I can
only comment on the usefulness of various bits of instrumentation I
have used in the past on the code bases we had at the time, or on my
patches against them. Nobody here can reasonably be expected to know
whether the same stuff will still be useful after possible future
patches that are not even in a reviewable state at present have been
committed.

Having said that, if there's no blocking or spindelay any more, to me
that doesn't mean we should look for some other measure of contention
instead. It just means that the whole area is a solved problem, we
don't need to measure contention any more because there isn't any, and
we can move on to other issues once we finish partying. But I'm mildly
skeptical that the outcome will be as good as all that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
