Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: ik(at)postgresql-consulting(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
Date: 2014-10-03 21:15:13
Message-ID: 20141003211513.GV7158@awork2.anarazel.de
Lists: pgsql-hackers

On 2014-10-03 11:33:18 -0400, Bruce Momjian wrote:
> On Thu, Oct 2, 2014 at 11:50:14AM +0200, Andres Freund wrote:
> > The first problem that comes to my mind about collecting enough data is
> > that we have a very large number of lwlocks (fixed_number + 2 *
> > shared_buffers). One 'trivial' way of implementing this is to have a per
> > backend array collecting the information, and then a shared one
> > accumulating data from it over time. But I'm afraid that's not going to
> > fly :(. Hm. With the above sets of stats that'd be ~50MB per backend...
> >
> > Perhaps we should somehow encode this differently for individual lwlock
> > tranches? It's far less problematic to collect all this information for
> > all but the buffer lwlocks...
>
> First, I think this could be a major Postgres feature, and I am excited
> someone is working on this.
>
> As far as gathering data, I don't think we are going to do any better in
> terms of performance/simplicity/reliability than to have a single PGPROC
> entry to record when we enter/exit a lock, and having a secondary
> process scan the PGPROC array periodically.

I don't think that'll give meaningful results given the very short times
most lwlocks are held. And it won't give very interesting results for
multiple lwlocks held at the same time: most of the time the 'earlier'
held ones are more interesting than the last acquired one...
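
To put a rough number on the sampling concern, here is a minimal sketch. The
helper names and every input value are illustrative assumptions, not
measurements and not PostgreSQL code. It estimates how many samples of a
single "currently held lock" slot would catch a lock that is taken often
but held only briefly.

```c
/*
 * Fraction of wall-clock time one backend holds a lock, given an
 * acquisition rate and a mean hold time. Inputs are assumed values.
 */
static double
held_fraction(double acquisitions_per_sec, double hold_ns)
{
    double f = acquisitions_per_sec * hold_ns / 1e9;

    return f > 1.0 ? 1.0 : f;   /* cannot be held more than 100% of the time */
}

/*
 * Expected number of samples that find the lock held when a secondary
 * process polls at sample_hz for the given number of seconds.
 */
static double
expected_hits(double sample_hz, double seconds,
              double acquisitions_per_sec, double hold_ns)
{
    return sample_hz * seconds * held_fraction(acquisitions_per_sec, hold_ns);
}
```

With assumed values of 10,000 acquisitions per second, 100 ns per hold, and
a 100 Hz sampler running for 10 seconds, expected_hits(100, 10, 10000, 100)
comes out to about one hit in 1,000 samples, although the lock was taken
100,000 times during that window.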

> I am assuming almost no one cares about the number of locks, but rather
> they care about cumulative lock durations.

I actually don't think that's true. Every lock acquisition implies a
number of atomic operations. Those are expensive. And if you see individual
locks acquired a high number of times in multiple processes, that's
something important. It causes significant bus traffic between sockets,
while not necessarily being visible in the lock hold times.
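
A small sketch of why the two metrics can disagree. LockStats and both
helpers are hypothetical names made up for illustration, not PostgreSQL's
actual structures, and timestamps are passed in explicitly so the example
is deterministic.

```c
#include <stdint.h>

/* Hypothetical per-lock counters; not PostgreSQL's actual data structures. */
typedef struct LockStats
{
    uint64_t    acquire_count;  /* number of acquisitions */
    uint64_t    held_ns;        /* cumulative hold time in nanoseconds */
} LockStats;

/* Record one acquire/release pair given explicit timestamps. */
static void
lock_stats_record(LockStats *s, uint64_t acquired_ns, uint64_t released_ns)
{
    s->acquire_count++;
    s->held_ns += released_ns - acquired_ns;
}

/* Replay n back-to-back acquisitions that each hold the lock for hold_ns. */
static void
lock_stats_replay(LockStats *s, uint64_t n, uint64_t hold_ns)
{
    uint64_t    now = 0;

    for (uint64_t i = 0; i < n; i++)
    {
        lock_stats_record(s, now, now + hold_ns);
        now += hold_ns;
    }
}
```

Replaying 1,000,000 acquisitions of 50 ns each on one lock gives 50 ms of
cumulative hold time, while 10 acquisitions of 10 ms each on another give
100 ms. Ranked by duration alone the second lock looks worse, yet the first
one was acquired 100,000 times as often, and each of those acquisitions
carries the atomic-operation cost that the duration figure does not show.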

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
