From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, ik(at)postgresql-consulting(dot)com, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
Date: 2014-10-07 14:12:28
Message-ID: 20141007141228.GF22022@awork2.anarazel.de
Lists: pgsql-hackers
On 2014-10-07 10:04:38 -0400, Robert Haas wrote:
> On Tue, Oct 7, 2014 at 8:03 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > On Fri, Oct 3, 2014 at 06:06:24PM -0400, Bruce Momjian wrote:
> >> > I actually don't think that's true. Every lock acquisition implies a
> >> > number of atomic operations. Those are expensive. And if you see
> >> > individual locks acquired a high number of times in multiple processes,
> >> > that's something important. It causes significant bus traffic between
> >> > sockets, while not necessarily visible in the lock held times.
> >>
> >> True, but I don't think users are going to get much value from those
> >> numbers, and they are hard to get. Server developers might want to know
> >> lock counts, but in those cases performance might not be as important.
> >
> > In summary, I think there are three measurements we can take on locks:
> >
> > 1. lock wait, from request to acquisition
> > 2. lock duration, from acquisition to release
> > 3. lock count
> >
> > I think #1 is the most useful, and can be tracked by scanning a single
> > PGPROC lock entry per session (as already outlined), because you can't
> > wait on more than one lock at a time.
> >
> > #2 would probably require multiple PGPROC lock entries, though I am
> > unclear how often a session holds multiple light-weight locks
> > concurrently. #3 might require global counters in memory.
> >
> > #1 seems the most useful from a user perspective, and we can perhaps
> > experiment with #2 and #3 once that is done.
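(Purely for illustration, a minimal standalone sketch of that per-session
sampling idea, using hypothetical names and sizes rather than actual
PGPROC/PostgreSQL code: each backend publishes at most one "currently
waiting on" lwlock id, and a monitoring process scans the slots.)

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_BACKENDS 64          /* hypothetical; sized per max_connections */
#define NO_LWLOCK    (-1)

typedef struct BackendWaitSlot
{
    _Atomic int32_t waiting_lwlock_id;  /* NO_LWLOCK when not waiting */
} BackendWaitSlot;

static BackendWaitSlot wait_slots[MAX_BACKENDS];  /* would live in shmem */

/* At startup, mark every slot as "not waiting". */
static void
init_wait_slots(void)
{
    for (int i = 0; i < MAX_BACKENDS; i++)
        atomic_store(&wait_slots[i].waiting_lwlock_id, NO_LWLOCK);
}

/* Backend records the lock it is about to block on; cleared once acquired.
 * A backend can wait on at most one lwlock, so one slot per backend suffices. */
static void
report_wait_start(int backend_id, int32_t lwlock_id)
{
    atomic_store_explicit(&wait_slots[backend_id].waiting_lwlock_id,
                          lwlock_id, memory_order_relaxed);
}

static void
report_wait_end(int backend_id)
{
    atomic_store_explicit(&wait_slots[backend_id].waiting_lwlock_id,
                          NO_LWLOCK, memory_order_relaxed);
}

/* Monitoring side: sample all slots, e.g. to feed a stats view. */
static void
sample_waits(void)
{
    for (int i = 0; i < MAX_BACKENDS; i++)
    {
        int32_t id = atomic_load_explicit(&wait_slots[i].waiting_lwlock_id,
                                          memory_order_relaxed);
        if (id != NO_LWLOCK)
            printf("backend %d waiting on lwlock %d\n", i, id);
    }
}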
>
> I agree with some of your thoughts on this, Bruce, but there are some
> points I'm not so sure about.
>
> I have a feeling that any system that involves repeatedly scanning the
> procarray will either have painful performance impact (if it's
> frequent) or catch only a statistically insignificant fraction of lock
> acquisitions (if it's infrequent).
Agreed.
> I think the easiest way to measure lwlock contention would be to put
> some counters in the lwlock itself. My guess, based on a lot of
> fiddling with LWLOCK_STATS over the years, is that there's no way to
> count lock acquisitions and releases without harming performance
> significantly - no matter where we put the counters, it's just going
> to be too expensive. However, I believe that incrementing a counter -
> even in the lwlock itself - might not be too expensive if we only do
> it when (1) a process goes to sleep or (2) spindelays occur.
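For illustration, a standalone sketch of that scheme with hypothetical
names (not the real LWLock struct): the counters live in the lock itself,
but the fast path never touches them; only the sleep and spindelay paths do.

#include <stdint.h>

typedef struct SketchLWLock
{
    /* ... the existing lock state would live here ... */
    uint64_t block_count;      /* times a waiter had to go to sleep */
    uint64_t spindelay_count;  /* times the internal spinlock had to delay */
} SketchLWLock;

/* Fast path: an uncontended acquisition touches no counter at all. */

/* Slow path: only reached when the caller is about to sleep, so the extra
 * store is amortized against the much larger cost of sleeping. */
static void
note_block(SketchLWLock *lock)
{
    lock->block_count++;
}

static void
note_spindelay(SketchLWLock *lock)
{
    lock->spindelay_count++;
}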
Increasing the size will be painful on its own :(.
Have you tried/considered putting the counters into a per-backend array
somewhere in shared memory? That way they don't blow up the size of
frequently ping-ponged cachelines. Then you can summarize those values
whenever querying the results.
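Roughly like the following standalone sketch (hypothetical names and
sizes, not actual shared-memory allocation code): each backend increments
only its own row, so the counters never bounce between sockets, and only
the monitoring query pays the cost of summing.

#include <stdint.h>

#define MAX_BACKENDS  64
#define NUM_LWLOCKS  128

/* One row of counters per backend, kept apart from the locks themselves. */
typedef struct BackendLWLockStats
{
    uint64_t block_count[NUM_LWLOCKS];
    uint64_t spindelay_count[NUM_LWLOCKS];
} BackendLWLockStats;

static BackendLWLockStats stats[MAX_BACKENDS];  /* would live in shmem */

/* Hot path: a backend only ever writes its own row. */
static inline void
count_block(int backend_id, int lwlock_id)
{
    stats[backend_id].block_count[lwlock_id]++;
}

/* Query path: summarize across all backends when the view is read. */
static uint64_t
total_blocks(int lwlock_id)
{
    uint64_t sum = 0;
    for (int i = 0; i < MAX_BACKENDS; i++)
        sum += stats[i].block_count[lwlock_id];
    return sum;
}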
> As a further point, when I study the LWLOCK_STATS output, that stuff
> is typically what I'm looking for anyway. The first few times I ran
> with that enabled, I was kind of interested by the total lock counts
> ... but that quickly got uninteresting. The blocking and spindelays
> show you where the problems are, so that's the interesting part.
I don't really agree with this. Especially with shared locks (even more
so if/when the LW_SHARED stuff gets in), there's simply no relevant
blocking and spindelay.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services