From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, <ik(at)postgresql-consulting(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
Date: 2014-10-07 14:22:18
Message-ID: 5433F71A.9050901@vmware.com
Lists: pgsql-hackers
On 10/07/2014 05:04 PM, Robert Haas wrote:
> On Tue, Oct 7, 2014 at 8:03 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> On Fri, Oct 3, 2014 at 06:06:24PM -0400, Bruce Momjian wrote:
>>>> I actually don't think that's true. Every lock acquisition implies a
>>>> number of atomic operations. Those are expensive. And if you see
>>>> individual locks acquired a high number of times in multiple processes,
>>>> that's something important. It causes significant bus traffic between
>>>> sockets, while not necessarily being visible in the lock held times.
>>>
>>> True, but I don't think users are going to get much value from those
>>> numbers, and they are hard to get. Server developers might want to know
>>> lock counts, but in those cases performance might not be as important.
>>
>> In summary, I think there are three measurements we can take on locks:
>>
>> 1. lock wait, from request to acquisition
>> 2. lock duration, from acquisition to release
>> 3. lock count
>>
>> I think #1 is the most useful, and can be tracked by scanning a single
>> PGPROC lock entry per session (as already outlined), because you can't
>> wait on more than one lock at a time.
>>
>> #2 would probably require multiple PGPROC lock entries, though I am
>> unclear how often a session holds multiple light-weight locks
>> concurrently. #3 might require global counters in memory.
>>
>> #1 seems the most useful from a user perspective, and we can perhaps
>> experiment with #2 and #3 once that is done.
>
> I agree with some of your thoughts on this, Bruce, but there are some
> points I'm not so sure about.
>
> I have a feeling that any system that involves repeatedly scanning the
> procarray will either have painful performance impact (if it's
> frequent) or catch only a statistically insignificant fraction of lock
> acquisitions (if it's infrequent). The reason I think there may be a
> performance impact is because quite a number of heavily-trafficked
> shared memory structures are bottlenecked on memory latency, so it's
> easy to imagine that having an additional process periodically reading
> them would increase cache-line bouncing and hurt performance. We will
> probably need some experimentation to find the best idea.
>
> I think the easiest way to measure lwlock contention would be to put
> some counters in the lwlock itself. My guess, based on a lot of
> fiddling with LWLOCK_STATS over the years, is that there's no way to
> count lock acquisitions and releases without harming performance
> significantly - no matter where we put the counters, it's just going
> to be too expensive. However, I believe that incrementing a counter -
> even in the lwlock itself - might not be too expensive if we only do
> it when (1) a process goes to sleep or (2) spindelays occur. Those
> operations are expensive enough that I think the cost of an extra
> shared memory access won't be too significant.
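[As a standalone illustration of the only-count-the-expensive-paths idea above: the struct layout and function names below are invented for the sketch, not PostgreSQL's actual LWLock definition. The uncontended fast path does no bookkeeping at all; the counters are touched only on the paths that already involve a sleep or a spin delay.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative lock with embedded contention counters. */
typedef struct LWLockSketch
{
    bool        held;
    uint64_t    sleep_count;     /* bumped when a waiter goes to sleep */
    uint64_t    spindelay_count; /* bumped when the spinlock had to delay */
} LWLockSketch;

/* Fast path: lock is free, acquire with no counter updates. */
static bool
lwlock_try_acquire(LWLockSketch *lock)
{
    if (!lock->held)
    {
        lock->held = true;
        return true;
    }
    return false;
}

/* Slow paths: we are about to sleep (or have spun), so one extra
 * shared-memory write is cheap relative to the operation itself. */
static void
lwlock_record_sleep(LWLockSketch *lock)
{
    lock->sleep_count++;
}

static void
lwlock_record_spindelay(LWLockSketch *lock)
{
    lock->spindelay_count++;
}
```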
FWIW, I liked Ilya's design. Before going to sleep, store the lock ID in
shared memory. When you wake up, clear it. That should be cheap enough
to have it always enabled. And it can easily be extended to other
"waits", e.g. when you're waiting for input from a client.
I don't think counting the number of lock acquisitions is that
interesting. It doesn't give you any information on how long the waits
were, for example. I think the question the user or DBA is trying to
answer is "Why is this query taking so long, even though the CPU is
sitting idle?". A sampling approach works well for that.
For comparison, the "perf" tool works great for figuring out where the
CPU time is spent in a process. It works by sampling. This is similar,
but for wall-clock time, and hopefully we can produce more
user-friendly output.
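[To illustrate why sampling answers the "why is the CPU idle" question: if a sampler records each backend's advertised wait state at regular intervals, the sample counts approximate the fraction of wall-clock time spent in each kind of wait. The wait-state names below are invented for the sketch.]

```c
#include <assert.h>
#include <stdint.h>

#define MAX_WAIT_KINDS 4

/* Possible wait states a backend might advertise (illustrative). */
enum { WAIT_NONE = 0, WAIT_LWLOCK = 1, WAIT_CLIENT_READ = 2, WAIT_IO = 3 };

static uint64_t sample_counts[MAX_WAIT_KINDS];

/* Called at each sampling tick with the state observed in shared memory. */
static void
record_sample(int wait_kind)
{
    sample_counts[wait_kind]++;
}

/* Approximate percentage of wall-clock time in a given wait state:
 * with evenly spaced samples, count ratios converge on time ratios. */
static int
wait_percent(int wait_kind)
{
    uint64_t total = 0;
    for (int i = 0; i < MAX_WAIT_KINDS; i++)
        total += sample_counts[i];
    return total ? (int) (sample_counts[wait_kind] * 100 / total) : 0;
}
```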
- Heikki