From: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PATCH] Identify LWLocks in tracepoints |
Date: | 2021-04-13 14:48:02 |
Message-ID: | CAGRY4nxkU1HksXnRA9g+SmQ82xaX-an+Ax+yfkTi8m3DiHfFZQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, 13 Apr 2021 at 21:40, Craig Ringer
<craig(dot)ringer(at)enterprisedb(dot)com> wrote:
> Findings:
>
> * A probe without arguments or with simple arguments is just a 'nop' instruction
> * Probes that require function calls, pointer chasing, other
> expression evaluation etc may impose a fixed cost to collect up
> arguments even if the probe is disabled.
> * SDT semaphores can avoid that cost but add a branch, so should
> probably be avoided unless preparing probe arguments is likely to be
> expensive.
Back to the topic directly at hand.
Attached a disassembly of what LWLockAcquire looks like now on my
current build of git master @ 5fe83adad9efd5e3929f0465b44e786dc23c7b55
. This is an --enable-debug --enable-cassert --enable-dtrace build
with -Og -ggdb3.
The three tracepoints in it are all of the form:
movzwl 0x0(%r13),%edi
call 0x801c49 <GetLWTrancheName>
nop
so it's clear we're doing redundant calls to GetLWTrancheName(), as
expected. Not ideal.
Now if I patch it to add the _ENABLED() guards on all the tracepoints,
the probes look like this:
0x0000000000803176 <+200>: cmpw $0x0,0x462da8(%rip) #
0xc65f26 <postgresql_lwlock__acquire_semaphore>
0x000000000080317e <+208>: jne 0x80328b <LWLockAcquire+477>
.... other interleaved code ...
0x000000000080328b <+477>: movzwl 0x0(%r13),%edi
0x0000000000803290 <+482>: call 0x801c49 <GetLWTrancheName>
0x0000000000803295 <+487>: nop
0x0000000000803296 <+488>: jmp 0x803184 <LWLockAcquire+214>
so we avoid the GetLWTrancheName() call at the cost of a test and
possible branch, and a small expansion in total function size. Without
the semaphores, LWLockAcquire is 463 bytes. With them, it's 524 bytes,
which is nothing to sneeze at for code that doesn't do anything
99.999% of the time, but we avoid a bunch of GetLWTrancheName() calls.
If I instead replace T_NAME() with NULL for all tracepoints in
LWLockAcquire, the disassembly shows that the tracepoints now become a
simple
0x0000000000803176 <+200>: nop
which is pretty hard to be concerned about.
So at the very least we should be calling GetLWTrancheName() once at
the start of the function if built with dtrace support and remembering
the value, instead of calling it for each tracepoint.
And omitting the tranche name looks like it might be sensible for the
LWLock code. In most places it won't matter, but LWLocks are hot
enough that it possibly might. A simple pg_regress run hits
LWLockAcquire 25 million times, so that's 75 million calls to
GetLWTrancheName().
From | Date | Subject | |
---|---|---|---|
Next Message | Fujii Masao | 2021-04-13 15:00:14 | Re: TRUNCATE on foreign table |
Previous Message | Tom Lane | 2021-04-13 14:45:28 | Re: Old Postgresql version on i7-1165g7 |