Re: BUG #13493: pl/pgsql doesn't scale with cpus (PG9.3, 9.4)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: grb(at)skogoglandskap(dot)no, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #13493: pl/pgsql doesn't scale with cpus (PG9.3, 9.4)
Date: 2015-07-08 13:22:12
Message-ID: 20150708132212.GM10242@alap3.anarazel.de
Lists: pgsql-bugs

On 2015-07-08 14:55:12 +0200, Andres Freund wrote:
> comparing stats between the 4 and 8 client runs shows (removing boring data):

> 4 clients:
> 109,655,985,749 stalled-cycles-frontend # 54.27% frontend cycles idle (27.79%)
> 41,664,400,828 branches # 458.558 M/sec (33.32%)
> 374,426,805 branch-misses # 0.90% of all branches (33.32%)

> 8 clients:
> 247,231,674,170 stalled-cycles-frontend # 67.04% frontend cycles idle (27.84%)
> 54,503,992,461 branches # 329.991 M/sec (33.39%)
> 761,911,056 branch-misses # 1.40% of all branches (33.38%)
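As a quick cross-check of the percentages perf prints, the ratios can be recomputed from the raw counters quoted above. This is only a sketch: the counter values are copied from the output, while the dictionary layout and the helper names (`miss_rate_pct`, `growth`) are made up for illustration.

```python
# Counter values copied from the quoted perf stat output.
# The structure and helper names are illustrative, not part of perf.
RUNS = {
    4: {
        "stalled_frontend": 109_655_985_749,
        "branches": 41_664_400_828,
        "branch_misses": 374_426_805,
    },
    8: {
        "stalled_frontend": 247_231_674_170,
        "branches": 54_503_992_461,
        "branch_misses": 761_911_056,
    },
}


def miss_rate_pct(run):
    """Branch misses as a percentage of all branches for one run."""
    return 100.0 * run["branch_misses"] / run["branches"]


def growth(counter):
    """Factor by which a counter grows going from 4 to 8 clients."""
    return RUNS[8][counter] / RUNS[4][counter]


if __name__ == "__main__":
    for clients, run in RUNS.items():
        print(f"{clients} clients: {miss_rate_pct(run):.2f}% branch misses")
    for counter in ("branches", "branch_misses", "stalled_frontend"):
        print(f"{counter}: x{growth(counter):.2f} from 4 to 8 clients")
```

The rates round to the 0.90% and 1.40% that perf reports. Going from 4 to 8 clients, the number of branches grows by roughly 1.3x, while branch misses and stalled frontend cycles both grow by a bit more than 2x.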

Looking at stalled-cycles-frontend and branch-misses shows:

-e branch-misses
4:
+ 7.34% postgres plpgsql.so [.] plpgsql_exec_function
+ 6.33% postgres postgres [.] standard_ExecutorRun
+ 5.49% postgres postgres [.] SPI_push
+ 4.24% postgres libc-2.19.so [.] __memcpy_sse2_unaligned
+ 3.43% postgres postgres [.] ReleaseCachedPlan
+ 2.72% postgres plpgsql.so [.] exec_stmt
+ 2.07% postgres postgres [.] SPI_pop
+ 1.94% postgres postgres [.] ExecLimit
+ 1.94% postgres libc-2.19.so [.] __strcpy_sse2_unaligned
+ 1.59% postgres libc-2.19.so [.] _int_malloc
+ 1.46% postgres postgres [.] LWLockRelease
+ 1.33% postgres postgres [.] AllocSetAlloc
+ 1.17% postgres libc-2.19.so [.] _int_free
+ 1.08% postgres libc-2.19.so [.] memset
+ 1.00% postgres postgres [.] hash_search_with_hash_value
+ 0.99% postgres postgres [.] GetSnapshotData

8:

+ 10.66% pgbench libpq.so.5.9 [.] PQsocket
+ 8.40% pgbench libpthread-2.19.so [.] __libc_recv
+ 5.03% postgres plpgsql.so [.] plpgsql_exec_function
+ 3.00% postgres postgres [.] AllocSetAlloc
+ 2.86% postgres postgres [.] standard_ExecutorRun
+ 2.54% postgres plpgsql.so [.] exec_stmt
+ 2.45% postgres libc-2.19.so [.] __memcpy_sse2_unaligned
+ 2.27% postgres postgres [.] GetSnapshotData
+ 2.22% postgres postgres [.] ReleaseCachedPlan
+ 2.14% postgres postgres [.] OverrideSearchPathMatchesCurrent
+ 2.06% postgres postgres [.] SPI_push
+ 2.02% postgres postgres [.] SPI_pop
+ 1.92% postgres libc-2.19.so [.] _int_malloc
+ 1.86% postgres postgres [.] ExecLimit
+ 1.77% postgres postgres [.] SPI_connect
+ 1.74% postgres libc-2.19.so [.] __strcpy_sse2_unaligned

It's interesting to see that the limited size of the trace buffer leads
to previously perfectly predicted functions like PQsocket regularly
causing branch misses now... It's also interesting to see how
GetSnapshotData() rises in comparison.

OverrideSearchPathMatchesCurrent() is also curious, but perhaps not so
surprising considering it's chasing down a linked list - almost
impossible to predict.

-e stalled-cycles-frontend:
4:
+ 21.50% swapper [kernel.vmlinux] [k] intel_idle
+ 4.21% pgbench libpthread-2.19.so [.] __libc_recv
+ 2.68% postgres postgres [.] LWLockAcquire
+ 2.35% pgbench [kernel.vmlinux] [k] fput
+ 2.34% pgbench [kernel.vmlinux] [k] unix_stream_recvmsg
+ 2.03% pgbench [kernel.vmlinux] [k] system_call
+ 2.02% pgbench pgbench [.] threadRun
+ 1.82% pgbench [kernel.vmlinux] [k] system_call_after_swapgs
+ 1.73% postgres plpgsql.so [.] plpgsql_exec_function
+ 1.56% postgres postgres [.] LWLockRelease
+ 1.33% pgbench [kernel.vmlinux] [k] sys_recvfrom
+ 1.24% postgres libc-2.19.so [.] _int_malloc
+ 1.16% pgbench [vdso] [.] 0x00000000000008c9
+ 1.06% pgbench [kernel.vmlinux] [k] __fget
+ 1.04% postgres libc-2.19.so [.] _int_free
+ 1.02% pgbench libpthread-2.19.so [.] __pthread_enable_asynccancel

8:

+ 8.41% swapper [kernel.vmlinux] [k] intel_idle
+ 4.35% pgbench pgbench [.] threadRun
+ 3.56% pgbench libpthread-2.19.so [.] __libc_recv
+ 2.58% pgbench [kernel.vmlinux] [k] unix_stream_recvmsg
+ 1.99% postgres plpgsql.so [.] plpgsql_exec_function
+ 1.98% postgres postgres [.] AllocSetAlloc
+ 1.94% pgbench [kernel.vmlinux] [k] sys_recvfrom
+ 1.72% postgres postgres [.] LWLockAcquire
+ 1.66% pgbench pgbench [.] doCustom
+ 1.59% pgbench [kernel.vmlinux] [k] system_call_after_swapgs
+ 1.58% pgbench [kernel.vmlinux] [k] system_call
+ 1.51% pgbench [vdso] [.] __vdso_gettimeofday
+ 1.50% pgbench [kernel.vmlinux] [k] __fget
+ 1.21% postgres libc-2.19.so [.] memset
+ 1.11% postgres libc-2.19.so [.] _int_malloc
+ 1.08% postgres postgres [.] GetSnapshotData

(intel_idle is executed on a CPU when it's idle, so it's not surprising
that it shows up prominently, especially when not all cores are busy.)
It's interesting to see how the locking functions are less prominent in
the -c8 case, and how the overhead of allocation and
plpgsql_exec_function rises.
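
For anyone who wants to do this kind of side-by-side comparison mechanically, here is a rough sketch that parses report lines in the format shown above and computes the per-symbol change in share. The regular expression and the names `parse_report` and `share_delta` are assumptions made for this example, not part of perf. The two excerpts contain only a few lines copied from the stalled-cycles-frontend listings.

```python
import re

# Matches lines such as "+ 2.68% postgres postgres [.] LWLockAcquire".
# The pattern is an assumption based on the excerpts in this mail.
LINE = re.compile(
    r"^\+\s*(?P<pct>\d+(?:\.\d+)?)%\s+(?P<comm>\S+)\s+(?P<dso>\S+)"
    r"\s+\[(?P<kind>.)\]\s+(?P<symbol>\S+)$"
)


def parse_report(text):
    """Map (command, symbol) to its percentage share."""
    shares = {}
    for raw in text.splitlines():
        match = LINE.match(raw.strip())
        if match:
            key = (match.group("comm"), match.group("symbol"))
            shares[key] = float(match.group("pct"))
    return shares


def share_delta(before, after):
    """Change in share per symbol. None means the symbol is missing from
    one listing, i.e. it fell below the cutoff there, not that it is 0."""
    deltas = {}
    for key in sorted(set(before) | set(after)):
        if key in before and key in after:
            deltas[key] = round(after[key] - before[key], 2)
        else:
            deltas[key] = None
    return deltas


# A few lines copied from the stalled-cycles-frontend listings above.
FOUR_CLIENTS = """
+ 2.68% postgres postgres [.] LWLockAcquire
+ 1.73% postgres plpgsql.so [.] plpgsql_exec_function
+ 1.56% postgres postgres [.] LWLockRelease
+ 1.24% postgres libc-2.19.so [.] _int_malloc
"""

EIGHT_CLIENTS = """
+ 1.99% postgres plpgsql.so [.] plpgsql_exec_function
+ 1.98% postgres postgres [.] AllocSetAlloc
+ 1.72% postgres postgres [.] LWLockAcquire
+ 1.11% postgres libc-2.19.so [.] _int_malloc
"""

deltas = share_delta(parse_report(FOUR_CLIENTS), parse_report(EIGHT_CLIENTS))

if __name__ == "__main__":
    for (comm, symbol), delta in deltas.items():
        shown = "n/a" if delta is None else f"{delta:+.2f}"
        print(f"{comm:10s} {symbol:28s} {shown}")
```

Since the listings are truncated, a symbol missing from one of them (LWLockRelease and AllocSetAlloc here) is reported as unknown rather than treated as zero.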
