From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robbie Harwood <rharwood(at)redhat(dot)com>
Subject: libpq contention due to gss even when not using gss
Date: 2024-06-10 18:12:12
Message-ID: 20240610181212.auytluwmbfl7lb5n@awork3.anarazel.de
Lists: pgsql-hackers
Hi,
To investigate a report of both postgres and pgbouncer having issues when a
lot of new connections are established, I used pgbench -C. Oddly, on an
early attempt, the bottleneck wasn't postgres+pgbouncer, it was pgbench. But
only when using TCP, not with unix sockets.
c=40;pgbench -C -n -c$c -j$c -T5 -f <(echo 'select 1') 'port=6432 host=127.0.0.1 user=test dbname=postgres password=fake'
host=127.0.0.1: 16465
host=127.0.0.1,gssencmode=disable: 20860
host=/tmp: 49286
Note that the server does *not* support gss, yet gss has a substantial
performance impact.
Obviously the connection rates here are absurdly high and, outside of badly
written applications, likely never practically relevant. However, the number
of cores in systems keeps going up, and this quite possibly will become
relevant in more realistic scenarios (lock contention kicks in earlier the
more cores you have).
And it doesn't seem great that something as rarely used as gss introduces
overhead to very common paths.
Here's a bottom-up profile:
- 32.10% pgbench [kernel.kallsyms] [k] queued_spin_lock_slowpath
- 32.09% queued_spin_lock_slowpath
- 16.15% futex_wake
do_futex
__x64_sys_futex
do_syscall_64
- entry_SYSCALL_64_after_hwframe
- 16.15% __GI___lll_lock_wake
- __GI___pthread_mutex_unlock_usercnt
- 5.12% gssint_select_mech_type
- 4.36% gss_inquire_attrs_for_mech
- 2.85% gss_indicate_mechs
- gss_indicate_mechs_by_attrs
- 1.58% gss_acquire_cred_from
gss_acquire_cred
pg_GSS_have_cred_cache
select_next_encryption_method (inlined)
init_allowed_encryption_methods (inlined)
PQconnectPoll
pqConnectDBStart (inlined)
PQconnectStartParams
PQconnectdbParams
doConnect
Clearly the contention originates outside of our code, but it is triggered
by calling pg_GSS_have_cred_cache() every time a connection is established.
Greetings,
Andres Freund