From: | Кирилл Решке <reshkekirill(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Slow standby snapshot |
Date: | 2021-05-20 08:52:47 |
Message-ID: | CALdSSPgahNUD_=pB_j=1zSnDBaiOtqVfzo8Ejt5J_k7qZiU1Tw@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
I recently ran into a problem in one of our production PostgreSQL clusters.
I noticed lock contention on the procarray lock on the standby, which causes
WAL replay lag to grow.
To reproduce this, you can do the following:
1) set max_connections to a large number, like 100000
2) begin a transaction on the primary
3) start a pgbench workload on the primary and on the standby
After a while, KnownAssignedXidsGetAndSetXmin shows up in perf top,
consuming about 75% of CPU.
%%
PerfTop: 1060 irqs/sec kernel: 0.0% exact: 0.0% [4000Hz cycles:u],
(target_pid: 273361)
-------------------------------------------------------------------------------
73.92% postgres [.] KnownAssignedXidsGetAndSetXmin
1.40% postgres [.] base_yyparse
0.96% postgres [.] LWLockAttemptLock
0.84% postgres [.] hash_search_with_hash_value
0.84% postgres [.] AtEOXact_GUC
0.72% postgres [.] ResetAllOptions
0.70% postgres [.] AllocSetAlloc
0.60% postgres [.] _bt_compare
0.55% postgres [.] core_yylex
0.42% libc-2.27.so [.] __strlen_avx2
0.23% postgres [.] LWLockRelease
0.19% postgres [.] MemoryContextAllocZeroAligned
0.18% postgres [.] expression_tree_walker.part.3
0.18% libc-2.27.so [.] __memmove_avx_unaligned_erms
0.17% postgres [.] PostgresMain
0.17% postgres [.] palloc
0.17% libc-2.27.so [.] _int_malloc
0.17% postgres [.] set_config_option
0.17% postgres [.] ScanKeywordLookup
0.16% postgres [.] _bt_checkpage
%%
We tried to fix this by using a Bitmapset instead of the boolean array
KnownAssignedXidsValid, but this does not help much.
Using a doubly linked list helps a little more: we got +1000 tps
on a pgbench workload with patched PostgreSQL. The general idea of this patch
is that, instead of remembering which elements in KnownAssignedXids are
valid, we maintain a doubly linked list of them. This solution works
in exactly the same way, except that taking a snapshot on the replica is
now O(running transactions) instead of O(head - tail), which is significantly
faster under some workloads. The patch reduces the CPU usage of
KnownAssignedXidsGetAndSetXmin to ~48% from ~74%, but does not eliminate
it from perf top.
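To illustrate the idea (this is only a standalone sketch, not the patch
itself): the valid slots of an append-only xid array can be threaded onto a
doubly linked list stored as prev/next indexes, so a snapshot walks only the
live entries. The names XidList, xidlist_add, xidlist_remove and
xidlist_collect, and the fixed CAPACITY, are hypothetical and do not exist in
PostgreSQL; locking, array compression and xid wraparound are ignored.

```c
#include <assert.h>
#include <stdint.h>

#define CAPACITY 16     /* hypothetical; the real array is sized at startup */
#define NO_SLOT  (-1)   /* marks "no neighbour" in the index links */

typedef uint32_t TransactionId;

/* Append-only xid array whose valid slots form a doubly linked list. */
typedef struct XidList
{
    TransactionId xids[CAPACITY];
    int         prev[CAPACITY];  /* previous valid slot, or NO_SLOT */
    int         next[CAPACITY];  /* next valid slot, or NO_SLOT */
    int         first;           /* oldest valid slot, or NO_SLOT */
    int         last;            /* newest valid slot, or NO_SLOT */
    int         head;            /* next free array slot */
} XidList;

static void
xidlist_init(XidList *l)
{
    l->first = l->last = NO_SLOT;
    l->head = 0;
}

/* Append an xid (assumed larger than all earlier ones); returns its slot. */
static int
xidlist_add(XidList *l, TransactionId xid)
{
    int         slot = l->head;

    assert(slot < CAPACITY);
    l->head++;
    l->xids[slot] = xid;
    l->prev[slot] = l->last;
    l->next[slot] = NO_SLOT;
    if (l->last != NO_SLOT)
        l->next[l->last] = slot;
    else
        l->first = slot;
    l->last = slot;
    return slot;
}

/* Unlink a valid slot in O(1); the array entry itself is left in place. */
static void
xidlist_remove(XidList *l, int slot)
{
    int         p = l->prev[slot];
    int         n = l->next[slot];

    if (p != NO_SLOT)
        l->next[p] = n;
    else
        l->first = n;
    if (n != NO_SLOT)
        l->prev[n] = p;
    else
        l->last = p;
}

/*
 * Copy all valid xids into out[] and report the oldest one in *xmin.
 * Cost is proportional to the number of valid entries, not to head - tail.
 * *xmin is left untouched when the list is empty.
 */
static int
xidlist_collect(const XidList *l, TransactionId *out, TransactionId *xmin)
{
    int         count = 0;

    for (int s = l->first; s != NO_SLOT; s = l->next[s])
    {
        if (count == 0)
            *xmin = l->xids[s];
        out[count++] = l->xids[s];
    }
    return count;
}
```

With a boolean validity array, the same collection step has to visit every
slot between tail and head, including the ones already marked invalid; here
removed slots are simply skipped because nothing links to them anymore.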
The problem is easier to reproduce on PG13, since PG14 has some snapshot
optimizations.
Thanks!
Best regards, reshke