Re: Logical replication CPU-bound with TRUNCATE/DROP/CREATE many tables

From: Keisuke Kuroda <keisuke(dot)kuroda(dot)3862(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Logical replication CPU-bound with TRUNCATE/DROP/CREATE many tables
Date: 2020-10-02 06:56:43
Message-ID: CANDwgg+RD3KohQmBGWTK14nOuTKdFD=QfOZSfmg2j0GC8zKs8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Dilip, Amit,

> > 5. Can you please once repeat the performance test done by Keisuke-San
> > to see if you have similar observations? Additionally, see if you are
> > also seeing the inconsistency related to the Truncate message reported
> > by him and if so why?
> >
>
> Okay, I will set up and do this test early next week. Keisuke-San,
> can you send me your complete test script?

Yes, I've attached a test script(test_pg_recvlogical.sh)

Sorry, the issue with TRUNCATE not outputting was due to a build miss
on my part.
Even before the patch, TRUNCATE decodes and outputting correctly.
So, please check the performance only.

I have tested it again and will share the results with you.

Also, the argument of palloc was still MemoryContextAlloc,
which prevented me from applying the patch, so I've only fixed that part.

# test script

Please set PGHOME and CLUSTER_PUB before run.

sh test_pg_recvlogical.sh

# perf command

perf record --call-graph dwarf -p [walsender pid]
perf report -i perf.data --no-children

# before patch

decode + invalidation = 222s

2020-10-02 14:55:50 BEGIN 509
2020-10-02 14:59:42 table nsp_001.tbl_001, nsp_001.part_0001 ...
nsp_001.part_0999, nsp_001.part_1000: TRUNCATE: (no-flags)
2020-10-02 14:59:42 COMMIT 509 (at 2020-10-02 14:55:50.349219+09)

Samples: 229K of event 'cpu-clock:uhH', Event count (approx.): 57347250000
Overhead Command Shared Object Symbol
- 96.07% postgres postgres [.] hash_seq_search
hash_seq_search
- RelfilenodeMapInvalidateCallback
- LocalExecuteInvalidationMessage
ReorderBufferExecuteInvalidations
ReorderBufferProcessTXN
ReorderBufferCommit
DecodeCommit
DecodeXactOp
LogicalDecodingProcessRecord
XLogSendLogical
WalSndLoop
StartLogicalReplication
exec_replication_command
PostgresMain
BackendRun
BackendStartup
ServerLoop
PostmasterMain
main
__libc_start_main
_start
0.74% postgres postgres [.] LocalExecuteInvalidationMessage
0.47% postgres postgres [.] hash_bytes_uint32
0.46% postgres postgres [.] hash_search_with_hash_value
0.37% postgres postgres [.] RelfilenodeMapInvalidateCallback
0.31% postgres postgres [.] CatCacheInvalidate
0.22% postgres postgres [.] uint32_hash
0.21% postgres postgres [.] PlanCacheRelCallback
0.20% postgres postgres [.] hash_seq_init
0.17% postgres postgres [.] ReorderBufferExecuteInvalidations

# after patch

decode + invalidation = 1s or less

2020-10-02 15:20:08 BEGIN 509
2020-10-02 15:20:08 table nsp_001.tbl_001, nsp_001.part_0001 ...
nsp_001.part_0999, nsp_001.part_1000: TRUNCATE: (no-flags)
2020-10-02 15:20:08 COMMIT 509 (at 2020-10-02 15:20:08.704503+09)

Samples: 310 of event 'cpu-clock:uhH', Event count (approx.): 77500000
Overhead Command Shared Object Symbol
- 49.35% postgres postgres [.] hash_seq_search
hash_seq_search
RelfilenodeMapInvalidateCallback
LocalExecuteInvalidationMessage
ReorderBufferExecuteInvalidations
ReorderBufferProcessTXN
ReorderBufferCommit
DecodeCommit
DecodeXactOp
LogicalDecodingProcessRecord
XLogSendLogical
WalSndLoop
StartLogicalReplication
exec_replication_command
PostgresMain
BackendRun
BackendStartup
ServerLoop
PostmasterMain
main
__libc_start_main
_start
2.58% postgres libc-2.17.so [.] __memset_sse2
2.58% postgres libpthread-2.17.so [.] __libc_recv
1.61% postgres libc-2.17.so [.] __memcpy_ssse3_back
1.61% postgres libpthread-2.17.so [.] __errno_location
1.61% postgres postgres [.] AllocSetAlloc
1.61% postgres postgres [.] DecodeXLogRecord
1.29% postgres postgres [.] AllocSetFree
1.29% postgres postgres [.] hash_bytes_uint32
0.97% postgres [vdso] [.] __vdso_gettimeofday
0.97% postgres postgres [.] LocalExecuteInvalidationMessage

Best Regards,
--
Keisuke Kuroda
NTT Software Innovation Center
keisuke(dot)kuroda(dot)3862(at)gmail(dot)com

Attachment Content-Type Size
test_pg_recvlogical.sh text/x-sh 3.5 KB
v4-0001-Collect-command-invalidation-in-form-of-changes.patch application/octet-stream 9.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-10-02 07:28:14 Re: please update ps display for recovery checkpoint
Previous Message Kyotaro Horiguchi 2020-10-02 06:39:25 Re: Asynchronous Append on postgres_fdw nodes.