Logical replication - DDL sub transactions for script executed in single transaction?

From: Aleš Zelený <zeleny(dot)ales(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Logical replication - DDL sub transactions for script executed in single transaction?
Date: 2019-03-22 09:48:35
Message-ID: CAODqTUbmOzWZxGxW8M-hft-gs_6doWSg+Uix5eMcA6NSPdygfw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,

I've learned that logical replication might have performance problem if
there are lot of sub transactions within transaction (at least because it
enforces spill files in pg_replslot and if there are many - like 80mio,
EXT4 did not perform well - in my case it was caused bu misuse of
exception handling in trigger function) and once we get rid of this usage
pattern logical replication works like a charm.
Until we have applied schema migration patch creating 700+ tables by
flyway, therefore whole file was applied in single transaction (like psql
-1 -f xxxx) and it effectively blocked our replication for 22 hours. No
failures, wal sender processed did not consumed huge amount of memory (what
happened in past wiht 80mi sub-trans caused by trigger), all wal senders
did IO very rarely, most of the time they dwell on CPU.

The spill files were not so big:
# du -sh xid-924361123-lsn-FBC-C7000000.snap
224K xid-924361123-lsn-FBC-C7000000.snap

It takes about 1 minute (all the time 100% CPU used by was sender) to move
to following file.

Version, OS:
PostgreSQL 10.6 (Ubuntu 10.6-1.pgdg16.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609, 64-bit

Perf:
Overhead Command Shared Object Symbol
57.79% postgres postgres [.] hash_seq_search
14.70% postgres postgres [.] hash_search_with_hash_value
5.93% postgres postgres [.] LocalExecuteInvalidationMessage
3.60% postgres postgres [.] RelfilenodeMapInvalidateCallback
2.99% postgres postgres [.] hash_uint32
2.50% postgres postgres [.] hash_search
1.91% postgres postgres [.] CatCacheInvalidate
1.83% postgres postgres [.] CallSyscacheCallbacks
1.37% postgres postgres [.] hash_seq_init
1.24% postgres postgres [.] hash_seq_term
1.02% postgres postgres [.] RelationCacheInvalidateEntry
0.97% postgres postgres [.] ReorderBufferCommit
0.97% postgres pgoutput.so [.] rel_sync_cache_relation_cb
0.78% postgres postgres [.] uint32_hash
0.73% postgres postgres [.] PlanCacheRelCallback
0.60% postgres postgres [.] InvalidateCatalogSnapshot
0.58% postgres postgres [.] SysCacheInvalidate
0.35% postgres postgres [.] GetCurrentTransactionNestLevel
0.13% postgres pgoutput.so [.] _init
0.00% postgres [kernel.kallsyms] [k] __do_softirq
0.00% postgres [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.00% postgres postgres [.] ResourceOwnerEnlargeRelationRefs

My suspect is that psql (of flyway) while processing a file with many
create table and grant statements in single transaction mode might create
sub transactions, but I haven't found it in docs.

Does you have similar experience? What is root cause for this issue?

Workaround is clear - do not create that large amount of DDL in single
transaction, but it would be nice to know root cause to eliminate
scripts/usage patterns causing this issue generally.

On the other hand, following transaction (100GB of spill files) , almost
150 mio rows deleted (and archived by statement trigger) behaves expected
way - wal sender process was reading from disk (something like 130MB/s) and
after ~20 minutes replication was in sync, all spill files were removed, so
transaction changes volume itself does not seems to be any issue.

Thanks for any advice!

Kind regards Ales Zeleny
P.S.: PostgreSQL log facility is superb, having DDL statements logged and
XID in log line prefix I was able to identify the issue source in this case.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David Rowley 2019-03-22 10:15:30 Re: Column lookup in a row performance
Previous Message Ali Mumcu 2019-03-22 09:23:09 Re: PostgreSQL-11 Streaming Replication Slave Recovering