From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Mathieu Fenniak <mathieu(dot)fenniak(at)replicon(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Logical decoding CPU-bound w/ large number of tables |
Date: | 2017-05-06 00:59:09 |
Message-ID: | 18109.1494032349@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Mathieu Fenniak <mathieu(dot)fenniak(at)replicon(dot)com> writes:
> I'm attempting to use logical decoding with the streaming replication
> protocol to perform change-data-capture on PostgreSQL 9.5.4. I'm seeing
> the replication stream "stall" for long periods of time where the walsender
> process will be pinned at 100% CPU utilization, but no data is being sent
> to my client.
> I've performed a CPU sampling with the OSX `sample` tool based upon
> reproduction approach #1:
> https://gist.github.com/mfenniak/366d7ed19b2d804f41180572dc1600d8
> It appears that most of the time is spent in the
> RelfilenodeMapInvalidateCallback and CatalogCacheIdInvalidate cache
> invalidation callbacks, both of which appear to be invalidating caches
> based upon the cache value.
Hmm ... as for RelfilenodeMapInvalidateCallback, the lack of calls to
hash_search() from it in your trace says that it usually isn't doing
anything useful. All the time is being spent in hash_seq_search,
uselessly iterating over the hashtable. I'm inclined to think that
we need a smarter data structure there, maybe an independent hashtable
tracking the reverse map from relation OID to filenode map entry.
As for CatalogCacheIdInvalidate, I wonder how many of those cycles
are doing something useful, and how many are being wasted in the outer
loop that just iterates over the cache list. We could trivially get
rid of that outer search by using syscache.c's array, as in the
attached patch. It'd be interesting to see if this patch helps your
scenario #1. (Patch is against HEAD but seems to apply cleanly to 9.5)
Most likely, your scenario #2 is completely stuck on the
RelfilenodeMapInvalidateCallback issue, though it would be good
to get a trace to confirm that.
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
avoid-search-in-catcache-invalidate.patch | text/x-diff | 8.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2017-05-06 01:14:51 | Re: PG96 pg_restore connecting to PG95 causes ERROR: unrecognized configuration parameter "idle_in_transaction_session_timeout" |
Previous Message | Justin Pryzby | 2017-05-06 00:22:53 | PG96 pg_restore connecting to PG95 causes ERROR: unrecognized configuration parameter "idle_in_transaction_session_timeout" |