Selectively invalidate caches in pgoutput module

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: "'pgsql-hackers(at)lists(dot)postgresql(dot)org'" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "ShlokKumar(dot)Kyal(at)fujitsu(dot)com" <ShlokKumar(dot)Kyal(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Selectively invalidate caches in pgoutput module
Date: 2025-03-03 07:57:32
Message-ID: OSCPR01MB14966C09AA201EFFA706576A7F5C92@OSCPR01MB14966.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear hackers,

This thread is forked from [1]. I want to propose a small optimization for the
logical replication system.

Background
==========

When an ALTER PUBLICATION command is executed, all entries in RelationSyncCache
are discarded unconditionally. This mechanism works well but is sometimes inefficient.
For example, when ALTER PUBLICATION ... DROP TABLE is executed,
1) the specific entry in RelationSyncCache is removed, and then
2) all entries are discarded twice.

This happens because the pgoutput plugin registers both RelcacheCallback
(rel_sync_cache_relation_cb) and SyscacheCallback (publication_invalidation_cb,
rel_sync_cache_publication_cb). When ALTER PUBLICATION ADD/SET/DROP is executed,
both the relation cache of the affected tables and the syscaches of
pg_publication_rel and pg_publication are invalidated.
The callback for the relation cache removes one entry from the hash table, and
the syscache callbacks walk all entries and invalidate them. However, AFAICS,
there is no need to invalidate all of them.
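
To make the cost concrete, here is a minimal toy model of the sequence described
above. It is only a sketch, not pgoutput code: ToyEntry, toy_cache, and all
toy_* functions are hypothetical names, the cache is a fixed array of four
entries, and "invalidate" is modeled as clearing a flag and bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CACHE_SIZE 4

/* Hypothetical stand-in for one RelationSyncCache entry. */
typedef struct ToyEntry
{
    unsigned int relid;
    bool         valid;
} ToyEntry;

static ToyEntry toy_cache[TOY_CACHE_SIZE];
static int      toy_entries_touched = 0;

/* Fill the toy cache with relids 1000..1003, all valid. */
static void
toy_cache_init(void)
{
    for (size_t i = 0; i < TOY_CACHE_SIZE; i++)
    {
        toy_cache[i].relid = 1000 + (unsigned int) i;
        toy_cache[i].valid = true;
    }
    toy_entries_touched = 0;
}

/* Models the relcache callback: only the entry for relid is affected. */
static void
toy_relcache_cb(unsigned int relid)
{
    assert(relid != 0);
    for (size_t i = 0; i < TOY_CACHE_SIZE; i++)
    {
        if (toy_cache[i].relid == relid)
        {
            toy_cache[i].valid = false;
            toy_entries_touched++;
        }
    }
}

/* Models a syscache callback: every entry is walked and invalidated. */
static void
toy_syscache_cb(void)
{
    for (size_t i = 0; i < TOY_CACHE_SIZE; i++)
    {
        toy_cache[i].valid = false;
        toy_entries_touched++;
    }
}

/* Counts entries that are still valid. */
static int
toy_count_valid(void)
{
    int n = 0;

    for (size_t i = 0; i < TOY_CACHE_SIZE; i++)
        if (toy_cache[i].valid)
            n++;
    return n;
}

/*
 * Models ALTER PUBLICATION ... DROP TABLE for one relation as described in
 * the text: one targeted invalidation, then two full sweeps (assumed here
 * to be one per invalidated syscache).
 */
static int
toy_drop_table(unsigned int relid)
{
    toy_cache_init();
    toy_relcache_cb(relid);
    toy_syscache_cb();
    toy_syscache_cb();
    return toy_entries_touched;
}
```

In this model, dropping one table from a publication touches 1 + 4 + 4 entries
and leaves no valid entry behind, although only one entry actually changed.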

I grepped the source code and found that this behavior has existed since the initial version.

Currently the effect of this behavior may not be large, but [1] may make it
significant because it propagates invalidation messages to all in-progress
decoding transactions.

Patch overview
==============

Based on this background, the patch avoids dropping all entries in RelationSyncCache
when ALTER PUBLICATION is executed. It removes the syscache callbacks for
pg_publication_rel and pg_publication_namespace, and no longer discards the entries
in the syscache callback for pg_publication.

Apart from the above, this patch also ensures that the relcaches of published tables
are invalidated when ALTER PUBLICATION is executed. ADD/SET/DROP already have this
mechanism, but ALTER PUBLICATION OWNER TO and RENAME TO do not.
RENAME TO currently uses a common function; the patch replaces it with
RenamePublication() so that the invalidations can be done there.
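
The intended effect can be sketched with a toy model. This is not the patch
itself: SelEntry, sel_cache, and all sel_* functions are hypothetical names, and
sel_rename_publication only assumes that a rename invalidates the relcache of
each table in the publication, which is how the description above reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SEL_CACHE_SIZE 4

/* Hypothetical stand-in for one RelationSyncCache entry. */
typedef struct SelEntry
{
    unsigned int relid;
    bool         valid;
} SelEntry;

static SelEntry sel_cache[SEL_CACHE_SIZE];

/* Fill the toy cache with relids 1000..1003, all valid. */
static void
sel_cache_init(void)
{
    for (size_t i = 0; i < SEL_CACHE_SIZE; i++)
    {
        sel_cache[i].relid = 1000 + (unsigned int) i;
        sel_cache[i].valid = true;
    }
}

/* Models the relcache callback: invalidate only the matching entry. */
static void
sel_relcache_cb(unsigned int relid)
{
    for (size_t i = 0; i < SEL_CACHE_SIZE; i++)
        if (sel_cache[i].relid == relid)
            sel_cache[i].valid = false;
}

/*
 * Models a publication rename with no syscache sweep: each published table
 * gets its own relcache invalidation, and nothing else is touched.
 */
static void
sel_rename_publication(const unsigned int *pubrels, size_t npubrels)
{
    assert(pubrels != NULL || npubrels == 0);
    for (size_t i = 0; i < npubrels; i++)
        sel_relcache_cb(pubrels[i]);
}

/* Returns true if relid is cached and still valid. */
static bool
sel_is_valid(unsigned int relid)
{
    for (size_t i = 0; i < SEL_CACHE_SIZE; i++)
        if (sel_cache[i].relid == relid)
            return sel_cache[i].valid;
    return false;
}

/* Counts entries that are still valid. */
static int
sel_count_valid(void)
{
    int n = 0;

    for (size_t i = 0; i < SEL_CACHE_SIZE; i++)
        if (sel_cache[i].valid)
            n++;
    return n;
}
```

With a publication that contains relids 1001 and 1002, a rename in this model
invalidates just those two entries; the entries for 1000 and 1003 stay valid.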

What do you think?

[1]: https://www.postgresql.org/message-id/de52b282-1166-1180-45a2-8d8917ca74c6@enterprisedb.com

Best regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
0001-Selectively-invalidate-cache-in-pgoutput.patch application/octet-stream 7.7 KB
