Re: pgsql: Avoid invalidating all RelationSyncCache entries on publication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <akapila(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Avoid invalidating all RelationSyncCache entries on publication
Date: 2025-03-18 06:30:04
Message-ID: CAA4eK1JBuaC9R7r22Da+56L-Gi3Xm3TTbutkJwe6i0847fhx8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On Mon, Mar 17, 2025 at 9:35 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, Mar 13, 2025 at 12:00 AM Amit Kapila <akapila(at)postgresql(dot)org> wrote:
> > Avoid invalidating all RelationSyncCache entries on publication rename.
> >
> > On Publication rename, we need to only invalidate the RelationSyncCache
> > entries corresponding to relations that are part of the publication being
> > renamed.
> >
> > As part of this patch, we introduce a new invalidation message to
> > invalidate the cache maintained by the logical decoding output plugin. We
> > can't use existing relcache invalidation for this purpose, as that would
> > unnecessarily cause relcache invalidations in other backends.
>
> This seems like too much infrastructure for a niche optimization. If
> there are plans to use this new invalidation message type to optimize
> a bunch of other cases, then maybe it's worth it, but adding this just
> to cover the presumably-rare case of ALTER PUBLICATION .. RENAME
> doesn't seem worth it to me.
>

This commit will be helpful for many other existing commands like:
CREATE PUBLICATION ...
ALTER PUBLICATION name SET (publication_parameter)
ALTER PUBLICATION name OWNER ...
DROP PUBLICATION ...

Before this, commit any publication command that modified
pg_publication catalog use to invalidate the entire RelationSyncCache.
The code that led to entire RelationSyncCache invalidation was removed
in this commit:
publication_invalidation_cb(Datum arg, int cacheid, uint32 hashvalue)
{
publications_valid = false;
-
- /*
- * Also invalidate per-relation cache so that next time the filtering info
- * is checked it will be updated with the new publication settings.
- */
- rel_sync_cache_publication_cb(arg, cacheid, hashvalue);
}

We can't remove this without having a solution for ALTER PUBLICATION
.. RENAME command to invalidate the specific RelationSyncCache
entries. As mentioned in the commit message, the other idea was to
register a relcache invalidation for all relations that are part of a
publication which we felt could be harmful to other backends that are
not even involved in decoding. Now, the ALTER PUBLICATION name OWNER
command doesn't need to use this new invalidation at this stage
because it doesn't impact the RelationSyncCache entries but it
benefits from not requiring to invalidate the entire cache. Similarly,
the other commands mentioned above uses relcache invalidation for this
purpose but the difference is that the other commands do need relcache
invalidation for the purpose of correctness whereas ALTER PUBLICATION
.. RENAME command doesn't need it.

In the future, we can use this invalidation in existing publication
commands like altering a publication where we only publish 'INSERT'
(and or 'TRUNCATE') and change other publication properties like
'publish_via_partition_root', 'publish_generated_columns'. Similarly,
I think we can even use new invalidation for SET/ADD/DROP variants of
Alter PUBLICATION where the publication publishes only 'INSERT' and or
'TRUNCATE'. Before this commit, we didn't have a better way so we used
relcache invalidations for these cases as well. Also, any new
publication property or command should prefer to use this new
invalidation instead of using a relcache invalidation unless required
the same for correctness.

The impact of not doing any solution for this problem (aka removing
the above code in publication_invalidation_cb) could be magnified in
future commits where we are trying to solve a long-standing data loss
issue in logical decoding/replication via distributing invalidations
to all in-progress transactions [1]. You can read the commit message
to understand that problem and solution.

I think I should have mentioned the future use and other improvements
of this work in the commit message.

[1] - https://www.postgresql.org/message-id/OSCPR01MB14966DDB92FA7DA8FA8658216F5DE2%40OSCPR01MB14966.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Amit Kapila 2025-03-18 08:47:13 pgsql: Use correct variable name in publicationcmds.c.
Previous Message Masahiko Sawada 2025-03-18 04:35:43 Re: pgsql: pg_upgrade: Preserve default char signedness value from old clus

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2025-03-18 06:34:01 Re: maintenance_work_mem = 64kB doesn't work for vacuum
Previous Message Shubham Khanna 2025-03-18 06:20:29 Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.