Re: Proposal: Deferred Replica Filtering for PostgreSQL Logical Replication

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: Dean <ds(dot)blue797(at)gmail(dot)com>, "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Proposal: Deferred Replica Filtering for PostgreSQL Logical Replication
Date: 2025-03-19 02:01:02
Message-ID: 501e39c2-317b-4a0f-93e8-61b1aceca8e0@app.fastmail.com
Lists: pgsql-hackers

On Tue, Mar 18, 2025, at 9:56 PM, Dean wrote:
> Unfortunately, neither column lists nor row filters can provide the level of control I'm proposing. These revised examples might help illustrate the use case for DRF:

I'm afraid I didn't understand your proposal. Are you trying to use logical
replication with RLS enabled on the subscriber?

> Alice, Bob, and Eve subscribe to changes on a `friend_requests` table. Row-level security ensures CRUD access based on user IDs.
> 1. Per-subscriber column control: Bob makes a change to the table. Alice should receive the entire record, while Eve should only receive the timestamp - no other columns. Why DRF is needed: Column lists are static and apply equally to all subscribers, meaning we can't distinguish Alice's subscription from Eve's.
> 2. Bob DELETEs a row from the table. Alice should see the DELETE event, while Eve should not even be aware of an event. Why DRF is needed: The deterministic nature of row filters makes them unsuitable for per-subscriber filtering based on session data.
>
> The goal of DRF is to allow per-subscriber variations in change broadcasts, enabling granular control over what data is sent to each subscriber based on their session context.

You misunderstood the logical replication architecture. Row filters and
column lists are applied *after* the WAL is decoded, in the output plugin's
change callback. See change_cb -- pgoutput_change().
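
To make the ordering concrete, here is a toy model in plain C. It is not
PostgreSQL code: the ToyChange and ToyPublication types and the
toy_change_cb() function are made up for illustration. The assumption it
encodes is only the one stated above: the change is decoded first, and the
callback then decides what to send, using only the publication definition and
the row itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a decoded row change; not a PostgreSQL struct. */
typedef struct ToyChange
{
    int     user_id;
    long    created_at;
} ToyChange;

/* Hypothetical stand-in for what a publication defines. */
typedef struct ToyPublication
{
    int     min_user_id;    /* row filter: send only if user_id >= this */
    bool    send_user_id;   /* column list: is user_id published? */
} ToyPublication;

/*
 * Plays the role of a change callback in this toy model. The change has
 * already been decoded; the filter and the column list are applied here.
 * Returns false if the row filter rejects the change, otherwise fills *out
 * with the published columns (an unpublished user_id is set to -1).
 */
bool
toy_change_cb(const ToyPublication *pub, const ToyChange *decoded,
              ToyChange *out)
{
    assert(pub != NULL && decoded != NULL && out != NULL);

    /* Row filter: depends only on the publication and the row. */
    if (decoded->user_id < pub->min_user_id)
        return false;

    /* Column list: copy only the published columns. */
    out->created_at = decoded->created_at;
    out->user_id = pub->send_user_id ? decoded->user_id : -1;
    return true;
}
```

Nothing in this toy callback looks at who is subscribed, so every subscriber
of the same publication gets the same result for a given change.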

You mentioned RLS, but AFAICS logical replication can neither apply changes
to nor do an initial synchronization of a table that has RLS enabled, unless
the subscription owner bypasses RLS.

See TargetPrivilegesCheck() -- worker.c.

    /*
     * We lack the infrastructure to honor RLS policies. It might be possible
     * to add such infrastructure here, but tablesync workers lack it, too, so
     * we don't bother. RLS does not ordinarily apply to TRUNCATE commands,
     * but it seems dangerous to replicate a TRUNCATE and then refuse to
     * replicate subsequent INSERTs, so we forbid all commands the same.
     */
    if (check_enable_rls(relid, InvalidOid, false) == RLS_ENABLED)
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("user \"%s\" cannot replicate into relation with row-level security enabled: \"%s\"",
                        GetUserNameFromId(GetUserId(), true),
                        RelationGetRelationName(rel))));

See LogicalRepSyncTableStart() -- tablesync.c.

    /*
     * COPY FROM does not honor RLS policies. That is not a problem for
     * subscriptions owned by roles with BYPASSRLS privilege (or superuser,
     * who has it implicitly), but other roles should not be able to
     * circumvent RLS. Disallow logical replication into RLS enabled
     * relations for such roles.
     */
    if (check_enable_rls(RelationGetRelid(rel), InvalidOid, false) == RLS_ENABLED)
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("user \"%s\" cannot replicate into relation with row-level security enabled: \"%s\"",
                        GetUserNameFromId(GetUserId(), true),
                        RelationGetRelationName(rel))));
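
Read together, the two excerpts amount to one rule, which can be modelled as
the small predicate below. This is a simplified sketch, not the real check:
toy_can_replicate_into() is a hypothetical name, and it deliberately ignores
other cases that check_enable_rls() handles (table ownership, for example).

```c
#include <stdbool.h>

/*
 * Toy model of the rule described in the comments above: replication into a
 * relation with RLS enabled is refused unless the role bypasses RLS
 * (BYPASSRLS privilege, or superuser, who has it implicitly).
 */
bool
toy_can_replicate_into(bool rel_has_rls, bool role_bypasses_rls)
{
    if (rel_has_rls && !role_bypasses_rls)
        return false;       /* corresponds to the ereport(ERROR, ...) path */
    return true;
}
```

The same rule covers both the apply worker and the initial table sync, which
is why neither path can honor per-user policies today.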

The comments already point to possible directions. Feel free to write a
proposal for it.

--
Euler Taveira
EDB https://www.enterprisedb.com/
