From: | Dean <ds(dot)blue797(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Deferred Replica Filtering for PostgreSQL Logical Replication |
Date: | 2025-03-19 00:56:04 |
Message-ID: | CALWmXtuyvdL5zyYKgnszEVjX-Ru7jmGpKhn8zobXRbpoWRFFSg@mail.gmail.com |
Lists: | pgsql-hackers |
Unfortunately, neither column lists nor row filters can provide the level
of control I'm proposing. These revised examples might help illustrate the
use case for DRF:
Alice, Bob, and Eve subscribe to changes on a `friend_requests` table.
Row-level security restricts CRUD access on the table by user ID.
1. Per-subscriber column control: Bob makes a change to the table. Alice
should receive the entire record, while Eve should only receive the
timestamp - no other columns. Why DRF is needed: Column lists are static
and apply equally to all subscribers, meaning we can't distinguish Alice's
subscription from Eve's.
2. Per-subscriber row control: Bob DELETEs a row from the table. Alice
should see the DELETE event, while Eve should not learn that any event
occurred. Why DRF is needed: Row filters must be deterministic, which makes
them unsuitable for per-subscriber filtering based on session data.
The goal of DRF is to allow per-subscriber variations in change broadcasts,
enabling granular control over what data is sent to each subscriber based
on their session context.
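To make the intended semantics concrete, here is a minimal Python sketch of
the per-subscriber decision. It is illustrative only and not a proposed
PostgreSQL API: `Change`, `Subscriber`, `drf_filter`, the `visible_columns`
and `sees_deletes` settings, and the column names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    """A decoded change (hypothetical shape, not the pgoutput format)."""
    op: str    # "INSERT", "UPDATE", or "DELETE"
    row: dict  # decoded column values


@dataclass
class Subscriber:
    """Per-subscriber session context (hypothetical)."""
    name: str
    visible_columns: Optional[frozenset]  # None means every column
    sees_deletes: bool


def drf_filter(change: Change, sub: Subscriber) -> Optional[Change]:
    """Return the change as this subscriber should see it, or None to drop it."""
    # Row-level decision: hide the whole event from this subscriber.
    if change.op == "DELETE" and not sub.sees_deletes:
        return None
    # Column-level decision: keep only the columns this subscriber may see.
    if sub.visible_columns is None:
        return change
    row = {k: v for k, v in change.row.items() if k in sub.visible_columns}
    return Change(change.op, row)


# Example data matching the two cases above (column names are assumptions).
alice = Subscriber("alice", visible_columns=None, sees_deletes=True)
eve = Subscriber("eve", visible_columns=frozenset({"updated_at"}),
                 sees_deletes=False)

update = Change("UPDATE", {"id": 1, "sender": "bob", "receiver": "alice",
                           "updated_at": "2025-03-19 00:56:04"})
delete = Change("DELETE", {"id": 1})
```

With these definitions, `drf_filter(update, eve)` yields a change carrying
only `updated_at`, and `drf_filter(delete, eve)` returns `None`, while Alice
receives both changes unmodified.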
Best,
Dean S
On Mon, Mar 17, 2025 at 4:32 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Sun, Mar 16, 2025 at 12:59 AM Dean <ds(dot)blue797(at)gmail(dot)com> wrote:
> >
> > I'd like to propose an enhancement to PostgreSQL's logical replication
> system: Deferred Replica Filtering (DRF). The goal of this feature is to
> provide more granular control over which rows are replicated by applying
> publication filters after the WAL decoding process, before sending data to
> subscribers.
> >
> > Currently, PostgreSQL's logical replication filters apply
> deterministically. Deferred filtering, however, operates after the WAL has
> been decoded, giving it access to the complete row data and making
> filtering decisions based on mutable values. Additionally, record columns
> may be omitted by the filter.
> >
> > This opens up several possibilities for granular control. Consider the
> following examples:
> > Alice and Bob subscribe to changes on a table with RLS enabled, allowing
> CRUD operations based on user's IDs.
> > 1. Alice needs to know the timestamp at which Bob updated the table.
> With DRF, we can omit all columns except for the timestamp.
> > 2. Bob wants to track DELETEs on the table. Without DRF, Bob can see all
> columns on any deleted row, potentially exposing complete records he
> shouldn't be authorized to view. DRF can filter these rows out.
> >
> > Deferred replica filtering allows for session-specific, per-row, and
> per-column filtering - features currently not supported by existing
> replication filters, enhancing security and data privacy.
> >
>
> We provide column lists [1] and row filters [2]. Doesn't that suffice
> the need, if not, kindly let us know what exactly you need with some
> examples.
>
> [1] -
> https://www.postgresql.org/docs/devel/logical-replication-col-lists.html
> [2] -
> https://www.postgresql.org/docs/devel/logical-replication-row-filter.html
>
> --
> With Regards,
> Amit Kapila.
>