Re: Column Filtering in Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Column Filtering in Logical Replication
Date: 2021-08-12 08:29:27
Message-ID: CAA4eK1JsGjnhLU3hZ+kKDesHDsJDTtTG5USsd+zLdR1MDUXg+w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 12, 2021 at 8:40 AM Rahila Syed <rahilasyed90(at)gmail(dot)com> wrote:
>
>>
>> Can you please explain why you have the restriction for including
>> replica identity columns and do we want to put a similar restriction
>> for the primary key? As far as I understand, if we allow default
>> values on subscribers for replica identity, then probably updates,
>> deletes won't work as they need to use replica identity (or PK) to
>> search the required tuple. If so, shouldn't we add this restriction
>> only when a publication has been defined for one of these (Update,
>> Delete) actions?
>
>
> Yes, as you mentioned, they are needed for updates and deletes to work.
> The restriction on including replica identity columns in column filters exists
> because, when the replica identity column values have not changed, the old row's
> replica identity columns are not sent to the subscriber. We therefore need the new
> row's replica identity columns to be sent to identify the row to be updated or deleted.
> I haven't tested whether it would break inserts as well, though. I will update the patch accordingly.
>

Okay, but then we also need to ensure that the user isn't allowed to
enable 'update' or 'delete' for a publication that contains a filter
that doesn't include the replica identity columns.
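
To make the intended check concrete, here is a rough sketch in Python. It
is only an illustration of the rule, not code from the patch: the function
name, its parameters, and the error text are all hypothetical, and it
assumes that insert-only publications don't need the restriction, which
has not been tested yet.

```python
def check_column_filter(pubactions, filter_columns, replica_identity_columns):
    """Return None if the combination is acceptable, else an error message.

    pubactions: actions the publication publishes, e.g. {"insert", "update"}.
    filter_columns: column names in the filter, or None if there is no filter.
    replica_identity_columns: column names of the table's replica identity.
    All names here are hypothetical; this is not the patch's API.
    """
    if filter_columns is None:
        # No filter: every column is sent, so nothing to check.
        return None
    if not ({"update", "delete"} & set(pubactions)):
        # Assumption: without update/delete, the subscriber never has to
        # look up an existing row by replica identity.
        return None
    missing = set(replica_identity_columns) - set(filter_columns)
    if missing:
        return ("column filter must include replica identity columns: "
                + ", ".join(sorted(missing)))
    return None
```

The same check would have to run both when a filter is added to a
publication and when 'update' or 'delete' is later enabled on a
publication that already has a filter.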

>>
>> Another point is what if someone drops the column used in one of the
>> publications? Do we want to drop the entire relation from publication
>> or just remove the column filter or something else?
>>
>
> Thanks for pointing this out. Currently, this is not handled in the patch.
> I think dropping the column from the filter would make sense, along the lines
> of how a table is dropped from the publication in the case of DROP TABLE.
>

I think it would be tricky to remove just the column from the filter,
because you need to recompute the entire filter and update it again.
Also, you might need to do this for all the publications that have
that particular column in their filter clause. It might be easier to
drop the entire filter, but you can check; if the other way turns out
to be easier, then that is good.
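
The two options can be compared with a small sketch. This is Python used
purely for illustration; the dictionary standing in for the catalog and
both function names are hypothetical, and a filter of None is assumed to
mean "no filter, all columns are published".

```python
def remove_column_from_filters(filters, table, column):
    """Option 1: recompute every affected filter without the dropped column.

    filters maps (publication, table) to a list of column names, or to
    None when the table has no filter. Hypothetical stand-in for the catalog.
    """
    result = {}
    for (pub, rel), cols in filters.items():
        if rel == table and cols is not None and column in cols:
            # Rebuild the filter; this must happen for each publication
            # whose filter mentions the column. An empty result is left
            # as an empty list, since that case is not settled here.
            result[(pub, rel)] = [c for c in cols if c != column]
        else:
            result[(pub, rel)] = cols
    return result


def drop_filters_using_column(filters, table, column):
    """Option 2: drop the entire filter wherever it mentions the column."""
    result = {}
    for (pub, rel), cols in filters.items():
        if rel == table and cols is not None and column in cols:
            result[(pub, rel)] = None  # table stays published, unfiltered
        else:
            result[(pub, rel)] = cols
    return result
```

Either way, every publication that references the column has to be
visited; the difference is only whether the filter is rewritten or
discarded.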

>>
>> Do we want to consider that the columns specified in the filter must
>> not have a NOT NULL constraint? Otherwise, the subscriber will
>> error out when inserting such rows.
>>
> I think you mean that columns *not* specified in the filter must not have a NOT NULL
> constraint on the subscriber. Otherwise inserts will break, because the subscriber will
> try to insert NULL for the columns not sent by the publisher.
>

Right.
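
As a minimal sketch of that condition (Python for illustration only; the
function name and the shape of its inputs are hypothetical, and it follows
the description above in treating every unsent column as NULL):

```python
def unsatisfiable_not_null_columns(subscriber_columns, filter_columns):
    """List subscriber columns that are NOT NULL but not sent by the publisher.

    subscriber_columns: dict mapping column name to True if the column has
    a NOT NULL constraint on the subscriber.
    filter_columns: column names included in the publication's filter.
    """
    sent = set(filter_columns)
    return sorted(
        name
        for name, not_null in subscriber_columns.items()
        if not_null and name not in sent
    )
```

A non-empty result means an insert of a replicated row would fail on the
subscriber.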

--
With Regards,
Amit Kapila.
