Re: Column Filtering in Logical Replication

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Column Filtering in Logical Replication
Date: 2021-08-12 03:10:00
Message-ID: CAH2L28vtVkXj9cjY-SPFVfaHP0gKVLH77Rd2sz9LUJ=e2Pwchg@mail.gmail.com
Lists: pgsql-hackers

Hi Amit,

Thanks for your review.

> Can you please explain why you have the restriction for including
> replica identity columns and do we want to put a similar restriction
> for the primary key? As far as I understand, if we allow default
> values on subscribers for replica identity, then probably updates,
> deletes won't work as they need to use replica identity (or PK) to
> search the required tuple. If so, shouldn't we add this restriction
> only when a publication has been defined for one of these (Update,
> Delete) actions?
>

Yes, as you mentioned, they are needed for Updates and Deletes to work.
The restriction that replica identity columns must be included in the column
filter exists because, when the replica identity column values have not
changed, the old row's replica identity columns are not sent to the
subscriber; the new row's replica identity columns therefore need to be sent
so the row to be Updated or Deleted can be identified. I haven't tested
whether it would break Insert as well, though. I will update the patch
accordingly.
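
For illustration, here is a minimal sketch of the scenario (table and
publication names are hypothetical, and the column-list syntax is the one
proposed in this patch):

    CREATE TABLE t (id int PRIMARY KEY, payload text, extra text);
    -- id is the replica identity (the primary key by default), so it has
    -- to appear in the column filter for UPDATE/DELETE replication to work:
    CREATE PUBLICATION pub FOR TABLE t (id, payload);
    -- A filter such as (payload, extra) would omit the replica identity,
    -- leaving the subscriber unable to locate the row to update or delete.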

> Another point is what if someone drops the column used in one of the
> publications? Do we want to drop the entire relation from publication
> or just remove the column filter or something else?
>
>
Thanks for pointing this out. Currently, this is not handled in the patch.
I think dropping the column from the filter would make sense, along the same
lines as the table being dropped from the publication when the table itself
is dropped.
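
As a rough sketch of that intended behaviour (names are hypothetical; this
is not yet what the patch does):

    CREATE PUBLICATION pub FOR TABLE t (id, payload);
    ALTER TABLE t DROP COLUMN payload;
    -- Intended outcome: payload is removed from pub's column filter,
    -- analogous to DROP TABLE removing the table from the publication.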

> Do we want to consider that the columns specified in the filter must
> not have NOT NULL constraint? Because, otherwise, the subscriber will
> error out inserting such rows?
>
I think you mean columns *not* specified in the filter must not have a NOT
NULL constraint on the subscriber, as this will break during insert: the
subscriber will try to insert NULL for columns not sent by the publisher.
I will look into fixing this. It probably won't be a problem if the column
is auto-generated or has a default value.
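
A minimal illustration of the problem, with a hypothetical subscriber-side
table definition:

    -- Publisher: only id and payload are in the column filter.
    -- Subscriber-side table:
    CREATE TABLE t (id int PRIMARY KEY, payload text, extra text NOT NULL);
    -- Replicated INSERTs carry no value for extra, so the subscriber would
    -- try to insert NULL and fail the NOT NULL constraint, unless extra
    -- has a DEFAULT or is a generated column.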

> Minor comments:
> ================
> pq_sendbyte(out, flags);
> -
> /* attribute name */
> pq_sendstring(out, NameStr(att->attname));
>
> @@ -953,6 +1000,7 @@ logicalrep_write_attrs(StringInfo out, Relation rel)
>
> /* attribute mode */
> pq_sendint32(out, att->atttypmod);
> +
> }
>
> bms_free(idattrs);
> diff --git a/src/backend/replication/logical/relation.c
> b/src/backend/replication/logical/relation.c
> index c37e2a7e29..d7a7b00841 100644
> --- a/src/backend/replication/logical/relation.c
> +++ b/src/backend/replication/logical/relation.c
> @@ -354,7 +354,6 @@ logicalrep_rel_open(LogicalRepRelId remoteid,
> LOCKMODE lockmode)
>
> attnum = logicalrep_rel_att_by_name(remoterel,
> NameStr(attr->attname));
> -
> entry->attrmap->attnums[i] = attnum;
>
> There are quite a few places in the patch that contains spurious line
> additions or removals.
>
>
Thank you for your comments; I will fix these.

Thank you,
Rahila Syed
