Re: Logical Replication Custom Column Expression

From: Stavros Koureas <koureasstavros(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Logical Replication Custom Column Expression
Date: 2022-11-25 10:43:46
Message-ID: CA+O1jk7P=NXkS9ZNHM14gi4kChb=vUyBp9hPLf-4JUG80cW_iA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Yes, if the property is on the subscription side then it should be applied
for all the tables that the connected publication is exposing.
So if the property is enabled you should be sure that this origin column
exists to all of the tables that the publication is exposing...

Sure this is the complete idea, that the subscriber should match the PK of
origin, <previous_pkey>
As the subscription table will contain same key values from different
origins, for example:

*For publisher1 database **table*
id pk integer | value character varying
1 | testA1
2 | testA2

*For publisher2 database **table*
id pk integer | value character varying
1 | testB1
2 | testB2

*For subscriber database table*
origin *pk *character varying | id *pk *integer | value character varying
publisher1 | 1 | testA1
publisher1 | 2 | testA2
publisher2 | 1 | testB1
publisher2 | 2 | testB2

All statements INSERT, UPDATE, DELETE should always include the predicate
of the origin.

Στις Παρ 25 Νοε 2022 στις 12:21 μ.μ., ο/η Ashutosh Bapat <
ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> έγραψε:

> On Wed, Nov 23, 2022 at 4:54 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > On Wed, Nov 23, 2022 at 7:38 AM Stavros Koureas
> > <koureasstavros(at)gmail(dot)com> wrote:
> > >
> > > Reading more carefully what you described, I think you are interested
> in getting something you call origin from publishers, probably some
> metadata from the publications.
> > >
> > > This identifier in those metadata maybe does not have business value
> on the reporting side. The idea is to use a value which has specific
> meaning to the user at the end.
> > >
> > > For example assigning 1 for tenant 1, 2 for tenant 2 and so one, at
> the end based on a dimension table which holds this mapping the user would
> be able to filter the data. So programmatically the user can set the id
> value of the column plus creating the mapping table from an application
> let’s say and be able to distinguish the data.
> > >
> > > In addition this column should have the ability to be part of the
> primary key on the subscription table in order to not conflict with lines
> from other tenants having the same keys.
> > >
> > >
> >
> > I was wondering if a simpler syntax solution might also work here.
> >
> > Imagine another SUBSCRIPTION parameter that indicates to write the
> > *name* of the subscription to some pre-defined table column:
> > e.g. CREATE SUBSCRIPTION subname FOR PUBLICATION pub_tenant_1
> > CONNECTION '...' WITH (subscription_column);
> >
> > Logical Replication already allows the subscriber table to have extra
> > columns, so you just need to manually create the extra 'subscription'
> > column up-front.
> >
> > Then...
> >
> > ~~
> >
> > On Publisher:
> >
> > test_pub=# CREATE TABLE tab(id int primary key, description varchar);
> > CREATE TABLE
> >
> > test_pub=# INSERT INTO tab VALUES (1,'one'),(2,'two'),(3,'three');
> > INSERT 0 3
> >
> > test_pub=# CREATE PUBLICATION tenant1 FOR ALL TABLES;
> > CREATE PUBLICATION
> >
> > ~~
> >
> > On Subscriber:
> >
> > test_sub=# CREATE TABLE tab(id int, description varchar, subscription
> varchar);
> > CREATE TABLE
> >
> > test_sub=# CREATE SUBSCRIPTION sub_tenant1 CONNECTION 'host=localhost
> > dbname=test_pub' PUBLICATION tenant1 WITH (subscription_column);
> > CREATE SUBSCRIPTION
> >
> > test_sub=# SELECT * FROM tab;
> > id | description | subscription
> > ----+-------------+--------------
> > 1 | one | sub_tenant1
> > 2 | two | sub_tenant1
> > 3 | three | sub_tenant1
> > (3 rows)
> >
> > ~~
> >
> Thanks for the example. This is more concrete than just verbal description.
>
> In this example, do all the tables that a subscription subscribes to
> need that additional column or somehow the pglogical receiver will
> figure out which tables have that column and populate rows
> accordingly?
>
> My further fear is that the subscriber will also need to match the
> subscription column along with the rest of PK so as not to update rows
> from other subscriptions.
> --
> Best Wishes,
> Ashutosh Bapat
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2022-11-25 11:00:00 Re: Support logical replication of DDLs
Previous Message Ashutosh Bapat 2022-11-25 10:34:25 Re: Avoid streaming the transaction which are skipped (in corner cases)