Re: Skipping schema changes in publication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "hametan(at)gmail(dot)com" <hametan(at)gmail(dot)com>
Subject: Re: Skipping schema changes in publication
Date: 2022-06-16 04:04:54
Message-ID: CAA4eK1JHpQ11=Hu3x2mP0wKu_gmoDmd95FgcoOmc9c5OpY2DAg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 14, 2022 at 9:10 AM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wednesday, June 8, 2022 7:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Fri, Jun 3, 2022 at 3:37 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> > > Thanks for the comments, the attached v8 patch has the changes for the
> > same.
> > >
> >
> > AFAICS, the summary of this proposal is that we want to support
> > exclude of certain objects from publication with two kinds of
> > variants. The first variant is to add support to exclude specific
> > tables from ALL TABLES PUBLICATION. Without this feature, users need
> > to manually add all tables for a database even when she wants to avoid
> > only a handful of tables from the database say because they contain
> > sensitive information or are not required. We have seen that other
> > database like MySQL also provides similar feature [1] (See
> > REPLICATE_WILD_IGNORE_TABLE). The proposed syntax for this is as
> > follows:
> >
> > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2;
> > or
> > ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT TABLE t1,t2;
> >
> > This will allow us to publish all the tables in the current database
> > except t1 and t2. Now, I see that pg_dump has a similar option
> > provided by switch --exclude-table but that allows tables matching
> > patterns which is not the case here. I am not sure if we need a
> > similar variant here.
> >
> > Then users will be allowed to reset the publication by:
> > ALTER PUBLICATION pub1 RESET;
> >
> > This will reset the publication to the default state which includes
> > resetting the publication parameters, setting the ALL TABLES flag to
> > false, and dropping the relations and schemas that are associated with
> > the publication. I don't know if we want to go further with allowing
> > to RESET specific parameters and if so which parameters and what would
> > its syntax be?
> >
> > The second variant is to add support to exclude certain columns of a
> > table while publishing a particular table. Currently, users need to
> > list all required columns' names even if they don't want to hide most
> > of the columns in the table (for example Create Publication pub For
> > Table t1 (c1, c2)). Consider user doesn't want to publish the 'salary'
> > or other sensitive information of executives/employees but would like
> > to publish all other columns. I feel in such cases it will be a lot of
> > work for the user especially when the table has many columns. I see
> > that Oracle has a similar feature [2]. I think without this it will be
> > difficult for users to use this feature in some cases. The patch for
> > this is not proposed but I would imagine syntax for it to be something
> > like "Create Publication pub For Table t1 Except (c3)" and similar
> > variants for Alter Publication.
>
> I think the feature to exclude certain columns of a table would be useful.
>
> In some production scenarios, we usually do not want to replicate
> sensitive fields(column) in the table. Although we already can achieve
> this by specify all replicated columns in the list[1], but that seems a
> hard work when the table has hundreds of columns.
>
> [1]
> CREATE TABLE test(a int, b int, c int,..., sensitive text);
> CRAETE PUBLICATION pub FOR TABLE test(a,b,c,...);
>
> In addition, it's not easy to maintain the column list like above. Because
> we sometimes need to add new fields or delete fields due to business
> needs. Every time we add a column(or delete a column in column list), we
> need to update the column list.
>
> If we support Except:
> CRAETE PUBLICATION pub FOR TABLE test EXCEPT (sensitive);
>
> We don't need to update the column list in most cases.
>

Right, this is a valid point and I think it makes sense for me to
support such a feature for column list and also to exclude a
particular table(s) from the ALL TABLES publication.

Peter E., Euler, and others, do you have any objections to supporting
the above-mentioned two cases?

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-06-16 04:23:17 Re: bogus: logical replication rows/cols combinations
Previous Message Michael Paquier 2022-06-16 03:51:03 Re: Extending USING [heap | mytam | yourtam] grammar and behavior