Re: Support logical replication of DDLs

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Runqi Tian <runqidev(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, li jie <ggysxcq(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zheng Li <zhengli10(at)gmail(dot)com>
Subject: Re: Support logical replication of DDLs
Date: 2023-03-27 12:07:46
Message-ID: CAA4eK1++Y7a2SQq55DXT6neghZgj3j+pQ74=8zfT3k8Tkdj0ZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Mar 27, 2023 at 2:52 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
>
> > I suggest taking a couple of steps back from the minutiae of the
> > patch, and spending some hard effort thinking about how the thing
> > would be controlled in a useful fashion (that is, a real design for
> > the filtering that was mentioned at the very outset), and about the
> > security issues, and about how we could get to a committable patch.
> >
>
> Agreed. I'll try to summarize the discussion we have till now on this
> and share my thoughts on the same in a separate email.
>

The idea to control what could be replicated is to introduce a new
publication option 'ddl' along with current options 'publish' and
'publish_via_partition_root'. The values of this new option could be
'table', 'function', 'all', etc. Here 'all' enables the replication of
all supported DDL commands. Example usage for this would be:
Example:
Create a new publication with all ddl replication enabled:
CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all');

Enable table ddl replication for an existing Publication:
ALTER PUBLICATION pub2 SET (ddl = 'table');

This is what seems to have been discussed but I think we can even
extend it to support based on operations/commands, say one would like
to publish only 'create' and 'drop' of tables. Then we can extend the
existing publish option to have values like 'create', 'alter', and
'drop'.

Another thing we are considering related to this is at what level
these additional options should be specified. We have three variants
FOR TABLE, FOR ALL TABLES, and FOR TABLES IN SCHEMA that enables
replication. Now, for the sake of simplicity, this new option is
discussed to be provided only with FOR ALL TABLES variant but I think
we can provide it with other variants with some additional
restrictions like with FOR TABLE, we can only specify 'alter' and
'drop' for publish option. Now, though possible, it brings additional
complexity to support it with variants other than FOR ALL TABLES
because then we need to ensure additional filtering and possible
modification of the content we have to send to downstream. So, we can
even decide to first support it only FOR ALL TABLES variant.

The other point to consider for publish option 'ddl = table' is
whether we need to allow replicating dependent objects like say some
user-defined type is used in the table. I guess the difficulty here
would be to identify which dependents we want to allow.

I think in the first version we should allow to replicate only some of
the objects instead of everything. For example, can we consider only
allowing tables and indexes in the first version? Then extend it in a
phased manner?

AFAICR, we have discussed two things related to security. (a)
ownership of objects created via DDL replication. We have discussed
providing an option at subscription level to allow objects to have the
same ownership (as it has on the publisher) after apply to the
subscriber. If that option is not enabled the objects will be owned by
the subscription owner. (b) Allow use of functions replicated to be
used even if they don't use schema qualify objects. Currently, we
override the search_path in apply worker to an empty string to ensure
that apply worker doesn't execute arbitrary expressions as it works
with the privileges of the subscription owner which would be a
superuser. We have discussed providing a search_path as an option at
the subscription level or a GUC to allow apply workers to use a
specified search_path.

Do you have anything else in mind?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alberto 2023-03-27 21:03:36 C function returning a tuple containing an array of tuples
Previous Message Amit Kapila 2023-03-27 06:37:55 Re: Support logical replication of DDLs

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2023-03-27 12:56:21 Re: MacOS: xsltproc fails with "warning: failed to load external entity"
Previous Message Daniel Gustafsson 2023-03-27 12:06:34 Re: MacOS: xsltproc fails with "warning: failed to load external entity"