Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Euler Taveira <euler(at)eulerto(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Shubham Khanna <khannashubham1197(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.
Date: 2025-03-14 05:26:49
Message-ID: CAA4eK1K+NvDBjkdOdZaSWgokFHUrAL=iyhWW82EnMTX4Kto1Ag@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 14, 2025 at 3:06 AM Euler Taveira <euler(at)eulerto(dot)com> wrote:
>
> On Wed, Mar 12, 2025, at 3:41 AM, Amit Kapila wrote:
>
> > This could be another option, but I feel an option with a tool is more
> > meaningful than allowing users to do afterward steps.
>
>
> I wasn't paying much attention to this discussion. The thread title refers to a
> general option to clean publisher objects which includes non specified objects.
> I was expecting a general solution but it seems to include only publications.
> Why? I envision in the future adding an option to publish only a set of tables.
> Will this proposal remove tables that were not published and its dependent
> objects (data types, functions, ...)?
>

We can add new publications via the tool or once the subscriber is
created. This option and any other such options are meant to remove
objects that are no longer needed from the previous setup of the
standby. One example is the existing publications, which the user may
or may not need; it is difficult to distinguish the publications
copied from the primary before we have a subscriber. Other candidates
could be replication slots or probably even databases (if the user
wants to create the subscriber from a specified set of databases).
The idea to make this solution general is that we provide switches
like the current one for different objects and then a common switch
to remove all pre-existing objects (like
--remove-all-existing-objects). I am not sure a generic switch like
--remove-all-existing-objects is good enough on its own, because users
may want to retain a few pre-existing objects, like subscriptions, so
that the new subscriber continues to get data from other publishers.
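
To illustrate the combination described above (per-object switches plus
a common switch, with the ability to retain some objects), here is a
minimal sketch in C. It is not taken from any patch in this thread: the
enum, the object-type names, and the helper functions are all
hypothetical, and only the idea of a "remove everything" switch comes
from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical object classes that a cleanup switch could cover. */
typedef enum
{
	CLEAN_PUBLICATIONS = 1 << 0,
	CLEAN_REPLICATION_SLOTS = 1 << 1,
	CLEAN_SUBSCRIPTIONS = 1 << 2,
	CLEAN_ALL = (1 << 3) - 1	/* every class above */
} CleanObjectType;

/*
 * Map a (hypothetical) object-type name given on the command line to
 * its bit. Returns 0 for an unknown name so the caller can report it.
 */
static unsigned int
clean_type_from_name(const char *name)
{
	assert(name != NULL);

	if (strcmp(name, "publications") == 0)
		return CLEAN_PUBLICATIONS;
	if (strcmp(name, "replication-slots") == 0)
		return CLEAN_REPLICATION_SLOTS;
	if (strcmp(name, "subscriptions") == 0)
		return CLEAN_SUBSCRIPTIONS;
	if (strcmp(name, "all") == 0)
		return CLEAN_ALL;
	return 0;
}

/*
 * Combine what the user asked to remove with what they asked to
 * retain. Retained classes always win over the common "all" switch.
 */
static unsigned int
resolve_clean_mask(unsigned int remove, unsigned int retain)
{
	return remove & ~retain;
}

/* Should objects of the given class be dropped on the subscriber? */
static bool
should_clean(unsigned int mask, CleanObjectType type)
{
	return (mask & type) != 0;
}
```

With such a mask, a request to remove all objects while retaining
subscriptions resolves to publications and replication slots only,
which is the selective behaviour argued for above.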

>
> When we add the initial schema
> synchronization for logical replication, will this proposal be aligned with it?
>

Can you please explain a bit more what you have in mind here?

> It is a mistake not to explore a general solution because you risk shooting
> yourself in the foot. If we consider that (a) the start point is a standby
> (physical copy) and (b) in most scenarios to setup a logical replica you use
> pg_dump that dumps the publications by default, it seems these additional
> objects will be expected by the user.
>

That is possible, and that is why we are providing this as an option
rather than removing publications or other objects by default.

>
> I'm still concerned about the fact that we are adding one option that is
> specific and have to add one per object as soon as someone has another
> proposal. We need to decide if we want multiple options to clean up objects in
> the future or a general option that will be incrementally adding objects to
> remove. Multiple options are more granular and can avoid breaking backward
> compatibility (if you don't want) but can increase the list of options you need
> to inform if you want to clean multiple object types. A single option is not
> flexible and breaks backward compatibility but it does exactly what you want
> with a few characters. In most scenarios, if you want to have a clean
> subscriber, you will remove *all* possible objects, hence, a single option is
> your choice.
>

I agree that we need a single option, but I feel we need granular
options as well to allow users to retain objects selectively.

--
With Regards,
Amit Kapila.
