Re: Logical Replication - Single Destination Table With Multiple Source Tables - How to Handle Structure Changes

From: Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
To: Avi Weinberg <AviW(at)gilat(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Logical Replication - Single Destination Table With Multiple Source Tables - How to Handle Structure Changes
Date: 2021-07-05 12:02:19
Message-ID: CAM+6J94bhgD8tRv1-OTBFwBZR_F_a71xxTKXj4SdcrdhVSNexw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, 5 Jul 2021 at 14:29, Avi Weinberg <AviW(at)gilat(dot)com> wrote:

> Thanks for the reply,
>
>
>
> My question was, what will happen if I have one destination table which
> gets data from many source tables. What is the best way to handle changes
> in the structure of SOME of the source tables, while other source tables
> remain in the old format.
>
> Maybe in some cases where the type was changed it may be able to work with
> source tables of different types, but what if column was renamed in one
> source table but the column remained with the old name in the other source
> table? What column name should the destination table have? Do I need to
> duplicate the column to have both old and new names?
>

ADD NEW column: [1]
add a new column to the destination first and then to the source(s).
the WALs would be retained by the publisher till it can start publishing
again, so no data loss.

ALTER COL: [2]
imho, add a new column of the to be modified datatype to both destination
and source as above in [1]. then write a trigger for source tables to sync
the columns which need to have the type changed to the new column added in
the same source tables. let those changes get published to destination.
(for ex. check the int to bigint migration in PG, like
int_to_bigint_migration
<https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/postgres-tips-how-to-convert-2-billion-rows-to-bigint-with-citus/ba-p/1490128>
and then drop the old column at source and destination.

DROP COL [3]:
dropping-columns
<https://docs.gitlab.com/ee/development/avoiding_downtime_in_migrations.html#dropping-columns>
if there are apps making use of that column, first you would have to
ensure, the column is no longer in use in queries. then you can start by
dropping the col at source(s) and then the destination.

I have a setup that I have used with three source dbs and 1 dest db for
above. but things can get more complicated with FK constraints etc, so i am
not sure i have the best answer to this as i have not done it in production.
I have only used LR for migration and was thinking of (federated setup /to
unshard) where many shards -> LR -> one shard and when sharding was a bad
decision, but in both cases did not allow DDL changes at source till it was
completed.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Emiliano Saenz 2021-07-05 23:22:39 The Curious Case of the Table-Locking UPDATE Query
Previous Message Atul Kumar 2021-07-05 11:18:04 Re: number of wal file is not reduced.