Re: data transformation and replication

From: "Armand Pirvu (home)" <armand(dot)pirvu(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: data transformation and replication
Date: 2017-05-09 03:31:12
Message-ID: E78E286A-40E6-4B6D-82FD-75595F8E7579@gmail.com
Lists: pgsql-general

My bad

On db1 I have two tables, t1 and t2 (or more).
db2 has one table, t3 for example, which gets data aggregated from one or more tables of the above set. I can have updates/inserts/deletes in db1.t1 and/or db1.t2 which, combined, may mean that related data in db2.t3 needs to be inserted/deleted/updated. Think of it like ETL processing, if you will. This is what I mean by data massaging/transformation.
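
To make the kind of transformation concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for PostgreSQL so that it runs anywhere. The column layouts (name, amount, total), the trigger names and the helper functions are all hypothetical and not the real schema; the only point is that changes to t1/t2 are folded into an aggregated t3 by triggers.

```python
import sqlite3

# Hypothetical layouts: t1 holds entities, t2 holds amounts per entity,
# t3 is the aggregated/reporting shape (one row per t1 entity).
SCHEMA = """
CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE t2 (id INTEGER PRIMARY KEY, t1_id INTEGER, amount INTEGER);
CREATE TABLE t3 (t1_id INTEGER PRIMARY KEY, name TEXT, total INTEGER);

-- Changes to t1 create, rename or remove the aggregate row in t3.
CREATE TRIGGER t1_ins AFTER INSERT ON t1 BEGIN
    INSERT INTO t3 (t1_id, name, total) VALUES (NEW.id, NEW.name, 0);
END;
CREATE TRIGGER t1_upd AFTER UPDATE ON t1 BEGIN
    UPDATE t3 SET name = NEW.name WHERE t1_id = NEW.id;
END;
CREATE TRIGGER t1_del AFTER DELETE ON t1 BEGIN
    DELETE FROM t3 WHERE t1_id = OLD.id;
END;

-- Changes to t2 adjust the running total kept in t3.
CREATE TRIGGER t2_ins AFTER INSERT ON t2 BEGIN
    UPDATE t3 SET total = total + NEW.amount WHERE t1_id = NEW.t1_id;
END;
CREATE TRIGGER t2_upd AFTER UPDATE ON t2 BEGIN
    UPDATE t3 SET total = total - OLD.amount WHERE t1_id = OLD.t1_id;
    UPDATE t3 SET total = total + NEW.amount WHERE t1_id = NEW.t1_id;
END;
CREATE TRIGGER t2_del AFTER DELETE ON t2 BEGIN
    UPDATE t3 SET total = total - OLD.amount WHERE t1_id = OLD.t1_id;
END;
"""


def build_demo():
    """Create an in-memory database with t1, t2, t3 and the triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def t3_rows(conn):
    """Return the current contents of the aggregated table t3."""
    return conn.execute(
        "SELECT t1_id, name, total FROM t3 ORDER BY t1_id"
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo()
    conn.execute("INSERT INTO t1 (id, name) VALUES (1, 'alice')")
    conn.execute("INSERT INTO t2 (id, t1_id, amount) VALUES (10, 1, 5)")
    conn.execute("INSERT INTO t2 (id, t1_id, amount) VALUES (11, 1, 7)")
    print(t3_rows(conn))
```

In PostgreSQL the same idea would be written as PL/pgSQL trigger functions, and the real rules for combining t1 and t2 are of course more involved than a running total.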

db1 and db2 are two different servers.

So I was initially thinking that I could have on db2 the same set of tables as on db1, with replication done using pglogical. Once data gets to db2.t1 and db2.t2, I could have on db2 a set of functions/triggers which transform the data and do the relevant inserts/updates/deletes on db2.t3.

Apparently, though, that is not possible, unless I am missing something.

I reached that conclusion by using a trigger and a function, like the usual auditing one, to track inserts/updates/deletes in an audit table.

That said, I was thinking of two options:

(a) -
On db1 I will have the t3 table as it is on db2. All data transformation goes into db1.t3, which in turn replicates to db2.t3 using pglogical.

(b) -
On db2 I will have t1 and t2 as they are on db1. Those are replicated using Slony/Bucardo. Once data lands in db2.t1 and db2.t2, another set of triggers/functions responsible for the data transformation will do the inserts/deletes/updates on db2.t3.

I would much prefer the pglogical approach described above, the one I see as the failed case.

If the only option is Slony/Bucardo, so be it, but that raises the following questions:
- which one has the smallest overhead?
- which one is the easiest to manage?
- which one is the most reliable?
- I recall that data transformation can be done in Bucardo, but I did not see any examples of that. Any pointers?

Thanks
Armand

On May 8, 2017, at 4:49 PM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:

> On 05/08/2017 12:46 PM, Armand Pirvu (home) wrote:
>> Hi
>>
>> Here is a scenario I am faced with, and I am hoping to find a pointer/tip/help
>>
>> db1 is the OLTP system
>> db2 is the Reporting system
>>
>> The data from db1 needs to get to db2, but the tables in those two databases have different layouts/structures, and hence the data will need to undergo some transformation in between, in real time
>>
>> I was looking at something like
>>
>> db1 -> db2 replicates the same set of tables, with the same structures, using pglogical for example
>> db2.tbl1 -> db2.tbl2 data gets massaged/transformed based on what replicates from db1.tbl1, using triggers and functions
>>
>>
>> Other than that, I reckon db1 -> db2 would be trigger based, using something like slonik maybe (?), and the data massage/transformation gets moved from the db2 machine to the db1 machine, and then db1.tbl2 -> db2.tbl2 using pglogical
>
> I was following you until the last part, "... moved from db2 to db1 machine and then db1.tbl2 -> db2.tbl2 ..."
>
> Is this correct?
>
> If so why db1 --> db2 --> db1 --> db2?
>
> A complete answer is going to depend on at least an outline of what you mean by massage/transform.
>
>>
>>
>> Is this doable ? If so any pointers as to where to look about it ?
>>
>>
>> Many thanks
>> Armand
>>
>>
>>
>>
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
