Re: data transformation and replication

From: "Armand Pirvu (home)" <armand(dot)pirvu(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: data transformation and replication
Date: 2017-05-09 14:56:59
Message-ID: 5203EE14-E80C-4397-AB47-08341BB7AFC5@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

9.5 both

But the enable always trigger I missed that

Once that set it runs

Thank you for your help

Armand

On May 9, 2017, at 8:26 AM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:

> On 05/08/2017 08:31 PM, Armand Pirvu (home) wrote:
>> My bad
>> db1 I have two tables t1 and t2 (or more)
>> db2 has one table t3 for example which can get data aggregated from one or more multiple tables from the above set . I can updates/inserts/deletes in db1.t1 and/or db1.t2 which combined may mean related data in db.t3 would need to be inserted/deleted/updated. Think of it like ETL processing if you will. This is what I mean by data massaging/transformation
>> db1 and db2 are two different servers.
>
> What are the Postgres versions?
>
>> So I was initially thinking that I can have on db2 the same set of tables from db1, replication being done using pglogical. Once data gets to db2 t1 and t2, I can have on db2 a set of functions/triggers which can transform the data and as such do the relevant inserts/updates/delete from db2.t3
>> Apparently though that is not possible unless I am missing something
>
> Probably this:
>
> https://www.2ndquadrant.com/en/resources/pglogical/pglogical-docs/
>
> "4.11 Triggers
>
> Apply process and the initial COPY process both run with session_replication_role set to replica which means that ENABLE REPLICA and ENABLE ALWAYS triggers will be fired."
>
> https://www.postgresql.org/docs/9.6/static/sql-altertable.html
>
> "DISABLE/ENABLE [ REPLICA | ALWAYS ] TRIGGER
>
> ... The trigger firing mechanism is also affected by the configuration variable session_replication_role. Simply enabled triggers will fire when the replication role is "origin" (the default) or "local". Triggers configured as ENABLE REPLICA will only fire if the session is in "replica" mode, and triggers configured as ENABLE ALWAYS will fire regardless of the current replication mode.
> ...
> "
>
> So did you ENABLE REPLICA or ALWAYS on the db2 table triggers?
>
>
>> I reached that conclusion by using a trigger and a function like the auditing one to track insers/updates/deletes in an audit table
>> Having these said I was thinking
>> (a) -
>> On db1 I will have the t3 table as is on dsb2. All data transformation goes into db1.t3 which on it's turn will replicate to db2.t3 using pglogical
>> (b) -
>> On db2 I will have the t1 t2 as they are on db1. Those are replicated using Slony/Bucardo. Once data lands on db2.t1 and db2.t2 another set of triggers/functions responsible for data transformation will do the inserts/deletes/updates in db2.t3
>> I wold much prefer pglogical approach as stated in the what I see as a failed case
>> If the only options is Slony/Bucardo , so be it. but that begs the following questions
>> - which one has the smallest overhead ?
>> - which one is the easiest to manage ?
>> - which one is the most reliable ?
>> - I recall data transformation can be used in Bucardo but did not see any examples on that. Any pointers ?
>> Thanks
>> Armand
>> On May 8, 2017, at 4:49 PM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
>>> On 05/08/2017 12:46 PM, Armand Pirvu (home) wrote:
>>>> Hi
>>>>
>>>> Here it is a scenario which I am faced with and I am hoping to find a pointer/tip/help
>>>>
>>>> db1 is the OLTP system
>>>> db2 is the Reporting system
>>>>
>>>> The data from db1 needs to get to db2, but the database on those two have tables with different layout/structure and hence data will need to suffer some transformation in between in real time
>>>>
>>>> I was looking at something like
>>>>
>>>> db1 -> db2 replicates the same set of tables and with the same structures using pglogical for example
>>>> db2.tbl1 -> db2.tbl2 data gets massages/transformed based on what replicates from db1.tbl1 using triggers and functions
>>>>
>>>>
>>>> Other than that I reckon db1 -> db2 would be trigger based using something like slonik maybe (?) and data massage/transformation gets moved from db2 to db1 machine and then db1.tbl2 -> db2.tbl2 using pglogical
>>>
>>> I was following you until the last part, "... moved from db2 to db1 machine and then db1.tbl2 -> db2.tbl2 ..."
>>>
>>> Is this correct?
>>>
>>> If so why db1 --> db2 --> db1 --> db2?
>>>
>>> A complete answer is going to depend on at least an outline of what you mean by massage/transform?
>>>
>>>>
>>>>
>>>> Is this doable ? If so any pointers as to where to look about it ?
>>>>
>>>>
>>>> Many thanks
>>>> Armand
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Adrian Klaver
>>> adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Steve Crawford 2017-05-09 17:19:39 Re: Python versus Other Languages using PostgreSQL
Previous Message George Neuner 2017-05-09 14:54:08 Re: Python versus Other Languages using PostgreSQL