Re: Postgres and data warehouses

From: Jerry Sievers <gsievers19(at)comcast(dot)net>
To: Nigel Gardiner <nigelgardiner(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Postgres and data warehouses
Date: 2015-03-09 17:23:01
Message-ID: 86twxuytje.fsf@jerry.enova.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Nigel Gardiner <nigelgardiner(at)gmail(dot)com> writes:

> I'm looking at making a data warehouse to address our rapidly spiralling report query times against the OLTP. I'm looking first at what it would take to make this a
> real-time data warehouse, as opposed to batch-driven.
>
> One approach I've seen used to achieve real time data warehousing is to have middleware that is intercepting all database writes and echoing them to a program that
> rolls up the data warehouse values and updates the facts, dimensions and so on on the fly. Another approach was to use triggers on the tables of interest to write to
> tables to journal the changes, which then get processed by a batch job to achieve the same thing.
>
> One of the problems of the trigger on the transactional database
> approach is that if there is ever a problem with the trigger, the main
> transaction is affected. I'm not sure if that is avoidable with proper
> exception handling in the trigger code? It does mean a lot of trigger
> code to maintain, and another schema to maintain (the journalled
> changes), so there were several drawbacks.

Firing a trigger on INS, UPD, DEL that simply loads a a journal table
with PK value and the event type is so utterly trivial as to be a
non-issue anywhere but the most high impact environments.

> I've had a quick search and haven't seen this approach used yet, but I was thinking, the asynchronous replication of Postgres databases could be used as a streaming
> journal of changes to be processed by a data warehouse. The other approach that suggests itself is WAL file shipping. I've not dug into the async rep protocol yet,
> before I do so I just wanted to get some brief feedback on whether I'm on the wrong track or not, and if there's some better approach I should be looking at first
> instead.

Consider if new Logical Change Set features of 9.4 might apply to your
case. May accomplish same as trigger based solution without all the
extra supporting structures hitherto necessary.

> Any feedback much appreciated.
>
> Regards,
> Nigel
>

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres(dot)consulting(at)comcast(dot)net
p: 312.241.7800

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Kevin Grittner 2015-03-09 17:24:59 Re: VACUUM FULL doesn't reduce table size
Previous Message Bruce Momjian 2015-03-09 16:55:13 Re: pg_upgrade failing from 9.3 to 9.4 because "template0" already exists