Postgres and data warehouses

From: Nigel Gardiner <nigelgardiner(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Postgres and data warehouses
Date: 2015-03-08 14:40:05
Message-ID: CAAZQc97ctiRB9kvwSkmipKaCX57P17=T+htEi5fEw_EBDGD=Jg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I'm looking at making a data warehouse to address our rapidly spiralling
report query times against the OLTP. I'm looking first at what it would
take to make this a real-time data warehouse, as opposed to batch-driven.

One approach I've seen used to achieve real time data warehousing is to
have middleware that is intercepting all database writes and echoing them
to a program that rolls up the data warehouse values and updates the facts,
dimensions and so on on the fly. Another approach was to use triggers on
the tables of interest to write to tables to journal the changes, which
then get processed by a batch job to achieve the same thing.

One of the problems of the trigger on the transactional database approach
is that if there is ever a problem with the trigger, the main transaction
is affected. I'm not sure if that is avoidable with proper exception
handling in the trigger code? It does mean a lot of trigger code to
maintain, and another schema to maintain (the journalled changes), so there
were several drawbacks.

I've had a quick search and haven't seen this approach used yet, but I was
thinking, the asynchronous replication of Postgres databases could be used
as a streaming journal of changes to be processed by a data warehouse. The
other approach that suggests itself is WAL file shipping. I've not dug into
the async rep protocol yet, before I do so I just wanted to get some brief
feedback on whether I'm on the wrong track or not, and if there's some
better approach I should be looking at first instead.

Any feedback much appreciated.

Regards,
Nigel

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Stephen Frost 2015-03-08 15:20:16 Re: Postgres and data warehouses
Previous Message Patrick Dung 2015-03-08 05:33:11 Re: Find similar records (compare tsvectors)