Quick Links

Re: Strategy for materialisation and centralisation of data

From:	Michael Stephenson <domehead100(at)gmail(dot)com>
To:
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Strategy for materialisation and centralisation of data
Date:	2020-05-21 21:06:48
Message-ID:	CAHJZW0=zDHUhrKGAPYE-ekRT9FRmHfsASSnpHev+cC-5K9=bbQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

You might find Materialize interesting:

https://materialize.io/

https://youtu.be/zWSdkGq1XWk

On Thu, May 21, 2020 at 10:36 AM Rory Campbell-Lange <
rory(at)campbell-lange(dot)net> wrote:

> We have quite a few databases of type a and many of type b in a cluster.
> Both a and b types are fairly complex and are different solutions to a
> similar problem domain. All the databases are very read-centric, and all
> database interaction is currently through plpgsql with no materialised
> data.
>
> Some organisations have several type a and many type b databases, and
> need to query these in a homogeneous manner. We presently do this with
> many middleware requests or pl/proxy. An a or b type database belongs to
> 0 or 1 organisations.
>
> Making a and b generally the same would be a very big project.
> Consequently I'm discussing materialising a subset of data in a common
> format between the two database types and shipping that data to
> organisation databases. This would have the benefit of providing a
> common data interface and speeding up queries for all database types.
> Users would have faster queries, and it would be a big time saver for
> our development team, who presently have to deal with three quite
> different data APIs.
>
> Presently I've been thinking of using triggers or materialized views in
> each database to materialise data into a "matview" schema which is then
> shipped via logical replication to an organisation database when
> required. New columns in the matview schema tables would ensure replica
> identity uniqueness and allow the data to be safely stored in common
> tables in the organisation database.
>
> A few issues I foresee with this approach include:
>
> * requiring two to three times current storage for materialisation
> (the cluster is currently ~250GB)
>
> * having to have many logical replication slots
> (we sometimes suffer from pl/proxy connection storms)
>
> Commentary gratefully received,
> Rory
>
>
>
>
>
>

In response to

Strategy for materialisation and centralisation of data at 2020-05-21 14:35:59 from Rory Campbell-Lange

Responses

Re: Strategy for materialisation and centralisation of data at 2020-05-21 21:19:21 from Rory Campbell-Lange

Browse pgsql-general by date

	From	Date	Subject
Next Message	Rory Campbell-Lange	2020-05-21 21:19:21	Re: Strategy for materialisation and centralisation of data
Previous Message	Zahir Lalani	2020-05-21 20:58:32	RE: Should I use JSON?