journaling / time travel

From: Willy-Bas Loos <willybas(at)gmail(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: journaling / time travel
Date: 2016-09-19 15:48:20
Message-ID: CAHnozTgnB1_b0zEDw+rFx70CP_HGkskHyVToSwA1NqOxwbw+rA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

In a joint effort with some parties, we have created a large database (1.1
TB) of records that change only occasionally. Mainly, more of them are
added to the database in a slow rate (1.1 TB built up in 10 years).
The records can have some significance in legal cases. Since records can be
changed afterwards, it has been argued that we should have "journaling",
meaning that every change to the data is saved in a separate schema that
holds a "journaling" copy of each table in the datamodel, including
lookups. So theoretically it is possible to go back to the data at one
exact point in time. This is stored in the same database.
By now, this journaling schema takes up 624GB.
I once looked into a contrib module called time travel that does something
similar, hence the title.

The use case of legal disputes being fought with our data as evidence and
digging up the exact data from a certain point of time never occurred in
those 10 years, and it is unlikely that it ever will.
But it might, if anyone could reasonably expect this to be possible.

Also, it has been argued that this journal is a good thing, because it
could save us in case of a unforeseen creeping corruption of data or some
other catastrophe that isn't covered by our backup system. Someone
described this as a good practice, resulting in the fact that we still do
this.

Now, i don't like this blown up journal that we don't ever use, so i would
love to lose it.
But indeed we should do anything we reasonably can to protect that data.

My question to you all is:
* Is the legal thing actualy something one could expect of us?
* Is the security thing really a good practice?
* Is this a common use case that is normally solved with standard
components?

One thing: PITR would be a candidate for a standard component, but it takes
up much more space. We could double the retention time if the journal were
to be axed, but that would probably not be enough.

Cheers,

--
Willy-Bas Loos

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Willy-Bas Loos 2016-09-19 16:04:25 Re: journaling / time travel
Previous Message Karsten Hilbert 2016-09-19 13:27:22 Re: Index scan is not working