Re: Creating complex track changes database - challenge!

From: Manuel Gómez <targen(at)gmail(dot)com>
To: Łukasz Jarych <jaryszek(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Creating complex track changes database - challenge!
Date: 2018-02-26 11:04:11
Message-ID: CAJWnFaMMpJT-way0D6o7NJtimrff1soSJxp2PigMgjVkqWOXdA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Feb 26, 2018 at 11:44 AM Łukasz Jarych <jaryszek(at)gmail(dot)com> wrote:

> I have to:
>
> 1. Keep all changes within table including:
> -adding rows
> -deleting
> -editing
>
> 2. Save table with specific state and recover specific state (so go back
> to previous table versions) including comparing tables.
>
> 3. Track all DLL and DML changes with possibility to ho back to previous
> version.
>

Hi,

I had similar needs long ago, so I wrote this tool I called Squealer, which
would transform a specification of a SQL database schema into some
PostgreSQL DDL to create a database that implements the same schema in a
logical sense, but actually stores all historical rows, and even permits
existing rows to have references to (soft-)deleted rows, all while
providing modifiable views that simulate the behavior of a regular table as
specified in the input schema through generous use of INSTEAD OF triggers.
It works somewhat like having version control for your database.

You may find the source code here: https://github.com/mgomezch/squealer
Unfortunately, it has hardly any comments, it is completely unmaintained
and probably unused anywhere, I have no idea whether it builds with today's
libraries, and it does not necessarily break the tradeoffs in this space in
a way that fits your use case.

Note there are major caveats with keeping all historical data around
forever, and the semantics of querying historical data can get complicated,
let alone having current data refer to deleted, historical data. I built
this for a very specific use case where this was the right design, but
please consider very carefully whether this is what you want.

Storing your database history forever would take a lot of space. Consider
whether you can instead keep a record of changes stored outside the
database in some cheap cold storage. Also consider just keeping a set of
tables with dynamically structured event records (e.g. JSON fields)
partitioned by time ranges and retained only temporarily, perhaps even in a
separate database. Any such solution will have significant cost and
performance impact if your database bears a nontrivial load, so be careful.

You could also just place your database on a PostgreSQL cluster by itself
and then keep all WAL segments archived forever, so you could just do
point-in-time recovery to any point in the history of your database. The
space required would grow very quickly, though, so if you don't really need
the full history forever, but only a fixed retention period, you can surely
use any of the well-known solutions for PostgreSQL backups that allow for
this through WAL archiving and periodic basebackups: e.g. WAL-E, WAL-G,
pgBackRest…

If all you need is saving and restoring specific table states, logical
dumps with pg_dump should probably be enough for your needs.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Thomas Kellerer 2018-02-26 11:09:19 Re: Creating complex track changes database - challenge!
Previous Message Łukasz Jarych 2018-02-26 10:44:13 Creating complex track changes database - challenge!