Data version idea (please discuss)

From: webb <wwsprague(at)ucdavis(dot)edu>
To: pgsql-general(at)postgresql(dot)org
Subject: Data version idea (please discuss)
Date: 2004-08-02 23:09:16
Message-ID: cemhit$c60$1@sea.gmane.org
Lists: pgsql-general

(I did a *little* bit of searching on this, but I am not even sure what
keywords to use, so forgive me if I should just RTFM...)

I am interested in using PG for large datasets, like census records,
insurance claims, mortality occurrences, etc. The updates and
inserts would (I think) be batch oriented; we clean a bunch of records,
convert to a nice text file, do a big insert, repeat.

What I am curious about is versioning the data that goes into this
database using something that I want to call a "checkpoint". The
"use-case" would be that you do an insert of something like 100 records,
updating secondary tables as necessary, check it to the best of your
ability, then run Function_1. This function increments the version
number and returns it, storing whatever is necessary so that Function_2
can reset the database to any of the version numbers returned by
Function_1. I guess there would be a Function_3 that would add tables
to the checkpointing system.
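
To make the intended contract concrete, here is a toy in-memory sketch in Python, not PostgreSQL. The class and method names are made up for illustration; `checkpoint`, `reset_to`, and `add_table` stand in for Function_1, Function_2, and Function_3. It copies whole tables at each checkpoint, which is only an assumption to keep the example short and would not scale to large datasets.

```python
import copy


class CheckpointStore:
    """Toy stand-in for the three functions; all names are hypothetical."""

    def __init__(self):
        self.tables = {}      # table name -> list of row dicts (live data)
        self.tracked = set()  # tables that take part in checkpoints
        self.snapshots = {}   # version number -> copy of the tracked tables
        self.version = 0

    def add_table(self, name):
        """Function_3: put a table under checkpoint control."""
        self.tables.setdefault(name, [])
        self.tracked.add(name)

    def checkpoint(self):
        """Function_1: increment the version, record state, return the version."""
        self.version += 1
        self.snapshots[self.version] = {
            name: copy.deepcopy(self.tables[name]) for name in self.tracked
        }
        return self.version

    def reset_to(self, version):
        """Function_2: restore tracked tables to a previously returned version."""
        if version not in self.snapshots:
            raise KeyError(f"unknown version {version}")
        for name, rows in self.snapshots[version].items():
            self.tables[name] = copy.deepcopy(rows)


store = CheckpointStore()
store.add_table("claims")
store.tables["claims"].append({"id": 1, "amount": 100})
v1 = store.checkpoint()
store.tables["claims"].append({"id": 2, "amount": 250})
v2 = store.checkpoint()
store.reset_to(v1)  # "claims" holds only the first row again
```

In this sketch a table added after a given checkpoint is left untouched when resetting to that checkpoint, and version numbers keep increasing after a reset; both are arbitrary choices, not requirements.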

Questions:

0. Does that make sense?

1. Is there some literature on this, so I don't have to keep bothering
the list with beginner's questions?

2. Has somebody done the work already? I would think it would be
possible, using rules on all insert/delete/update, plus storing a few
other pieces of information with each row, plus maybe a table or two to
keep version information. You would never actually delete from a
versioned table, just change what the current view shows, and updates
would work the same way. I don't think you would ever actually have to
write C, but you might have
to write some dynamic SQL so that you can iterate over lists of tables.

3. If the work has *not* been done, would it help anybody else to have
me do it? If so, please give feedback.
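
The approach in question 2 (a couple of extra columns per row, a small version table, and no physical deletes) might look roughly like the sketch below. It uses SQLite through Python's `sqlite3` module purely so the example is self-contained; it is not PostgreSQL, and plain Python functions stand in for the rules on insert/update/delete. The table name `claims_history` and the columns `valid_from` and `valid_to` are assumptions made up for illustration.

```python
import sqlite3


def open_store():
    """Create an in-memory database with one versioned table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE version_counter (current INTEGER NOT NULL);
        INSERT INTO version_counter VALUES (0);
        CREATE TABLE claims_history (
            claim_id   INTEGER NOT NULL,
            amount     INTEGER NOT NULL,
            valid_from INTEGER NOT NULL,  -- first version that sees this row
            valid_to   INTEGER            -- first version that no longer sees it
        );
    """)
    return db


def current_version(db):
    return db.execute("SELECT current FROM version_counter").fetchone()[0]


def put(db, claim_id, amount):
    """Insert or update: close the live row, if any, then add a new row."""
    pending = current_version(db) + 1
    db.execute(
        "UPDATE claims_history SET valid_to = ? "
        "WHERE claim_id = ? AND valid_to IS NULL",
        (pending, claim_id),
    )
    db.execute(
        "INSERT INTO claims_history VALUES (?, ?, ?, NULL)",
        (claim_id, amount, pending),
    )


def remove(db, claim_id):
    """Delete: only close the live row; nothing is physically removed."""
    pending = current_version(db) + 1
    db.execute(
        "UPDATE claims_history SET valid_to = ? "
        "WHERE claim_id = ? AND valid_to IS NULL",
        (pending, claim_id),
    )


def checkpoint(db):
    """Seal the pending changes as a new version and return its number."""
    db.execute("UPDATE version_counter SET current = current + 1")
    return current_version(db)


def as_of(db, version):
    """The 'view' of the table as it stood at a given version."""
    return db.execute(
        "SELECT claim_id, amount FROM claims_history "
        "WHERE valid_from <= ? AND (valid_to IS NULL OR valid_to > ?) "
        "ORDER BY claim_id",
        (version, version),
    ).fetchall()
```

For example, after `put(db, 1, 100)`, `checkpoint(db)`, `put(db, 1, 150)`, `checkpoint(db)`, the call `as_of(db, 1)` still sees the amount 100 while `as_of(db, 2)` sees 150. Resetting to an older version is left out here; it could be done by discarding or ignoring rows whose `valid_from` lies beyond that version.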

This seems related to replication, but I haven't looked into that yet.
I am fairly bright, but I have only a cursory background in the
theory behind transactions and concurrency.

Thanks to all.
