Re: table versioning approach (not auditing)

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, Adam Brusselback <adambrusselback(at)gmail(dot)com>, Felix Kunde <felix-kunde(at)gmx(dot)de>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: table versioning approach (not auditing)
Date: 2014-10-08 03:40:59
Message-ID: 5434B24B.6020401@archidevsys.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 08/10/14 13:29, Jim Nasby wrote:
> On 10/6/14, 6:10 PM, Gavin Flower wrote:
>> Even if timestamps are used extensively, you'd have to be careful
>> joining on them. You may have information valid at T1 and changing at
>> T3, but the transaction has T2, where T1 < T2 < T3 - the appropriate
>> set of data would be associated with T1, would would not get anywhere
>> trying to find data with a timestamp of T2 (unless you were very
>> lucky!).
>
> Yeah, this is why I think timestamps need to be shunned in favor of
> explicit pointers. Anyone that thinks timestamps are good enough
> hasn't thought the problem through completely. :)
>
> I also think there's potential value to storing full transaction
> information (presumably in a separate table): txid_current(),
> txid_current_snapshot(), now(), current_user, maybe some other stuff
> (client IP address?). That way you can tell exactly what created a
> history record. With appropriate shenanigans you can theoretically
> determine exactly what other history data would be visible at that
> time without using pointers (but man would that bu ugly!)
>
>> Actually things like phone numbers are tricky. Sometimes you may
>> want to use the current phone number, and not the one extant at that
>> time (as you want to phone the contact now), or you may still want
>> the old phone number (was the call to a specific number at date/time
>> legitimate & who do we charge the cost of the call too).
>
> Yeah, I'm pretty convinced at this point that history/versioning
> should be built on top of a schema that always contains the current
> information, if for no other reason than so you always have a PK that
> points to what's current in addition to your history PKs.
One of the motivations for having an effective_date, was being able to
put changes into the database ahead of time.

Finding the current value uses the same logic a find the value at any
other date/time - so you don't need a special schema to distinguish the
current state from anything else. For example:

DROP TABLE IF EXISTS stock;

CREATE TABLE stock
(
id text,
effective_date timestamptz,
price numeric
);

INSERT INTO stock
(
id,
effective_date,
price
)
VALUES
('y88', '2014-10-01', 12.0),
('x42', '2014-10-01', 12.1),
('x42', '2014-10-08', 12.2),
('x42', '2014-10-10', 12.3),
('x42', '2014-10-16', 12.4),
('z42', '2014-10-19', 12.5),
('z49', '2014-10-01', 12.6),
('z49', '2014-10-30', 12.7),
('z77', '2014-10-01', 12.8);

CREATE UNIQUE INDEX primary_key ON stock (id ASC, effective_date DESC);

SELECT
s.price
FROM
stock s
WHERE
s.id = 'x42'
AND s.effective_date <= '2014-10-11'
ORDER BY
s.effective_date DESC
LIMIT 1;

Cheers,
Gavin

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Stephen Davies 2014-10-08 03:51:51 psql connection issue
Previous Message Jim Nasby 2014-10-08 00:33:22 Re: Converting char to varchar automatically