From: | Ralph Graulich <maillist(at)shauny(dot)de> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | SQL logic for version history including table |
Date: | 2002-08-03 09:43:41 |
Message-ID: | Pine.LNX.4.21.0208031126310.1995-100000@shauny.shauny.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi,
I've got a table which includes complete versioning and history for each
entry. The fields in question are:
- createdate (date of creation of this row)
- vno (version number, increased by 1 for each new version)
- active ('Y' for newest version, 'N' for all older ones, simply set to
'N' for all versions, if a row is deleted)
- dam_id (key value, but not unique, same value for each
version of an entry)
I want to select the following data:
(1) All the changes, which were made since a specific date in the past,
ordered by their date of creation descending and a name field in ascending
way.
(2) As I want to set a special mark in the output if that was a new entry
or just the change of an old existing entry, I need to check the version
number for each date of creation.
(3) As there can be multiple versions on one day, I need to check for the
lowest version number on each date of creation to see, wether it really
was a change or a new entry.
(4) Finally I need to exclude all the entries, if they don't have a valid
entry on the current day any more, which means they were deleted after the
date the last entry was made -> short: only entrys which have a active='Y'
entry are allowed to be selected.
Thought quite a while about it and came to the following conclusion:
SELECT d.field1,
d.field2, ...
d.createdate,
(SELECT d4.createdate
FROM dam d4
WHERE d.dam_id = d4.dam_id
ORDER BY createdate ASC LIMIT 1) AS firstdate
FROM dam d
WHERE createdate >= '1999-06-01'
AND vno=(SELECT d2.vno
FROM dam d2
WHERE d2.dam_id=d.dam_id
AND d2.createdate=d.createdate
ORDER BY vno ASC LIMIT 1)
AND EXISTS
(SELECT *
FROM dam d3
WHERE d3.dam_id=d.dam_id
AND d3.active=d.active
AND d.active='Y')
ORDER BY
createdate DESC,
dam ASC
LIMIT 200;
The explain plan looks like:
NOTICE: QUERY PLAN:
Limit (cost=137625.77..137625.77 rows=200 width=73)
-> Sort (cost=137625.77..137625.77 rows=395 width=73)
-> Seq Scan on dam d (cost=0.00..137608.74 rows=395 width=73)
SubPlan
-> Limit (cost=7.24..7.24 rows=1 width=4)
-> Sort (cost=7.24..7.24 rows=1 width=4)
-> Index Scan using ix_dam_dam_id on dam d4
(cost=0.00..7.23 rows=1 width=4)
-> Limit (cost=7.24..7.24 rows=1 width=4)
-> Sort (cost=7.24..7.24 rows=1 width=4)
-> Index Scan using ix_dam_dam_id on dam d2
(cost=0.00..7.23 rows=1 width=4)
-> Result (cost=0.00..7.23 rows=1 width=287)
-> Index Scan using ix_dam_dam_id on dam d3
(cost=0.00..7.23 rows=1 width=287)
Maybe someone wants to discuss about that approach with me or wants to
point out possible errors. All opinions are welcome.
Kind regards
... Ralph ...
From | Date | Subject | |
---|---|---|---|
Next Message | Cédric Dufour | 2002-08-03 09:50:50 | Re: b1 OR b2 <-> ( CASE WHEN b1 THE true ELSE b2 END ): performance bottleneck on logical OR |
Previous Message | HT&T | 2002-08-03 08:55:16 | Re: MySQL or Postgres ? |