Re: indexing of hierarchical data

From: "Josh Berkus" <josh(at)agliodbs(dot)com>
To: Dado Feigenblatt <dado(at)wildbrain(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject: Re: indexing of hierarchical data
Date: 2001-07-06 23:59:42
Message-ID: web-82025@davinci.ethosmedia.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Dado,

Yeah, me again. What can I say? I'm procrastinating, and list stuff
is a great way to do it. And I prefer design theory issues.

project    sequence         shot

CopFilm1   alley shooting   death of the bad guy
CopFilm2   car chase        death of the bad guy
CopFilm3   car chase        death of the bad guy

At first I was indexing the shots just buy shot_ID (serial), and storing
the sequence_ID it belongs to.
On the sequence record, I was storing the project_ID it belongs to.

This sounds fine so far.

So if I wanted to select CopFilm3, car chase, death of the bad guy
I had to find the ID of the project CopFilm3, the ID of the sequence car
chase belonging to that project and then shot death of the bad guy
belonging to that sequence.
As most of the operations happen at the shot level, for performance
reasons I think it might be better to store the project and sequence
with the shot, so I don't have to perform any joins.

What's wrong with joins?

Also, projects and sequences have alphabetical codes assigned to them,
which is usually the prefered way of accessing the data.
So, it is my impression that I should store those codes in the shot as
foreign keys with on update cascade
should someone decide to rename projects and sequences, and their codes,
which happens.
Is this approach ok or should I stick to serial ID's and make the
lookups?
Any comment on problems like this?

Yes. You want to have as primary and foriegn keys values that do not
change over the lifetime of the record. Any time that you choose
instead user-modifiable records you are introducing a world of headaches
and trigger maintainence.

The solution to the above design problem is simple:

table projects
project_ID SERIAL PRIMARY KEY
project_code VARCHAR(5) NOT NULL
project_name VARCHAR(100) NOT NULL

table sequences
sequence_ID SERIAL PRIMARY KEY
project_ID INT NOT NULL REFERENCES projects(project_ID)
sequence_code VARCHAR(5) NOT NULL
sequence_name VARCHAR(100) NOT NULL

table shots
shot_ID SERIAL PRIMARY KEY
sequence_ID INT NOT NULL REFERENCES sequences(sequence_ID)
shot_code VARCHAR(5) NOT NULL
shot_name VARCHAR(100) NOT NULL

CREATE VIEW vw_shots AS
SELECT project_code, project_name, sequence_code, sequence_name,
shot_code, shot_name
FROM projects JOIN sequences USING (project_ID)
JOIN shots USING (sequence_ID);

Then you users can access this view, and search by codes without being
aware that the numerical ID's even exist. For that matter, you can
impose UNIQUE constraints on codes within their context without tying up
those codes for all time or preventing your users from changing the
codes.

-Josh

______AGLIO DATABASE SOLUTIONS___________________________
Josh Berkus
Complete information technology josh(at)agliodbs(dot)com
and data management solutions (415) 565-7293
for law firms, small businesses fax 621-2533
and non-profit organizations. San Francisco

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Dado Feigenblatt 2001-07-07 00:20:36 Re: indexing of hierarchical data
Previous Message Dado Feigenblatt 2001-07-06 23:15:46 indexing of hierarchical data