Re: schema design

From: "Michael Ansley (UK)" <Michael(dot)Ansley(at)intec-telecom-systems(dot)com>
To: 'Jodi Kanter' <jkanter(at)virginia(dot)edu>, Postgres Admin List <pgsql-admin(at)postgresql(dot)org>
Subject: Re: schema design
Date: 2002-11-06 17:55:47
Message-ID: 7F124BC48D56D411812500D0B747251401DA4514@FILESERVER002
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

According to your schema, different analyses linked to different file types
will not cause redundancies in the list of extensions. However, if you have
multiple file types for an extension, then you will get duplicates. I'm not
sure how likely this is, but it's possible in a generic situation, e.g.:
.doc could mean a Word doc, or a WP doc. In this situation, do you want one
extension record or two. If you only want one, then create a many-to-many
resolution table between file type and extension. If you are happy with
two, then your schema is fine.

Of course, you could always use a field array to hold the extensions, rather
than parsing a string field, but you will still get duplication. And that's
not very portable.

-----Original Message-----
From: Jodi Kanter [mailto:jkanter(at)virginia(dot)edu]
Sent: 06 November 2002 17:04
To: Postgres Admin List
Subject: [ADMIN] schema design

I have an interesting issue that I am working through and was hoping to get
some insight from the group. I will do my best to explain the scenario and
hope I do not confuse anyone.

We are storing data regarding some analyses that are being run here. We also
have a set of tables in which we store information "about" the types of
analyses being run. I am using these tables to do a validity check on what
file types (or input files) are acceptable for a certain type of analyses.

I have the following tables:

An ANALYSIS table which merely holds the name and some specifics about the
type of analysis.
A FILETYPES table which stores information about the types of input and
output files for each analysis. We have chosen to use file extensions as a
way to identify a file type.
Some analyses can have multiple input and/or output files so I have linking
tables between the ANALYSIS and FILETYPES table.

My question is due to a recent development. Apparently when an analysis is
run the individual can specify what type of file the output can be. Since
these tables are holding the data regarding the "form" of the analysis I
must somehow show that each analysis can result in various filetypes. I
thought I had done this and had an "extension" field in the FILETYPES table.
Now I am being told that each of these input or output files can have more
than one file extension.
My solution was to add another table to hold the extensions. I think this is
better than allowing extensions to concatenate within a field in the
filetypes table which was suggested by a programmer here. Her intention was
to parse the field. I prefer a more "appropriate" solution, but I am not
convinced mine is the best solution since there will be some repeating data
in the extensions table. Various analyses can be associated with the same
file types so the extension field in the extensions table will have
repeating values.
I have attached a small schema with these few tables to assist with your
understanding our issue.
I would appreciate suggestions.
Thank you,
Jodi Kanter

_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter(at)virginia(dot)edu <mailto:jkanter(at)virginia(dot)edu>

This e-mail and any attachments are confidential and may also be privileged and/or copyright
material of Intec Telecom Systems PLC (or its affiliated companies). If you are not an
intended or authorised recipient of this e-mail or have received it in error, please delete
it immediately and notify the sender by e-mail. In such a case, reading, reproducing,
printing or further dissemination of this e-mail is strictly prohibited and may be unlawful.
Intec Telecom Systems PLC. does not represent or warrant that an attachment hereto is free
from computer viruses or other defects. The opinions expressed in this e-mail and any
attachments may be those of the author and are not necessarily those of Intec Telecom
Systems PLC.

This footnote also confirms that this email message has been swept by
MIMEsweeper for the presence of computer viruses.

Browse pgsql-admin by date

  From Date Subject
Next Message Robert Treat 2002-11-06 18:42:36 Re: USERS
Previous Message Jodi Kanter 2002-11-06 17:04:07 schema design