From: | Joel Burton <jburton(at)scw(dot)org> |
---|---|
To: | Jimmie Fulton <JFulton(at)ehso(dot)emory(dot)edu> |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | RE: Database Design Question |
Date: | 2001-07-28 03:22:50 |
Message-ID: | Pine.LNX.4.21.0107272316430.18148-100000@olympus.scw.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
On Fri, 27 Jul 2001, Jimmie Fulton wrote:
> I'm by no means a crack database designer, but I do have my ideas on this
> subject. I prefer that every table has a unique integer (auto-incrementing)
> primary key. Why? Consistency. With the alternative, some tables may not
> have a clear-cut candidate for a unique id. In your example, you had
> "customer". How many "John Smith"s would it take before we decide that is
> not a good identifier. On the other hand, some tables would have perfectly
> logical identifiers. Part numbers, SSNs.... So, you would need to create
> some tables with integer primary keys, and others would have some other
> natural identifier. That to me is inconsistent. Every table should be,
> IMHO, predictable in it's definition of a primary key. I don't even have to
> guess what the names of my primary keys are either because the are all named
> <tablename>_ID. Always. I've only come up with these thoughts on my own,
> and have not extensively tried the other way, so I'd be interested in
> hearing other's ideas for the other side.
For large scale databases, there are theories aplenty about proper naming,
etc., and these should be be investigated. For small/simple databases,
this might be overkill in complexity and learning curve.
I teach a series of classes on small-scale database design for nonprofit
organizations, and recommend some simple rules:
* for tables where there is no candidate key (ie a Person table where
there is no SSN, etc.), use the table name + "id", and use a SERIAL-type.
* for tables where there is a candidate key, and that candidate key meets
all the usual criteria (always present, always unique, changes very
rarely), use the table name + "code", and use the appropriate type
(text/int/whatever), UNLESS
* there exists a very common name for this piece of info. Rather than
calling a SSN a "personcode" (in the above example), call it the SSN
* always put the primary key first in the table
Why not always use a SERIAL integer? It's a pain when a perfectly good
value exists. For example, in a table keeping track of US states, their
capitals, and governors, does it really make sense to code Maryland as
"45", when a memorable, unique, unlikely-to-change code already exists
("md")? Using a random number when a real-world code could do only forces
your user to do that lookup themselves.
[apologies to the international readers: Maryland is a state in the USA,
and "MD" is the postal code abbreviation for it]
I think that you could make some basic rules that would give you a system
that is intuitive and easy-to-rememeber, _without_ sacrificing clarity for
rigidity.
hth,
--
Joel Burton <jburton(at)scw(dot)org>
Director of Information Systems, Support Center of Washington
From | Date | Subject | |
---|---|---|---|
Next Message | Stephan Szabo | 2001-07-28 03:48:44 | Re: Why does this plpgslq always return 1? |
Previous Message | John Oakes | 2001-07-28 03:00:21 | Why does this plpgslq always return 1? |