From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | Michael Glaesmann <grzm(at)myrealbox(dot)com>, pgsql-novice(at)postgresql(dot)org |
Subject: | Re: Almost relational PostgreSQL (was: one-to-one) |
Date: | 2003-10-21 20:10:20 |
Message-ID: | 200310211310.20299.josh@agliodbs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-novice |
Michael,
> From what I've read of PostgreSQL, it's a bit closer to true relational
> than SQL, or at least you can limit yourself from using some of the
> non-relational bits of SQL, such as always using NOT NULL in column
> definitions, and using DISTINCT to eliminate duplicates in results
> relations. I was wondering if you would share some of techniques you
> would recommend using in PostgreSQL to make a database 'more
> relational', or know of any sources on the web that I might find more
> about this specifically.
Well, if you've already read Date, Darwin & Pascal, I think you've pretty much
covered things. If you don't have it already, Pascal's "Practical Issues in
Database Management" is a good place to start; for one thing, it's the
clearest description of the Normal Forms I've seen.
Unfortunately, a lot of the relational functionality that SQL (and PostgreSQL)
is missing can only be awkwardly implemented using PostgreSQL Triggers, Views
and Rules. More is coming, though; when we have fully-functional
statement-level triggers in 7.5, and when/if Neil finishes implementing an
updatable view default, implementing distributed keys and set-based keys will
become a lot more possible.
As much as it is consistent with Codd, though, I don't recommend using
DISTINCT with everything; it's a performance-killer. Neil and I were just
hashing this out on IRC.
Another tip: beware of over-reliance on surrogate autoincrement keys. They
are convenient and necessary some of the time, but use real keys where you
can. The current industry reliance on "ID" primary keys encourages sloppy
thinking .... and sloppy schema ... by DBAs. I have caught myself building
tables without any real keys this way.
For that matter, the term "Primary Key" is inherently meaningless anyway.
Don't get attached to it.
Maybe more later if I feel like it .... more likely, you've inspired me to a
magazine article.
> Also, I often read about denormalization a database for performance
> reasons. My understanding of this is not that normalization is a
> problem in and of itself, but that the DBMS hasn't been properly
> designed to handle highly normalized databases, which I gather has to
> do with the fact the number of joins tends to increase with
> normalization. However, I haven't seen much about this with respect to
> PostgreSQL, or any particular DBMS, for that matter. Discussion seems
> to always be on the theoretical level, though there should be
> differences in performance as the engines underlying the different
> DBMSs are different. I'd be interested in hearing your views on
> practical normalization of PostgreSQL databases, or alternatively know
> of places where I might find such discussion.
My attitude? Construct your database according to strict normal principles,
and then see if you have a performance problem. IMNSHO, 90% of the people
advocating denormalization haven't really tested; they are either making
assumptions based on outdated knowledge, or using "performance" as an excuse
for bad database design.
My personal limit of denormalization stops at a few NULL columns and using
cache tables to hold copies of views which are too slow.
Want to discuss this further? Take it to the PGSQL-SQL list and/or the
#postgresql channel on irc.freenode.net.
--
-Josh Berkus
Aglio Database Solutions
San Francisco
From | Date | Subject | |
---|---|---|---|
Next Message | Cath Lawrence | 2003-10-22 02:25:21 | Loading speed |
Previous Message | Michael Glaesmann | 2003-10-21 19:16:16 | Almost relational PostgreSQL (was: one-to-one) |