From: | "Josh Berkus" <josh(at)agliodbs(dot)com> |
---|---|
To: | vernonw(at)gatewaytech(dot)com, josh(at)agliodbs(dot)com |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: query speed joining tables |
Date: | 2003-01-15 17:50:22 |
Message-ID: | web-2314937@davinci.ethosmedia.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
Vernon,
> In regarding of recomposing multivalued field as a separated table,
I
> have observed some advantages and
> disadvantages of the approach. Good on search as you have pointed out
> and bad on updating data, two operations
> needed: deletion and insertion. A query may need to join a lot of
> table together. In Christ's personal application, for
> example, there are many mulitvalued fields such as relationship
> status other then ethnicity. There will be some very long
> and complex queries.
Hey, it's your database. In my 8-year experience as a professional
DBA, few considerations ever outweigh normalization in a relational
database. You are merely trading the immediate inconvenience of having
to construct complex queries and data-saving functions for the
eventual
huge inconvenience (or possibly disaster) of having your data
corrupted
or at least having to modify it by hand, row-by-row.
(Pardon me if I'm a little strident, but I've spend a good portion of
my career cleaning up other's, and sometimes my own, database design
mistakes and I had to see a disaster-in-the-making repeated)
To put it another way: Your current strategy is saving a penny now in
order to pay a dollar tommorrow.
For example, you currently store multiple ethnicities in a free-form
text field. What happens when:
1) Your organization decides they need to split "Asian" into "Chinese"
and "Other Asian"?
2) Someone types "aisan" by mistake?
3) You stop tracking another ethnicity, and want to purge it from the
database?
4) Your administrator decides that Ethnicity needs to be ordered as
"primary ethnicity" and "other ethnicities"?
5) You need to do complex queries like (Asian and/or Caucasian but not
Hispanic or African)? Your current strategy would require 4 seperate
functional indexes to support that query, or do a table scan with 4
row-by-row fuzzy text matches ... slow and memory-intensive either
way.
As I said, it's your database, and if it's a low-budget project
destined to be thrown away in 3 months, then go for it. If, however,
you expect this database to be around for a while, you owe it to
yourself and your co-workers to design it right.
If you want an education on database normalization, pick up Fabian
Pascal's "Practical Issues in Database Design".
-Josh Berkus
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2003-01-15 18:15:55 | Re: index on to_char(created, 'YYYY') doesn't work |
Previous Message | Andreas Joseph Krogh | 2003-01-15 17:38:00 | Re: index on to_char(created, 'YYYY') doesn't work |