Re: query speed joining tables

From: "Josh Berkus" <josh(at)agliodbs(dot)com>
To: vernonw(at)gatewaytech(dot)com, josh(at)agliodbs(dot)com
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: query speed joining tables
Date: 2003-01-15 17:50:22
Message-ID: web-2314937@davinci.ethosmedia.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Vernon,
 
> In regarding of recomposing multivalued field as a separated table,
I
> have observed some advantages and
> disadvantages of the approach. Good on search as you have pointed out
> and bad on updating data, two operations
> needed: deletion and insertion. A query may need to join a lot of
> table together. In Christ's personal application, for
> example, there are many mulitvalued fields such as relationship
> status other then ethnicity. There will be some very long
> and complex queries.

Hey, it's your database. In my 8-year experience as a professional
DBA, few considerations ever outweigh normalization in a relational
database. You are merely trading the immediate inconvenience of having
to construct complex queries and data-saving functions for the
eventual
huge inconvenience (or possibly disaster) of having your data
corrupted
or at least having to modify it by hand, row-by-row.

(Pardon me if I'm a little strident, but I've spend a good portion of
my career cleaning up other's, and sometimes my own, database design
mistakes and I had to see a disaster-in-the-making repeated)

To put it another way: Your current strategy is saving a penny now in
order to pay a dollar tommorrow.

For example, you currently store multiple ethnicities in a free-form
text field. What happens when:
1) Your organization decides they need to split "Asian" into "Chinese"
and "Other Asian"?
2) Someone types "aisan" by mistake?
3) You stop tracking another ethnicity, and want to purge it from the
database?
4) Your administrator decides that Ethnicity needs to be ordered as
"primary ethnicity" and "other ethnicities"?
5) You need to do complex queries like (Asian and/or Caucasian but not
Hispanic or African)? Your current strategy would require 4 seperate
functional indexes to support that query, or do a table scan with 4
row-by-row fuzzy text matches ... slow and memory-intensive either
way.

As I said, it's your database, and if it's a low-budget project
destined to be thrown away in 3 months, then go for it. If, however,
you expect this database to be around for a while, you owe it to
yourself and your co-workers to design it right.

If you want an education on database normalization, pick up Fabian
Pascal's "Practical Issues in Database Design".

-Josh Berkus

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Josh Berkus 2003-01-15 18:15:55 Re: index on to_char(created, 'YYYY') doesn't work
Previous Message Andreas Joseph Krogh 2003-01-15 17:38:00 Re: index on to_char(created, 'YYYY') doesn't work