Re: query speed joining tables

From: Vernon Wu <vernonw(at)gatewaytech(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: query speed joining tables
Date: 2003-01-15 18:33:11
Message-ID: LK4WXSGCFVTCBB9ZUOI71542NHVUJE.3e25a967@kimiko
Lists: pgsql-sql


Hi, Josh,

I appreciate you sharing your experience here. I definitely don't have that many years of DBA experience behind me, and
I am learning to get the DB design right in the first place.

What I stated is my observation on my project, which has over twenty multivalued detail tables. I have a selection query
of about 200 characters, involving 10 tables and using a subquery. The performance is not bad after proper indexing:
less than 3 seconds, according to the planner. I will have longer queries later and hope they won't have any
performance problems.
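
For illustration, a query of roughly the shape described above (a main table joined to several multivalued detail tables, filtered with a subquery) might look like this; all table and column names here are hypothetical, not from the actual project:

```sql
-- Hypothetical sketch: profile table joined to two multivalued detail
-- tables, with a subquery restricting on a third.
SELECT p.profile_id, p.name
FROM profile p
JOIN profile_ethnicity pe ON pe.profile_id = p.profile_id
JOIN profile_language  pl ON pl.profile_id = p.profile_id
WHERE pe.ethnicity_id = 3
  AND pl.language_id  = 7
  AND p.profile_id IN (SELECT profile_id
                       FROM profile_interest
                       WHERE interest_id = 12);
```

With indexes on each detail table's (profile_id, value_id) pair, the planner can typically resolve each join with an index scan rather than a sequential scan.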

Thank you for recommending another DB book to follow "Database Design For Mere Mortals". I will read it.

Vernon

15/01/2003 9:50:22 AM, "Josh Berkus" <josh(at)agliodbs(dot)com> wrote:

> Vernon,
>
> > In regard to recomposing a multivalued field as a separate table, I
> > have observed some advantages and disadvantages of the approach. Good
> > on search, as you have pointed out, and bad on updating data, where
> > two operations are needed: deletion and insertion. A query may need
> > to join a lot of tables together. In Chris's personal application,
> > for example, there are many multivalued fields, such as relationship
> > status, other than ethnicity. There will be some very long and
> > complex queries.
>
> Hey, it's your database. In my 8 years of experience as a professional
> DBA, few considerations ever outweigh normalization in a relational
> database. You are merely trading the immediate inconvenience of having
> to construct complex queries and data-saving functions for the
> eventual huge inconvenience (or possible disaster) of having your data
> corrupted, or at least having to modify it by hand, row by row.
>
> (Pardon me if I'm a little strident, but I've spent a good portion of
> my career cleaning up others', and sometimes my own, database design
> mistakes, and I hate to see a disaster-in-the-making repeated.)
>
> To put it another way: your current strategy is saving a penny now in
> order to pay a dollar tomorrow.
>
> For example, you currently store multiple ethnicities in a free-form
> text field. What happens when:
> 1) Your organization decides they need to split "Asian" into "Chinese"
> and "Other Asian"?
> 2) Someone types "aisan" by mistake?
> 3) You stop tracking another ethnicity and want to purge it from the
> database?
> 4) Your administrator decides that ethnicity needs to be ordered as
> "primary ethnicity" and "other ethnicities"?
> 5) You need to do complex queries like (Asian and/or Caucasian but not
> Hispanic or African)? Your current strategy would require 4 separate
> functional indexes to support that query, or a table scan with 4
> row-by-row fuzzy text matches ... slow and memory-intensive either
> way.
>
> As I said, it's your database, and if it's a low-budget project
> destined to be thrown away in 3 months, then go for it. If, however,
> you expect this database to be around for a while, you owe it to
> yourself and your co-workers to design it right.
>
> If you want an education on database normalization, pick up Fabian
> Pascal's "Practical Issues in Database Management".
>
>-Josh Berkus
>
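
[For readers following the thread: the normalized design Josh is advocating could be sketched roughly as below. The table and column names are illustrative only, and a `person` table with key `person_id` is assumed; neither mail gives the actual schema.]

```sql
-- One row per allowed ethnicity: renaming or splitting a value (points
-- 1 and 3) is a single UPDATE/INSERT/DELETE here, not a row-by-row edit
-- of free-form text.
CREATE TABLE ethnicity (
    ethnicity_id serial PRIMARY KEY,
    name         text NOT NULL UNIQUE
);

-- Junction table: one row per (person, ethnicity) pair. The is_primary
-- flag covers point 4; the foreign keys prevent misspellings (point 2).
CREATE TABLE person_ethnicity (
    person_id    integer NOT NULL REFERENCES person (person_id),
    ethnicity_id integer NOT NULL REFERENCES ethnicity (ethnicity_id),
    is_primary   boolean NOT NULL DEFAULT false,
    PRIMARY KEY (person_id, ethnicity_id)
);

-- Point 5's query ((Asian or Caucasian) but not Hispanic) becomes set
-- operations over indexed integer keys instead of fuzzy text matching:
SELECT pe.person_id
FROM person_ethnicity pe
JOIN ethnicity e ON e.ethnicity_id = pe.ethnicity_id
WHERE e.name IN ('Asian', 'Caucasian')
EXCEPT
SELECT pe.person_id
FROM person_ethnicity pe
JOIN ethnicity e ON e.ethnicity_id = pe.ethnicity_id
WHERE e.name = 'Hispanic';
```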
