From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | Noah Silverman <noah(at)allresearch(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: One large v. many small |
Date: | 2003-01-30 17:56:56 |
Message-ID: | 200301300956.56041.josh@agliodbs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-performance |
Noah,
> As we continue our evaluation of Postgres, another interesting topic
> has come up that I want to run by the group.
>
> In our current model, we have about 3,000 small tables that we use
> track data for our clients. Each table is an identical structure, and
> holds the data for one client.
I'd list what's wrong with this structure, but frankly it would take me long
enough that I'd need a consulting fee. Suffice it to say that the above is
a very, very bad (or at least antiquated) design idea and you need to
transition out of it as soon as possible.
> Another idea that we are considering is one big table instead of 3,000
> smaller ones. We could simply add a numeric field to indicate which
> client a particular record was for.
Yes. Absolutely. Although I'd suggest an Integer field.
> Each table has between 500 and 50,000 records, so the big table could
> have up to 10 million rows if we combined everything.
Sure.
> A query on our current system is (for client #4)
>
> Select (*) from client_4 where foo=2;
>
> A query from the new, proposed system would be
>
> Select (*) from big_results where client=4 and foo=2.
>
> The big questions is, WHICH WILL BE FASTER with Postgres. Is there any
> performance improvement or cost to switching to this new structure.
Oh, no question query 1 will be faster ... FOR THAT QUERY. You are asking the
wrong question.
However, explain to me how, under the current system, you can find the client
who ordered $3000 worth of widgets on January 12th if you don't already know
who it is? I'm not sure a 3000-table UNION query is even *possible*.
Or how about giving me the average number of customer transactions in a month,
across all clients?
<rant>
You've enslaved your application design to performance considerations ... an
approach which was valid in 1990, because processing power was so limited
then. But now that dual-processor servers with RAID can be had for less than
$3000, there's simply no excuse for violating the principles of good
relational database design just to speed up a query. Buying more RAM is
much cheaper than having an engineer spend 3 weeks fixing data integrity
problems.
The proper way to go about application design is to build your application on
paper or in a modelling program according to the best principles of software
design available, and *then* to discuss performance issues -- addressing them
*first* by buying hardware, and only compromising your applcation design when
no other alternative is available.
</rant>
I strongly suggest that you purchase Pascal's "Practical Issues in Database
Design" and give it a read.
--
Josh Berkus
Aglio Database Solutions
San Francisco
From | Date | Subject | |
---|---|---|---|
Next Message | Ron St.Pierre | 2003-01-30 17:59:17 | Re: Documentation needs significant improvement |
Previous Message | will trillich | 2003-01-30 17:36:40 | Re: Documentation needs significant improvement |
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Sullivan | 2003-01-30 18:02:40 | Re: One large v. many small |
Previous Message | Noah Silverman | 2003-01-30 17:34:36 | One large v. many small |