From: | Benjamin Smith <lists(at)benjamindsmith(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Partitioning Vs. Split Databases - performance? |
Date: | 2006-12-21 22:07:13 |
Message-ID: | 200612211407.14017.lists@benjamindsmith.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Thursday 21 December 2006 11:47, Ron Johnson wrote:
> This gives you linear growth potential, since if your current box
> gets over-utilized, buy a 2nd box and move some of the databases to it.
So far, I'm inclined to go this way, due to the option for linear scaling.
> >> 2) Copy out the data specific to a customer and load into separate tables
> >> (with slightly different names, EG table "dates" becomes "cust1_dates")
> >> and
> >> use data partitioning to help with performance as needed.
>
> Definitely *not* scalable. And *very* messy. Yech.
Scales better than present, methinks, but still not the best idea. I'd have to
revisit all my queries to make sure that they use the correct tablename.
> > 3) Put each customer in their own schema/namespace which resides within
> > its own table space.
> >
> > Then you can move customers wherever you need in terms of IO.
How is that functionally different than using a separate database? What's the
advantage here? I don't *need* to restrict myself to one database, and doing
this does require that I revisit 100% of the SQL queries to make sure that
I'm referencing the right schema.
This solution seems to have the same problems as using dynamic tablenames.
> Splitting like mentioned in these three tactics means that you've
> now got 2x as many tables. Add more customers and you've got that
> many more tables. Perfect candidate for "schema drift".
>
> If each table has cust_id in it, then you could:
>
> 4) retain 1 database and partition each table on cust_id.
>
> http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html
Intriguing idea, and one that I might do in the future. However, I have 170
normalized tables defined at present. The task of auditing each of these
tables for the constraint ranges sounds somewhat nightmarish. Not all the
tables have the cust_id record defined - some are implicit.
> >> Given the same physical hardware, which one is likely to perform better?
Does
> >> it make any difference? Does using separate databases use more RAM than a
> >> single database with a bunch of different tables?
>
> Config files are global, so I doubt it.
>
> >> Company is growing rapidly, so growth room is important...
>
> Then go for Option 1.
My conclusion, too. Another poster mentioned schema drift, and that's a real
concern, but we're already updating the schema through a script which could
be extended to update all databases, not just one, so I feel this problem
would be minor to non-existent.
Thanks!
-Ben
--
"The best way to predict the future is to invent it."
- XEROX PARC slogan, circa 1978
From | Date | Subject | |
---|---|---|---|
Next Message | Joshua D. Drake | 2006-12-21 22:41:35 | Re: Partitioning Vs. Split Databases - performance? |
Previous Message | Walter Vaughan | 2006-12-21 21:58:58 | Re: Website Hosting Service and Security |