Re: One huge db vs many small dbs

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Max <maxabbr(at)yahoo(dot)com(dot)br>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: One huge db vs many small dbs
Date: 2013-12-06 00:01:33
Message-ID: 52A113DD.1050809@agliodbs.com
Lists: pgsql-performance

On 12/05/2013 02:42 AM, Max wrote:
> Hello,
>
> We are starting a new project to deploy a solution in the cloud, with the possibility of it being used by 2,000+ clients. Each of these clients will use several tables to store their information (our model has about 500+ tables, but fewer than 100 core tables with heavy use). Also, the projected amount of information per client could range from small (a few hundred tuples/MB) to huge (a few million tuples/GB).
>
> One of the many questions we have is about the performance of the db if we work with only one (using a ClientID to separate the clients' info) or with thousands of separate dbs. The management of the dbs is not a huge concern, as we have an automated tool.

In addition to the excellent advice from others, I'll speak from experience:

The best model here, if you can manage it, is to use shared
tables for all customers, but have a way you can "break out" customers
to their own database(s). This allows you to start with a single
database, but to shard out your larger customers as they grow. The
small customers will always stay on the same DB.
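
As a rough illustration of the shared-tables-plus-break-out idea, here
is a minimal sketch. It uses Python's built-in sqlite3 module purely as
a stand-in for PostgreSQL so that it runs anywhere; the "orders" table,
its columns, and the make_db/break_out helpers are all hypothetical
names invented for the example, not anything from the original schema.
A real migration would also need to deal with concurrent writes and
with every other table that carries the client key.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding one shared, client-keyed table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders ("
        " client_id INTEGER NOT NULL,"   # every row is tagged with its client
        " order_id INTEGER NOT NULL,"
        " total REAL NOT NULL,"
        " PRIMARY KEY (client_id, order_id))"
    )
    return db


def break_out(shared, client_id):
    """Copy one client's rows into a new dedicated database, then
    delete them from the shared one. Returns the dedicated connection."""
    dedicated = make_db()
    rows = shared.execute(
        "SELECT client_id, order_id, total FROM orders WHERE client_id = ?",
        (client_id,),
    ).fetchall()
    dedicated.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    dedicated.commit()
    shared.execute("DELETE FROM orders WHERE client_id = ?", (client_id,))
    shared.commit()
    return dedicated


# All clients start out in the shared database.
shared = make_db()
shared.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 10.0), (1, 2, 25.5), (2, 1, 7.0)],
)
shared.commit()

# Client 1 has grown, so it gets its own database; client 2 stays put.
big_client_db = break_out(shared, 1)
```

Because the dedicated database keeps the same schema, including the
client_id column, queries written against the shared tables should work
unchanged after the move.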

That means you'll also treat the different customers as different DB
connections from day 1. That way, when you move the large customers out
to separate servers, you don't have to change the way the app connects
to the database.
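
One way to sketch the "different DB connections from day 1" idea is a
small lookup that the app always goes through before connecting. The
ConnectionRouter class, its method names, and the connection strings
below are all assumptions made up for illustration; the point is only
that the app asks for a connection string per customer, even while
every customer still resolves to the same shared database.

```python
# Hypothetical connection string for the shared database.
SHARED_DSN = "host=db-shared dbname=app"


class ConnectionRouter:
    """Resolve a client ID to the connection string of its database."""

    def __init__(self, shared_dsn):
        self._shared_dsn = shared_dsn
        self._dedicated = {}  # client_id -> DSN of that client's own database

    def break_out(self, client_id, dsn):
        """Record that this client now lives in its own database."""
        self._dedicated[client_id] = dsn

    def dsn_for(self, client_id):
        """Return the dedicated DSN if there is one, else the shared DSN."""
        return self._dedicated.get(client_id, self._shared_dsn)


router = ConnectionRouter(SHARED_DSN)

# On day 1 every client resolves to the shared database.
before = router.dsn_for(42)

# Later, client 42 is moved to its own server; only the routing entry changes.
router.break_out(42, "host=db-client42 dbname=app")
after = router.dsn_for(42)
```

The application code that calls dsn_for() does not change when a
customer is moved; only the routing entry does.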

If you can't implement shared tables, I'm going to say go for separate
databases. This will mean lots of additional storage space -- the
per-DB overhead by itself will add up to around 100GB -- but otherwise you'll be
grappling with the issues involved in having a million tables (2,000+
clients times 500+ tables each), which Joe Conway outlined. But if you
don't have shared tables, your huge schema
is always going to cause you to waste resources on the smaller customers.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
