Re: Table partitioning for cloud service?

From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Israel Brewster <israel(at)brewstersoft(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, PostgreSQL Mailing Lists <pgsql-general(at)postgresql(dot)org>
Subject: Re: Table partitioning for cloud service?
Date: 2020-05-21 16:13:34
Message-ID: CAFNqd5XP0rZQoV4s-shdgunjhUur1pxc2HwJFM=4USAguMNpQw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 21 May 2020 at 11:53, Israel Brewster <israel(at)brewstersoft(dot)com>
wrote:

> On May 21, 2020, at 7:36 AM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
> wrote:
>
> On 5/21/20 8:29 AM, Israel Brewster wrote:
>
> I’m working on my first cloud service, which will be backed by a
> postgresql database. Currently I only have a single customer, but of course
> I want to design with the possibility of multiple customers in mind. In
> that vein, I’m wondering what is “typical” in terms of designing the DB
> structure to make sure that one customer doesn’t “accidentally" get data
> for another customer? At the moment I am leaning towards giving each
> customer their own set of tables, with a unique prefix for each. This would
> provide a “hard” separation for the data,
>
> ^^^^^^^^^^^^ I think that is called a schema:)
>
>
> Ok. That’s probably an option. Although it looks from a cursory perusal
> that for that to work, I would have to have separate DB users with
> different permissions. Which would be fine, except that I don’t have
> permissions to create users.
>
> Or set up a separate database for each in the cluster.
>
>
> Same as above - no permissions.
>
> At the moment, I am running on Heroku, which gives me a postgresql
> database, but not full control over the cluster. I may need to move to
> something more robust, if having completely separate databases is the best
> option. I was hoping to avoid SysAdmin stuff as much as possible, and focus
> on the software side, but obviously some sys admin is required.
>

There's a whole lot of "that depends" to this.

If there is not much data shared across customers, then it's a pretty good
answer to create a database for each one. This is especially good if they
are only occasionally connected.

If there is a LOT of shared data, then "life gets more complicated."

It's a decently well documented problem out there; I just searched for
"multitenant database design" which showed up a number of decent
(not-Postgres-specific) answers

https://www.google.com/search?client=firefox-b-d&q=multitenant+database+design

Some common patterns include:
- A database per tenant (on Postgres, that means that PGDATABASE and/or
connection URIs change for each tenant)
- A schema per tenant (on Postgres, that means each time a tenant is added,
you need "CREATE NAMESPACE" to establish the tenancy and "CREATE TABLE" for
each table in that tenancy, and connections use "set
search_path=tenantname;" to select data from the right tenant)
- Table-based tenancy (e.g. - each table has a "tenant_id" and queries need
to specify the tenant)

An interesting answer, if there needs to be shared data, is for the shared
data to go in its own database, and use a Foreign Data Wrapper to have each
tenants' database access it <
https://www.postgresql.org/docs/12/postgres-fdw.html>

There are lots of tradeoffs involved in each case; each of the above
patterns has merits and demerits particularly as the number of tenants
scales, as well as when you discover there are both tiny and large tenants
with differing requirements.

You need to look at it from various perspectives:
- How do application schema changes get handled as the application evolves?
- What are the security concerns about data sharing across tenants?
- What issues come up when managing storage across tenants? (Some
approaches are easier to cope with than others)

If you don't have a fair bit of technical expertise locally, then
sophisticated choices will cause you problems that you won't be able to
solve.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Israel Brewster 2020-05-21 16:17:20 Re: Table partitioning for cloud service?
Previous Message Adrian Klaver 2020-05-21 15:57:50 Re: Table partitioning for cloud service?