From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Dev Nop <devnop0(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Storing large documents - one table or partition by doc?
Date: 2016-09-26 00:20:14
Message-ID: CAMkU=1ze4PZrgv6YQWzMMZpd7mi12K_bqNWVmkvZDYin5MOSWw@mail.gmail.com
Lists: pgsql-performance
On Fri, Sep 23, 2016 at 3:12 AM, Dev Nop <devnop0(at)gmail(dot)com> wrote:
> I’m storing thousands of independent documents each containing around 20k
> rows. The larger the document, the more likely it is to be active with
> inserts and updates (1000s/day). The most common read query is to get all
> the rows for a single document (100s/day).
>
How can the reads be an order of magnitude less frequent than the writes?
Wouldn't anyone doing an insert or update want to see the results of other
people's inserts/updates about as often as they happen?
> It will be supporting real-time collaboration but with strong-consistency
> for a simple schema so not well-suited to dedicated "document databases"
> that assume schema-less & eventual consistency. I won’t have great
> hardware/budget so need to squeeze the most out of the least.
>
> My question is whether to put all documents into a single huge table or
> partition by document?
>
> The documents are independent, so it's purely a performance question. It's
> too many tables for PostgreSQL's partitioning support, but I don’t get any
> benefit from a master table and constraints. Handling partitioning in
> application logic is effectively zero cost.
>
> I know that 1000s of tables is regarded as an anti-pattern but I can only
> see the performance and maintenance benefits of one table per independent
> document e.g. fast per-table vacuum, incremental schema updates, easy
> future sharding. A monster table will require additional key columns and
> indexes that don’t have any value beyond allowing the documents to sit in
> the same table.
>
If you go the partitioned route, I would add the extra column anyway (but
not an index on it), so that it is there if/when you need it.
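To make that concrete, here is a minimal sketch of the application-level routing with the extra column kept in each per-document table. It uses sqlite3 as a stand-in for PostgreSQL, and the table-naming scheme and column names are hypothetical, not anything from your schema:

```python
import sqlite3

def doc_table(doc_id: int) -> str:
    # Hypothetical naming scheme: one table per document, named by its id.
    return f"doc_{doc_id}"

def ensure_table(conn, doc_id: int) -> str:
    table = doc_table(doc_id)
    # Keep the doc_id column even though it is constant within each table,
    # so that merging everything into one table (or sharding) later needs
    # no data rewrite -- just no index on it for now.
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "  doc_id INTEGER NOT NULL,"
        "  row_no INTEGER NOT NULL,"
        "  body   TEXT,"
        "  PRIMARY KEY (doc_id, row_no))"
    )
    return table

def insert_row(conn, doc_id: int, row_no: int, body: str) -> None:
    table = ensure_table(conn, doc_id)
    conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)",
                 (doc_id, row_no, body))

def fetch_doc(conn, doc_id: int):
    # The common read path: all rows for one document.
    table = ensure_table(conn, doc_id)
    return conn.execute(
        f"SELECT row_no, body FROM {table} ORDER BY row_no").fetchall()

conn = sqlite3.connect(":memory:")
insert_row(conn, 7, 1, "first line")
insert_row(conn, 7, 2, "second line")
print(fetch_doc(conn, 7))  # [(1, 'first line'), (2, 'second line')]
```

Since every query already knows its document id, the routing function is the only place the partitioning decision lives, which is what makes it cheap to change later.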
>
> The only downsides seem to be the system level per-table overhead but I
> only see that as a problem if I have a very long tail of tiny documents.
> I'd rather solve that problem if it occurs than manage an
> all-eggs-in-one-basket monster table.
>
> Is there anything significant I am missing in my reasoning?
>
If you use a reasonably modern version of PostgreSQL (say, >= 9.4), the
overhead of having 1000s of tables should not be too large a problem. When
you get into the 100,000 range, it is likely to start being a problem. If
you get to 1,000,000, you almost certainly have a problem.
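A rough back-of-envelope shows why the catalogs are where this bites. Assuming roughly one pg_class row per table and one pg_attribute row per column (user columns plus a handful of system columns; the exact counts vary by version, so treat this as order-of-magnitude only):

```python
def catalog_rows(n_tables: int, user_cols: int = 4,
                 system_cols: int = 6) -> int:
    # Rough estimate: each table contributes one pg_class row and
    # (user columns + system columns) pg_attribute rows. Indexes,
    # constraints, and toast tables would add more on top of this.
    pg_class = n_tables
    pg_attribute = n_tables * (user_cols + system_cols)
    return pg_class + pg_attribute

for n in (1_000, 100_000, 1_000_000):
    print(f"{n:>9} tables -> ~{catalog_rows(n):,} catalog rows")
```

At a million tables you are carrying on the order of ten million catalog rows that every backend's caches, autovacuum, and pg_dump have to cope with, which is why the pain grows faster than the table count alone suggests.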
Cheers,
Jeff