Re: Millions of tables

From: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
To: Rick Otten <rottenwindfish(at)gmail(dot)com>
Cc: "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Millions of tables
Date: 2016-09-26 14:09:04
Message-ID: CAEtnbpULv+AHYKT1jakKBhsCvxmwffXVYeg5xnMWS=5j+xnHMA@mail.gmail.com
Lists: pgsql-performance

On Mon, Sep 26, 2016 at 4:23 AM, Rick Otten <rottenwindfish(at)gmail(dot)com>
wrote:

> Are the tables constantly being written to, or is this a mostly read
> scenario? One architecture possibility, if the writes are not so
> frequent, is to create just a handful of very big tables for writing, and
> then make smaller tables as materialized views for reading. The vacuum and
> bloat management could be done back at the big tables. The materialized
> views could be refreshed or replaced during non-peak hours. The
> materialized views could be on a different tablespace than the root
> tables. They could also be structured to reflect real-world query patterns
> which are sometimes different than the raw data storage engineering problem.
>
>
Like any data warehouse, I expect 90%+ of the activity to be writes, but
low-latency reads are an absolute must or the design doesn't get off the
ground.

Materialized views are neat for many cases but not this one. Current
versions of the data must be available the moment after they are written.
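
For reference, the setup Rick describes would look roughly like the sketch
below (table, view and tablespace names are made up); the catch is that a
REFRESH only makes the view as current as the moment it runs:

  -- Hypothetical wide write table and a read-optimized materialized view
  CREATE TABLE sensor_raw (
      sensor_id   bigint      NOT NULL,
      captured_at timestamptz NOT NULL,
      payload     jsonb
  );

  CREATE MATERIALIZED VIEW sensor_latest
      TABLESPACE fast_read_ts          -- separate tablespace for reads
      AS SELECT DISTINCT ON (sensor_id)
                sensor_id, captured_at, payload
           FROM sensor_raw
          ORDER BY sensor_id, captured_at DESC;

  -- Unique index required for REFRESH ... CONCURRENTLY
  CREATE UNIQUE INDEX ON sensor_latest (sensor_id);

  -- Run off-peak; readers see the old contents until it completes
  REFRESH MATERIALIZED VIEW CONCURRENTLY sensor_latest;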

I am considering tablespaces, however I have no way to size them properly.
One group of tables may contain 1M records and another 100M. They could go
on the same file system, but I'd still have the same problems internal to
PostgreSQL; the only thing overcome is millions of files in a single
directory, which is a file system selection problem.
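
The mechanics themselves are simple enough; sizing is the hard part. A rough
sketch, with hypothetical paths and table names:

  -- Hypothetical mount points; per-tablespace sizing is the open question
  CREATE TABLESPACE ts_vol1 LOCATION '/mnt/vol1/pgdata';
  CREATE TABLESPACE ts_vol2 LOCATION '/mnt/vol2/pgdata';

  -- Assign a table group at creation time...
  CREATE TABLE grp_00001 (LIKE grp_template INCLUDING ALL) TABLESPACE ts_vol1;

  -- ...or move it later (rewrites the table under an exclusive lock)
  ALTER TABLE grp_00002 SET TABLESPACE ts_vol2;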

> With some logging you may be able to see that the data is not truly
> randomly accessed, but rather clustered around just some of the millions of
> tables. Then the engineering problem becomes "How do I service 90% of the
> queries on these tables in 30ms ?" Rather than "How do I service 100% of
> the queries 100% of the time in 30ms?" Knowing 90% of the queries hit just
> a few hundred tables, makes the first question easier to answer.
>
> Similarly, if most of the columns are static and only a few columns are
> actually changing, you could consider pulling the static stuff out of the
> same table with the dynamic stuff and then look at joins in your queries.
> The end goal is to be able to get solid indexes and tables that don't
> change a lot so they can be tightly packed and cached. (less bloat, less
> fragmentation, fewer disk accesses).
>
> With regards to consistent query performance, I think you need to get out
> of AWS. That environment is terrible if you are going for consistency
> unless you buy dedicated hardware, and then you are paying so much money it
> is ridiculous.
>
>
True about AWS, and though hardware may eventually be purchased, AWS is the
right place to start: 1) AWS is not corporate IT and won't take months to
approve budget for gear and deployment, and more importantly 2) this is
still the design phase and, if deployed, there is no way to predict true
adoption, meaning it'll start small. AWS is the right place for now.
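
Coming back to Rick's point about access patterns: the per-table counters
PostgreSQL already keeps should show whether activity truly clusters on a
small subset of tables, without extra logging. Something along these lines
(the 100-table cutoff is arbitrary):

  -- Tables ranked by read activity since statistics were last reset
  SELECT relname,
         seq_scan + coalesce(idx_scan, 0)   AS total_scans,
         n_tup_ins + n_tup_upd + n_tup_del  AS total_writes
    FROM pg_stat_user_tables
   ORDER BY total_scans DESC
   LIMIT 100;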
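
And his static-versus-dynamic column split amounts to classic vertical
partitioning, roughly (table and column names are made up):

  -- Static attributes: written once, tightly packed, cache-friendly
  CREATE TABLE item_static (
      item_id    bigint PRIMARY KEY,
      attributes jsonb
  );

  -- Dynamic attributes: small rows that churn and vacuum cheaply
  CREATE TABLE item_state (
      item_id    bigint PRIMARY KEY REFERENCES item_static,
      updated_at timestamptz NOT NULL,
      state      smallint    NOT NULL
  );

  SELECT s.attributes, d.state
    FROM item_static s
    JOIN item_state  d USING (item_id)
   WHERE item_id = 42;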

> Also I think having 10M rows in a table is not a problem for the query
> times you are referring to. So instead of millions of tables, unless I'm
> doing my math wrong, you probably only need thousands of tables.
>
>
>
> On Mon, Sep 26, 2016 at 5:43 AM, Stuart Bishop <stuart(at)stuartbishop(dot)net>
> wrote:
>
>> On 26 September 2016 at 11:19, Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
>> wrote:
>>
>>> I did look at PostgresXL and CitusDB. Both are admirable however
>>> neither could support the need to read a random record consistently under
>>> 30ms. It's a similar problem Cassandra and others have: network latency.
>>> At this scale, to provide the ability to access any given record amongst
>>> trillions it is imperative to know precisely where it is stored (system &
>>> database) and read a relatively small index. I have other requirements
>>> that prohibit use of any technology that is eventually consistent.
>>>
>>> I liken the problem to fishing. To find a particular fish of length,
>>> size, color &c in a data lake you must accept the possibility of scanning
>>> the entire lake. However, if all fish were in barrels where each barrel
>>> had a particular kind of fish of specific length, size, color &c then the
>>> problem is far simpler.
>>>
>>> -Greg
>>>
>>
>> My gut tells me that if you do solve the problem and get PostgreSQL (or
>> anything) reading consistently at under 30ms with that many tables you will
>> have solved one problem by creating another.
>>
>> You discounted Cassandra due to network latency, but are now trying a
>> monolithic PostgreSQL setup. It might be worth trying a single node
>> ScyllaDB or Cassandra deploy (no need for QUORUM or network overhead),
>> perhaps using leveled compaction so all your data gets broken out into
>> 160MB chunks. And certainly wander over to the ScyllaDB mailing list, as
>> they are very focused on performance problems like yours and should offer
>> some insight even if a Cassandra style architecture cannot meet your
>> requirements.
>>
>> An alternative if you exhaust or don't trust other options, use a foreign
>> data wrapper to access your own custom storage. A single table at the PG
>> level, you can shard the data yourself into 8 bazillion separate stores, in
>> whatever structure suits your read and write operations (maybe reusing an
>> embedded db engine, ordered flat file+log+index, whatever).
>>
>> --
>> Stuart Bishop <stuart(at)stuartbishop(dot)net>
>> http://www.stuartbishop.net/
>>
>
>
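
For completeness, Stuart's foreign data wrapper idea would look something
like this from the SQL side; the wrapper itself (the my_shard_fdw below is
made up) would be custom code implementing the FDW API:

  -- Made-up wrapper that routes to self-managed per-key stores
  CREATE EXTENSION my_shard_fdw;

  CREATE SERVER shard_store
      FOREIGN DATA WRAPPER my_shard_fdw
      OPTIONS (root_path '/data/shards');      -- hypothetical option

  -- One logical table at the PG level; the wrapper shards underneath
  CREATE FOREIGN TABLE measurements (
      device_id   bigint,
      captured_at timestamptz,
      value       double precision
  ) SERVER shard_store;

  SELECT * FROM measurements WHERE device_id = 12345;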
