Re: Millions of tables

From: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
To: Craig James <cjames(at)emolecules(dot)com>
Cc: "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Millions of tables
Date: 2016-09-27 15:46:05
Message-ID: CAEtnbpV49mB19Gq3YG6UnFE9wSBA8bRQBaOjui5nHP2eMpCj4A@mail.gmail.com
Lists: pgsql-performance

On Tue, Sep 27, 2016 at 8:30 AM, Craig James <cjames(at)emolecules(dot)com> wrote:

> On Sun, Sep 25, 2016 at 7:50 PM, Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
> wrote:
>
>> Hey all,
>>
>> Obviously everyone who's been in PostgreSQL or almost any RDBMS for a
>> time has said not to have millions of tables. I too have long believed it
>> until recently.
>>
>> AWS d2.8xlarge instance with 9.5 is my test rig using XFS on EBS (io1)
>> for PGDATA. Over the weekend, I created 8M tables with 16M indexes on
>> those tables. Table creation initially took 0.018031 secs, averaged
>> 0.027467 secs, and after tossing out outliers (qty 5) the maximum creation
>> time found was 0.66139 seconds. Total time: 30 hours, 31 minutes and
>> 8.435049 seconds. Tables were created by a single process. Do note that
>> table creation is done via a plpgsql function, as there are other
>> housekeeping tasks necessary, though minimal.
>>
>> No system tuning but here is a list of PostgreSQL knobs and switches:
>> shared_buffers = 2GB
>> work_mem = 48 MB
>> max_stack_depth = 4 MB
>> synchronous_commit = off
>> effective_cache_size = 200 GB
>> pg_xlog is on its own file system
>>
>> There are still some obvious problems. General DBA functions such as
>> VACUUM and ANALYZE should not be done. Each will run forever and cause
>> much grief. Backups are problematic in the traditional pg_dump and PITR
>> space. Large JOIN's by VIEW, SELECT or via table inheritance (I am abusing
>> it in my test case) are no-no's. A system or database crash could take
>> potentially hours to days to recover. There are likely other issues ahead.
>>
>> You may wonder, "why is Greg attempting such a thing?" I looked at
>> DynamoDB, BigTable, and Cassandra. I like Greenplum but, let's face it,
>> it's antiquated and don't get me started on "Hadoop". I looked at many
>> others, and ultimately each vendor's recommendation was to have one table
>> for all data. That overcomes the millions of tables problem, right?
>>
>> Problem with the "one big table" solution is I anticipate 1,200 trillion
>> records. Random access is expected and the customer expects <30ms reads
>> for a single record fetch.
>>
>
> You don't give enough details to fully explain the problem you're trying
> to solve.
>
> - Will records ever be updated or deleted? If so, what percentage and
> at what frequency?
> - What specifically are you storing (e.g. list of integers, strings,
> people's sex habits, ...)? Or more importantly, are these fixed- or
> variable-sized records?
> - Once the 1,200 trillion records are loaded, is that it? Or do more
> data arrive, and if so, at what rate?
> - Do your queries change, or is there a fixed set of queries?
> - How complex are the joins?
>
Excellent questions.

1a. Half of the 4M tables will contain ~140k records and UPDATE's will
occur on roughly 100 records/day/table. No DELETE's on this first half.
1b. The second half of the 4M tables will contain ~200k records. Zero
UPDATE's; however, DELETE's will occur on ~100 records/day/table.

2. All 4M tables contain 7 columns: (4) bigints, (2) timestamptz and (1)
boolean. 2M of the tables will have a PKEY on a single bigint column only.
The second 2M tables have a PKEY on (bigint, timestamptz) and two additional
indexes, each on a different (bigint, timestamptz) pair of columns.
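
For concreteness, the two shapes look roughly like the following; table and
column names here are placeholders for illustration, not the actual schema:

-- Shape A (first 2M tables): PKEY on a single bigint
CREATE TABLE rel_000001_a (
    id1        bigint PRIMARY KEY,
    id2        bigint,
    ref_a      bigint,
    ref_b      bigint,
    created_at timestamptz,
    updated_at timestamptz,
    active     boolean
);

-- Shape B (second 2M tables): composite PKEY plus two more indexes,
-- each on a different (bigint, timestamptz) pair
CREATE TABLE rel_000001_b (
    id1        bigint,
    id2        bigint,
    ref_a      bigint,
    ref_b      bigint,
    created_at timestamptz,
    updated_at timestamptz,
    active     boolean,
    PRIMARY KEY (id1, created_at)
);
CREATE INDEX ON rel_000001_b (id2, created_at);
CREATE INDEX ON rel_000001_b (ref_a, updated_at);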

3. The trillions-of-records load is just to push the system to find the
maximum record-load capability. In reality, 200M records/day, or
~2,300/second on average, is the expectation once in production.

4. Queries are fixed and match the indexes laid down on the tables. The goal
is <30ms/query. I have attempted queries with and without indexes.
Without indexes the average query response varied between 20ms and 40ms,
whereas with indexes responses fall within a much tighter range of 5ms to
9ms. Both query performance tests were done during data ingest.

5. Zero JOINs, and I won't ever let them happen. However, the 4M tables
INHERIT from a data-grouping parent table. The test rig limits child tables
to 1,000/parent. This was done to explore some other possible access
patterns, but they are secondary, and if it doesn't work then either a) the
requirement will be dropped or b) I may look at storing the data from the
1,000 child tables directly in the parent table, in which case I'll need to
re-run the load & read tests.
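
To illustrate the inheritance arrangement (again with made-up names): each
data-grouping parent has at most 1,000 children, and the children add no
columns of their own:

-- Parent "data grouping" table; children are relationship tables like the
-- ones sketched above.
CREATE TABLE rel_group_0001 (
    id1        bigint,
    id2        bigint,
    ref_a      bigint,
    ref_b      bigint,
    created_at timestamptz,
    updated_at timestamptz,
    active     boolean
);

-- A child inherits all columns; its PKEY/indexes are added per child.
CREATE TABLE rel_000002_a () INHERITS (rel_group_0001);
ALTER TABLE rel_000002_a ADD PRIMARY KEY (id1);

-- Querying the parent scans all (up to 1,000) children, which is the sort
-- of secondary access pattern mentioned above:
SELECT count(*) FROM rel_group_0001 WHERE id1 = 42;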

> The reason I ask these specific questions is because, as others have
> pointed out, this might be a perfect case for a custom (non-relational)
> database. Relational databases are general-purpose tools, sort of like a
> Swiss-Army knife. A Swiss-Army knife does most things passably, but if you
> want to carve wood, or butcher meat, or slice vegetables, you get a knife
> meant for that specific task.
>
>
I'm at a sizing phase. If 4M tables work I'll attempt 16M tables. If it
points to only 2M or 1M then that's fine. The 4M-table database is only a
single cog in the storage service design. I anticipate ~40 of these
databases, but it is dependent upon how many tables work in a single
instance.

The 4M tables are strict relationship tables referencing two other tables,
each containing a JSONB column. The PostgreSQL JSONB function and operator
set facilitates current needs beautifully and leaves much room for future
use cases. Using other technologies really limits the query/search and
storage capabilities.
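
A rough sketch of the JSONB side, with invented names; the GIN index is only
illustrative of how the literal/pattern searches could be served:

-- JSONB table #1 (~140k records)
CREATE TABLE jsonb_tbl_1 (
    id1  bigint PRIMARY KEY,
    doc  jsonb NOT NULL
);
CREATE INDEX ON jsonb_tbl_1 USING gin (doc);  -- supports doc @> '{...}'

-- JSONB table #2 (~500k to 90M records)
CREATE TABLE jsonb_tbl_2 (
    id2  bigint PRIMARY KEY,
    doc  jsonb NOT NULL
);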

I've explored many other technologies, including the possibility of using
PostgreSQL for the 2 tables with JSONB and keeping the relationships
elsewhere; however, I foresee too many complexities and possible problems.
I am confident in PostgreSQL and the implementation but, as I said, I need
to understand the size limits.

I mentioned the 2 tables with JSONB, so I'll elaborate a little more on the
query pattern. Every query performs 3 SELECT's:
1. SELECT on JSONB table #1 (~140k records total) searching for records
matching a JSONB literal (most common use) or pattern. Returns id1.
2. SELECT on the known table (one of the 4M) using id1 from step 1. Returns
id2.
3. SELECT on JSONB table #2 (~500k to 90M records) searching for the record
matching id2 returned in step 2.
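
In SQL terms, using the placeholder names from the sketches above, each
request does roughly:

-- Step 1: JSONB literal (or pattern) search returns id1
SELECT id1 FROM jsonb_tbl_1 WHERE doc @> '{"name": "widget"}';

-- Step 2: the known relationship table maps id1 to id2
SELECT id2 FROM rel_000001_a WHERE id1 = 42;   -- id1 returned by step 1

-- Step 3: fetch the matching record from JSONB table #2
SELECT doc FROM jsonb_tbl_2 WHERE id2 = 4242;  -- id2 returned by step 2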

> I've written several custom database-storage systems for very specific
> high-performance systems. It's generally a couple weeks of work, and you
> have a tailored performance and storage that's hard for a general-purpose
> relational system to match.
>
>
Yeah, we're kinda beyond the write-it-yourself because of the need to
maintain-it-yourself. :)

Hope some of these answers helped.

-Greg

> The difficulty of building such a system depends a lot on the answers to
> the questions above.
>
> Craig
>
>
>> No data is loaded... yet. Table and index creation only. I am interested
>> in the opinions of all including tests I may perform. If you had this
>> setup, what would you capture / analyze? I have a job running preparing
>> data. I did this on a much smaller scale (50k tables) and data load via
>> function allowed close to 6,000 records/second. The schema has been
>> simplified since and the last test reached just over 20,000 records/second with
>> 300k tables.
>>
>> I'm not looking for alternatives yet but input to my test. Takers?
>>
>> I can't promise immediate feedback but will do my best to respond with
>> results.
>>
>> TIA,
>> -Greg
>>
>
>
>
> --
> ---------------------------------
> Craig A. James
> Chief Technology Officer
> eMolecules, Inc.
> ---------------------------------
>
