Re: Millions of tables

From: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
To: Mike Sofen <msofen(at)runbox(dot)com>
Cc: "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Millions of tables
Date: 2016-09-27 15:49:49
Message-ID: CAEtnbpU23--36ea-HoECUBysvtYOMCobd9wEY6b2bL88ugOTXg@mail.gmail.com
Lists: pgsql-performance

On Tue, Sep 27, 2016 at 9:42 AM, Mike Sofen <msofen(at)runbox(dot)com> wrote:

> *From:* Mike Sofen *Sent:* Tuesday, September 27, 2016 8:10 AM
>
> *From:* Greg Spiegelberg *Sent:* Monday, September 26, 2016 7:25 AM
> I've gotten more responses than anticipated and have answered some
> questions and gotten some insight, but my challenge remains: what should I
> capture along the way to prove or disprove this storage pattern?
> Alternatives to the storage pattern aside, I need ideas on how to rig the
> test, which metrics to capture, and suggestions for tuning it.
>
>
>
> In the next 24 hours, I will be sending ~1 trillion records to the test
> database. Because of time to set up, I'd rather have things set up
> properly the first go.
>
>
>
> Thanks!
>
> -Greg
>
> ---------------------
>
> Greg, I ran another quick test on a wider table than you’ve described, but
> this time with 80 million rows, with core counts, RAM and SSD storage
> similar to what you’d have on that AWS EC2 instance. This table had 7
> columns (3 integers, 3 text, 1 timestamptz) with an average width of 157
> chars and one btree index on the integer primary-key column. Using EXPLAIN
> ANALYZE, I picked one id value out of the 80m and ran a select * where id =
> x. It did an index scan, with a planning time of 0.077 ms and an execution
> time of 0.254 seconds. I ran the query for a variety of widely spaced
> values (so the data was uncached) and the timing never changed. This
> mirrors my general experience with PG – very fast reads on indexed
> queries.
>
>
>
> Summary: I think your buckets can be WAY bigger than you are envisioning
> for the simple table design you’ve described. I’m betting you can easily
> do 500 million rows per bucket before approaching anything close to the
> 30ms max query time.
>
>
>
> Mike Sofen (Synthetic Genomics)
>
>
>
> Totally typo’d the execution time: it was 0.254 MILLISECONDS, not
> SECONDS. Hence my comment about going up 10x in bucket size rather than
> appearing to be right at the limit. Sorry!
>
>
>
I figured. :)

Haven't ruled it out, but the expectation for this implementation is that it
performs at worst 3X slower than memcache or Redis.

Bigger buckets mean a wider spread of response times. Some buckets may
contain 140k records and others 100X more.
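For reference, Mike's timing test above boils down to a single primary-key
lookup per bucket table. A minimal sketch of a comparable setup (table and
column names here are illustrative assumptions, not the actual schema from
this thread):

```sql
-- Illustrative sketch only: names are assumptions, not the real schema.
-- Matches the shape Mike described: 7 columns (3 integers, 3 text,
-- 1 timestamptz), with the only index being the btree on the integer PK.
CREATE TABLE bucket_00001 (
    id          integer PRIMARY KEY,  -- btree index created implicitly
    a           integer,
    b           integer,
    c           text,
    d           text,
    e           text,
    created_at  timestamptz
);

-- The timed query: a point lookup by PK, expected to use an Index Scan.
EXPLAIN ANALYZE SELECT * FROM bucket_00001 WHERE id = 12345;
```

The EXPLAIN ANALYZE output shows both the planning time and the execution
time separately; the execution time is the figure the 30ms query-time
budget would be measured against.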

-Greg
