From: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
To: Mike Sofen <msofen(at)runbox(dot)com>
Cc: "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Millions of tables
Date: 2016-09-27 15:49:49
Message-ID: CAEtnbpU23--36ea-HoECUBysvtYOMCobd9wEY6b2bL88ugOTXg@mail.gmail.com
Lists: pgsql-performance
On Tue, Sep 27, 2016 at 9:42 AM, Mike Sofen <msofen(at)runbox(dot)com> wrote:
> *From:* Mike Sofen *Sent:* Tuesday, September 27, 2016 8:10 AM
>
> *From:* Greg Spiegelberg *Sent:* Monday, September 26, 2016 7:25 AM
> I've gotten more responses than anticipated and have answered some
> questions and gained some insight, but my challenge remains: what should I
> capture along the way to prove or disprove this storage pattern?
> Alternatives to the storage pattern aside, I need ideas on how to rig the
> test, which metrics to capture, and suggestions for tuning it.
>
>
>
> In the next 24 hours, I will be sending ~1 trillion records to the test
> database. Because of the setup time involved, I'd rather have things set
> up properly on the first go.
>
>
>
> Thanks!
>
> -Greg
>
> ---------------------
>
> Greg, I ran another quick test on a wider table than you’ve described, but
> this time with 80 million rows, with core counts, RAM, and SSD storage
> similar to what you’d have on that AWS EC2 instance. This table had 7
> columns (3 integers, 3 text, 1 timestamptz) with an average width of 157
> chars, and one btree index on the pk int column. Using explain analyze, I
> picked one id value out of the 80m and ran a select * where id = x. It did
> an index scan, had a planning time of 0.077ms, and an execution time of
> 0.254 seconds. I ran the query for a variety of widely spaced values (so
> the data was uncached) and the timing never changed. This has been
> mirroring my general experience with PG – very fast reads on indexed
> queries.
>
>
>
> Summary: I think your buckets can be WAY bigger than you are envisioning
> for the simple table design you’ve described. I’m betting you can easily
> do 500 million rows per bucket before approaching anything close to the
> 30ms max query time.
>
>
>
> Mike Sofen (Synthetic Genomics)
>
>
>
> Totally typo’d the execution time: it was 0.254 MILLISECONDS, not
> SECONDS. Thus my comment about going up 10x in bucket size instead of
> appearing to be right at the limit. Sorry!
>
>
>
I figured. :)
Haven't ruled it out, but the expectation for this implementation is that it
performs at worst 3X slower than memcache or Redis.
Bigger buckets mean a wider spread of response times. Some buckets
may contain 140k records and others 100X more.
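On the spread point, a back-of-the-envelope sketch suggests why a 100X difference in bucket size may barely move indexed-lookup cost: a btree lookup touches roughly one page per index level, and the number of levels grows only with the log of the row count. The fanout figure below is an illustrative assumption, not a measured value.

```python
import math

# Rough sketch: btree levels grow as log(rows) / log(fanout).
# A fanout of ~300 keys per 8 kB page is an illustrative assumption
# for an int-keyed index, not a measured PostgreSQL figure.
def btree_levels(rows, fanout=300):
    return max(1, math.ceil(math.log(rows, fanout)))

small = btree_levels(140_000)      # a 140k-row bucket
large = btree_levels(14_000_000)   # a bucket 100X larger
print(small, large)  # both come out at 3 levels under this assumption
```

Real depth depends on key width and page fill, but the point stands: index depth, and hence page touches per lookup, changes very slowly with bucket size, which is consistent with Mike's flat timings across widely spaced ids.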
-Greg