From: Sébastien Lorion <sl(at)thestrangefactory(dot)com>
To: John R Pierce <pierce(at)hogranch(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Amazon High I/O instances
Date: 2012-09-13 00:25:29
Message-ID: CAGa5y0NGGxj5i-6YywfqLCWumB-=b7vNVCuJAMpza8wGWzgtHw@mail.gmail.com
Lists: pgsql-general
The DB back-end of my application has 2 use cases:
- a normalized master DB, sharded by userid (based on their activity rather
than a formula such as modulo, because some users can be 1-2 orders of
magnitude more active than others; a rough sketch follows below)
- many denormalized read-only slaves, with somewhat different models
depending on the kind of queries they serve
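Just to illustrate the first point, shard assignment by activity could look
something like this minimal Python sketch (names and numbers are purely
hypothetical, not my actual code):

N_SHARDS = 8

# userid -> shard, filled in when a user is provisioned, based on current
# per-shard load rather than a fixed formula like userid % N_SHARDS
shard_map = {}
shard_load = {i: 0 for i in range(N_SHARDS)}

def assign_shard(userid, expected_activity=1):
    # place a new user on the currently least-loaded shard
    shard = min(shard_load, key=shard_load.get)
    shard_map[userid] = shard
    shard_load[shard] += expected_activity
    return shard

def shard_for(userid):
    # fall back to modulo only for users that were never explicitly placed
    return shard_map.get(userid, userid % N_SHARDS)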
All requests are queued in RabbitMQ, and most writes are fire-and-forget:
calculations are done on the client assuming the write succeeded, and the UI
is refreshed with the real values at most a couple of seconds later.
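The fire-and-forget part is just a plain publish with no confirmation
round-trip, something along these lines with pika (the queue name and message
shape below are hypothetical):

import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="db_writes", durable=True)

def enqueue_write(userid, payload):
    # publish and return immediately; the client assumes the write will work
    channel.basic_publish(
        exchange="",
        routing_key="db_writes",
        body=json.dumps({"userid": userid, "payload": payload}),
        properties=pika.BasicProperties(delivery_mode=2),  # persistent message
    )

enqueue_write(42, {"action": "update_score", "delta": 10})
connection.close()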
At the moment, I have a total of 73 tables, 432 columns and 123 relations, so
not overly complex, but not a key-value store either. Most queries are
localized to a sub-system, which on average has 5-10 tables.
So far, about 10% of traffic hits the master DB (a lot, I know, but this is
an interactive application), which is the one that really concerns me. Users
make an average of 1 write request every 5 seconds, so with, say, 100,000
concurrent users, that makes 20,000 tx/s. That said, that number could grow
overnight, and I do not want to be one more startup that redoes its system
under the worst of conditions, as I read about too often. I have time to at
least prepare a bit, without overdoing it, of course.
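For what it is worth, the back-of-the-envelope arithmetic behind that figure:

# 1 write every 5 seconds per concurrent user, all hitting the master
concurrent_users = 100000
writes_per_user_per_sec = 1 / 5

print(concurrent_users * writes_per_user_per_sec)  # 20000.0 tx/s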
Sébastien
On Wed, Sep 12, 2012 at 7:24 PM, John R Pierce <pierce(at)hogranch(dot)com> wrote:
> On 09/12/12 4:03 PM, Sébastien Lorion wrote:
>
>> I agree 1GB is a lot, I played around with that value, but it hardly
>> makes a difference. Is there a plateau in how that value affects query
>> performance ? On a master DB, I would set it low and raise as necessary,
>> but what would be a good average value on a read-only DB with same spec and
>> max_connections ?
>>
>
> a complex query can require several times work_mem for sorts and hash
> merges. how many queries do you expect to ever be executing
> concurrently? I'll take 25% of my system memory and divide it by
> 'max_connections' and use that as work_mem for most cases.
>
> on a large memory system doing dedicated transaction processing, I
> generally shoot for about 50% of the server memory as disk cache, 1-2GB as
> shared_buffers, 512MB-2GB as maintenance_work_mem, and 20-25% as work_mem
> (divided by max_connections)
>
>
>
>
> --
> john r pierce N 37, W 122
> santa cruz ca mid-left coast
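For concreteness, John's sizing rule above works out to something like this
(the instance size and max_connections below are made-up figures for
illustration, not my actual configuration):

total_mem_gb = 64
max_connections = 200

disk_cache_gb = total_mem_gb * 0.50         # ~50% left to the OS page cache
shared_buffers_gb = 2                       # 1-2 GB as suggested
maintenance_work_mem_gb = 1                 # somewhere in the 512 MB - 2 GB range
work_mem_mb = total_mem_gb * 0.25 * 1024 / max_connections

print(round(work_mem_mb))                   # ~82 MB per sort/hash operation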