Re: Amazon High I/O instances

From: Sébastien Lorion <sl(at)thestrangefactory(dot)com>
To: John R Pierce <pierce(at)hogranch(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Amazon High I/O instances
Date: 2012-09-13 00:25:29
Message-ID: CAGa5y0NGGxj5i-6YywfqLCWumB-=b7vNVCuJAMpza8wGWzgtHw@mail.gmail.com
Lists: pgsql-general

The DB back-end of my application has 2 use cases:

- a normalized master DB, sharded by userid (based on their activity, not a
formula such as modulo, because some users can be 1-2 orders of magnitude
more active than others)

- many denormalized read-only slaves, with some different models depending
on what kind of queries they are serving
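A minimal sketch of the activity-based shard assignment described in the first point (names and numbers are hypothetical, not the actual implementation): instead of hashing the userid, place each user on the currently least-loaded shard, weighted by observed activity.

```python
# Hypothetical sketch: assign each user to the least-loaded shard by
# accumulated activity, instead of a fixed formula such as userid % n.
def assign_shard(shard_load, user_activity):
    """Pick the shard with the lowest total activity and record the
    user's expected load on it. shard_load maps shard id -> activity."""
    shard = min(shard_load, key=shard_load.get)
    shard_load[shard] += user_activity
    return shard

# Example: two very heavy users (100x an average user) do not end up
# on the same shard, which plain modulo sharding cannot guarantee.
load = {"shard_a": 0, "shard_b": 0}
heavy_1 = assign_shard(load, 100)
heavy_2 = assign_shard(load, 100)
```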

All requests are queued in RabbitMQ, and most writes are fire-and-forget:
calculations are done on the client assuming the write worked fine, and the
UI is refreshed at most a couple of seconds later with the real values.

At the moment, I have 73 tables, 432 columns and 123 relations in total, so
not overly complex, but not a key-value store either .. Most queries are
localized to a sub-system, which on average has 5-10 tables.

So far, about 10% of traffic hits the master DB (a lot, I know, but this is
an interactive application), which is the one that really concerns me. Users
make on average 1 write request every 5 sec., so with, say, 100,000
concurrent users, that makes 20,000 tx/s. That said, that number could grow
overnight, and I do not want to be one more startup that has to redo its
system under the worst of conditions, as I read about too often. I have time
to at least prepare a bit, without overdoing it, of course..
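The arithmetic above, as a quick back-of-the-envelope check (figures taken from this message, purely illustrative):

```python
# Write throughput estimate from the figures above.
concurrent_users = 100_000
seconds_per_write = 5          # 1 write request every 5 s per user

writes_per_second = concurrent_users / seconds_per_write
print(writes_per_second)       # 20000.0 writes (tx) per second
```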

Sébastien

On Wed, Sep 12, 2012 at 7:24 PM, John R Pierce <pierce(at)hogranch(dot)com> wrote:

> On 09/12/12 4:03 PM, Sébastien Lorion wrote:
>
>> I agree 1GB is a lot, I played around with that value, but it hardly
>> makes a difference. Is there a plateau in how that value affects query
>> performance? On a master DB, I would set it low and raise as necessary,
>> but what would be a good average value on a read-only DB with the same
>> spec and max_connections?
>>
>>
>
> a complex query can require several times work_mem for sorts and hash
> merges. how many queries do you expect to ever be executing
> concurrently? I'll take 25% of my system memory and divide it by
> 'max_connections' and use that as work_mem for most cases.
>
> on a large memory system doing dedicated transaction processing, I
> generally shoot for about 50% of the server memory as disk cache, 1-2GB as
> shared_buffers, 512MB-2GB as maintenance_work_mem, and 20-25% as work_mem
> (divided by max_connections)
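John's sizing rule above can be sketched numerically as follows. The 64 GB server and 200 connections are made-up figures, not from the thread; the other settings are single illustrative picks from the quoted ranges.

```python
# Hypothetical sizing pass following the rule above: ~25% of system
# memory divided by max_connections as work_mem, ~50% left to the OS
# disk cache, shared_buffers and maintenance_work_mem in the quoted
# ranges. Server size and connection count are assumptions.
GB = 1024**3
MB = 1024**2

system_memory = 64 * GB        # assumed server size (not from the thread)
max_connections = 200          # assumed setting (not from the thread)

work_mem = int(0.25 * system_memory / max_connections)
disk_cache = int(0.50 * system_memory)   # left to the OS page cache
shared_buffers = 2 * GB                  # within the quoted 1-2GB range
maintenance_work_mem = 1 * GB            # within the quoted 512MB-2GB range

print(f"work_mem = {work_mem // MB} MB per sort/hash operation")
```

Note that work_mem is a per-sort/per-hash limit, not per-connection, so a complex query can consume several multiples of it, which is why the divisor is the full max_connections.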
>
>
>
>
> --
> john r pierce N 37, W 122
> santa cruz ca mid-left coast
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
