Re: scaling postgres

From: Jony Cohen <jony(dot)cohenjo(at)gmail(dot)com>
To: Chris Withers <chris(at)simplistix(dot)co(dot)uk>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: scaling postgres
Date: 2015-08-03 07:15:02
Message-ID: CAD9xk1_1iEACYyXfLWGNv7uUFzLFaz6A3XczV8HSUY-2j0zXJQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,
I'd recommend looking into 2 solutions here, depending if you want to keep
the data or not and how "fault tolerant" is the app (could you loose data
from the last hour?)
If you could loose some data and you don't intend keeping the whole data
set - I'd look at using RAM as your storage - it's the fastest place to
store data and you can easily get servers with lot's of RAM these days.
If you can't loose data then you'll need to use disks - depending on how
big each row is, compute your expected read/write throughput and go
shopping :)
for 1kb rows you get 60GB per hour = 16MB per sec - simple disks can handle
this.
for 10kb rows you get 160MB/s - now you need better disks :)

SSD disks are cheep these days but they don't like repeated writes/deletes
so it might cause problems down the line (hence my first RAM recommendation)

as for keeping the raw data - you could easily do it if you use partitions,
if you have daily partitions inheriting from a master table you can quickly
access the last day (or even several days)
but do take note that a full table scan takes time and you'll need to
manage writing to the right partition yourself (not that hard - simply
insert to <tablename>_<date>)

if you can write the data daily, keeping it will not add any real load (a
little on parse times for queries that access the master table)

Just my 2 cents,
Regards,
- Jony

On Mon, Aug 3, 2015 at 9:53 AM, Chris Withers <chris(at)simplistix(dot)co(dot)uk>
wrote:

> Hi All,
>
> I'm curious if there are recommendations for scaling postgres to what, for
> me, seems like "a lot" of data...
>
> The app in question currently writes around 1.5 billion rows into a table
> before rolling them up into tables that have a few million roll up rows
> each. That 1.5 billion row table is emptied and refilled each day, so we're
> talking about quite high write as well as quite high read. Where can I find
> could examples/docs of how to scale postgres for this kind of data load?
> What sort of hardware would I be looking to spec?
>
> Okay, now this app may well eventually want to progress to storing those
> 1.5 billion rows per day. Is that feasible with postgres? If not, what
> storage and processing solutions would people recommend for that kind of
> data load?
>
> cheers,
>
> Chris
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Chris Withers 2015-08-03 07:22:15 Re: scaling postgres
Previous Message Chris Withers 2015-08-03 06:53:29 scaling postgres