From: Igor Chudov <ichudov(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres for a "data warehouse", 5-10 TB
Date: 2011-09-11 13:59:16
Message-ID: CAMhtkAah2c4XfSec=OtgL1V51wpD=jygbZnBVBRYCV02MscebQ@mail.gmail.com
Lists: pgsql-performance
On Sun, Sep 11, 2011 at 7:52 AM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> wrote:
> On Sun, Sep 11, 2011 at 6:35 AM, Igor Chudov <ichudov(at)gmail(dot)com> wrote:
> > I have a server with about 18 TB of storage and 48 GB of RAM, and 12
> > CPU cores.
>
> 1 or 2 fast cores is plenty for what you're doing.
I need those cores for other tasks, such as image manipulation with
ImageMagick, XML generation and parsing, etc.
> But the drive
> array and how it's configured etc are very important. There's a huge
> difference between 10 2TB 7200RPM SATA drives in a software RAID-5 and
> 36 500G 15kRPM SAS drives in a RAID-10 (SW or HW would both be ok for
> data warehouse.)
Well, right now, my server has twelve 7,200 RPM 2TB hard drives in a RAID-6
configuration.
They are managed by a 3ware 9750 RAID card.
I would say that I am not very concerned with raw linear read speed; if that
stuff is somewhat slow, it is OK with me. What I want to avoid is severe
degradation of performance as the data grows (time complexity worse than
O(1)), disastrous REPAIR TABLE operations, etc.
> > I do not know much about Postgres, but I am very eager to learn and
> > see if I can use it for my purposes more effectively than MySQL.
> > I cannot shell out $47,000 per CPU for Oracle for this project.
> > To be more specific, the batch queries that I would do, I hope,
>
> Hopefully, if need be, you can spend some small percentage of that on a
> fast IO subsystem.
>
>
I am actually open for suggestions here.
> > would either use small JOINS of a small dataset to a large dataset, or
> > just SELECTS from one big table.
> > So... Can Postgres support a 5-10 TB database with the use pattern
> > stated above?
>
> I use it on a ~3TB DB and it works well enough. Fast IO is the key
> here. Lots of drives in RAID-10 or HW RAID-6 if you don't do a lot of
> random writing.
>
I do not plan to do a lot of random writing. My current design is that my
Perl scripts write to a temporary table every week, and then I do an
INSERT ... ON DUPLICATE KEY UPDATE.
By the way, does that INSERT ... ON DUPLICATE KEY UPDATE functionality, or
something like it, exist in Postgres?
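
For reference, a minimal sketch of the usual Postgres equivalents; the table
and column names here (target, staging, id, value) are hypothetical. Before
version 9.5 the common pattern is an explicit UPDATE followed by an INSERT of
the missing rows; 9.5 and later have a native upsert.

-- Portable pattern (any version): update the rows that already exist,
-- then insert the rows that are still missing. Not safe against
-- concurrent writers without locking, but fine for a single weekly
-- batch job.
BEGIN;

UPDATE target t
   SET value = s.value
  FROM staging s
 WHERE t.id = s.id;

INSERT INTO target (id, value)
SELECT s.id, s.value
  FROM staging s
 WHERE NOT EXISTS (SELECT 1 FROM target x WHERE x.id = s.id);

COMMIT;

-- Postgres 9.5+: native upsert in one statement.
INSERT INTO target (id, value)
SELECT id, value FROM staging
ON CONFLICT (id) DO UPDATE SET value = EXCLUDED.value;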