Re: Postgresql Hardware - Recommendations

From: Matthew Nuzum <mattnuzum(at)gmail(dot)com>
To: "Christian(dot)Kastner(at)gutmann(dot)at" <Christian(dot)Kastner(at)gutmann(dot)at>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgresql Hardware - Recommendations
Date: 2005-09-06 15:09:10
Message-ID: f3c0b408050906080913166703@mail.gmail.com
Lists: pgsql-performance

On 9/5/05, Christian(dot)Kastner(at)gutmann(dot)at <Christian(dot)Kastner(at)gutmann(dot)at> wrote:
> ... The only info I can
> give so far is that the database size is about 60GB, and that it will be
> frequently accessed by multiple users (about 100 will be connected
> during business hours). The applications accessing the database are
> mostly reporting tools.

Optimizing hardware for mostly selects is different from optimizing
for lots of inserts. You will get good responses from this list if you
can give a little more detail. Here are some questions:
How do you get your data into the db? Do you do bulk loads at
periodic intervals during the day? Do you do frequent updates/inserts?
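
For instance (just a sketch; the table and file names below are made
up), a periodic bulk load is usually a single COPY, while OLTP-style
activity is many small INSERTs:

  -- nightly bulk load: one COPY statement pulls in the whole extract
  COPY daily_sales FROM '/data/extracts/sales.csv' WITH CSV;

  -- versus many small transactions spread over the day
  INSERT INTO daily_sales (sale_id, amount, sold_at)
  VALUES (12345, 19.99, now());

The two patterns stress the disks and WAL very differently, which is
why the answer matters for hardware.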

You say reporting; do you use many stored procedures and calculations
on the server side? I've used some reporting apps that simply grab
tons of data from the server and then process it on the client side
(ODBC apps seem to do this), while other applications formulate the
queries and use stored procedures so that little data is transferred.
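
To make that concrete (hypothetical table and columns), compare a
report that aggregates on the server with one that ships every raw
row to the client:

  -- server-side: the database does the work, a few hundred rows come back
  SELECT region, date_trunc('month', sold_at) AS month, sum(amount) AS total
  FROM sales
  GROUP BY region, date_trunc('month', sold_at);

  -- client-side: millions of raw rows cross the network to the reporting tool
  SELECT * FROM sales WHERE sold_at >= '2005-01-01';

The first pattern leans on CPU and disk in the server, the second
mostly on the network.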

Of your 60GB, how much of that is active? Does your budget allow you
to buy enough RAM to get your active data into the disk cache? For
reporting, this *might* be your biggest win.
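
One rough way to size the hot tables (relpages is in 8KB blocks and
is only as fresh as your last VACUUM/ANALYZE, so treat this as an
estimate):

  -- approximate on-disk size of the largest tables, in MB
  SELECT relname, relpages * 8 / 1024 AS approx_mb
  FROM pg_class
  WHERE relkind = 'r'
  ORDER BY relpages DESC
  LIMIT 20;

If the tables your reports actually touch add up to something that
fits in RAM, buying that RAM is cheap compared to chasing disk speed.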

Here are some scenarios:
S1: Bulk uploads once or twice daily of about 250 MB of data. Few
inserts and updates during the day (1-2%). Reporting is largely done
on data from the last 5 business days. In this case you have < 2GB of
active data and your disk cache will hold all of your active data in
RAM (provided your db structure is disk-cache friendly). An example of
this I have experienced is a sales application that queries current
inventory; telephone agents queried, queried, and queried the
in-stock inventory.
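
On a box like that, a postgresql.conf in this direction (numbers are
only illustrative for ~8GB of RAM on an 8.0-era server, where these
settings are in 8KB pages) tells the planner most of the active set
is cached:

  # illustrative settings, assuming ~8GB RAM and <2GB of active data
  shared_buffers = 50000          # roughly 400MB
  effective_cache_size = 750000   # roughly 6GB; tells the planner the OS cache is large
  random_page_cost = 2            # cached data makes random reads cheaper than the default of 4

Don't treat those numbers as gospel; the point is that the planner
should know the cache is big.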

S2: Same as above but reporting is largely done on data covering 200+
business days. It's doubtful that you will get 50GB of RAM in your
server, so you need to focus on disk speed. An example of this I have
experienced was an application that looked at sales trends and
performed commission calculations and projected sales forecasts.

S3: Lots of inserts/updates throughout the day (15 - 25%) - you need
to focus on disk speed. The content management system my employer
develops fits this model.
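
In that write-heavy case, besides fast disks (and ideally WAL on its
own spindles), the knobs I'd look at first are along these lines
(only a sketch; defaults differ between versions):

  # illustrative write-oriented settings for an 8.0-era server
  checkpoint_segments = 16    # fewer, larger checkpoints under heavy insert/update load
  wal_buffers = 64            # in 8KB pages on 8.0
  fsync = on                  # leave it on; spend the money on a battery-backed RAID controller

But the biggest single win there is usually the disk subsystem
itself, not the config.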

> 3) CPUs vs Memory
> Would you rather have 4x CPUs and 8GB of memory, or 2x CPUs with 16GB of
> memory?

Very hard to say without knowing your application. I have limited
experience but what I've found is that applications that support
multiple db architectures do not fully utilize the database server and
CPU utilization is low while disk and network I/O are high. I don't
know if your application supports multiple backends, but chances are good
your biggest wins will come from RAM, disk and network investments.

--
Matthew Nuzum
www.bearfruit.org
