Re: large database

From: Jan Kesten <jan(at)dafuer(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: large database
Date: 2012-12-11 08:10:23
Message-ID: 50C6EA6F.1050903@dafuer.de
Lists: pgsql-general

Hi Mihai.

> We are now at the point where the csv files are all created and amount
> to some 300 GB of data.

> I would like to get some advice on the best deployment option.

First - and maybe best - advice: Do some testing on your own and plan
some time for this.

> First, the project has been started using MySQL. Is it worth switching
> to Postgres and if so, which version should I use?

When switching to PostgreSQL I would recommend using the latest stable
version. But your project is already running on MySQL - are there issues
you expect to solve by switching to another database system? If not:
why switch?

> Second, where should I deploy it? The cloud or a dedicated box?

Given 1 TB of storage, the x-large instance and 10000 provisioned IOPS
would mean about 2000 USD per month for a 100% utilized instance on
Amazon. This is not really ultra-cheap ;-) For the cost of two months of
running that, you can get a dedicated Dell server with eight drives, buy
two extra SSDs and have full control. But things get much cheaper if you
do not really need IOPS at such a high rate.
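
To make that comparison concrete, here is a rough break-even sketch. It
assumes the 2000 USD figure is a monthly cost and that the dedicated
server costs about as much as two months of cloud rental; the 4000 USD
price and the function name are my own placeholders, not a real quote.
It also ignores hosting, power and admin time for the dedicated box.

```python
# Break-even sketch: how many months of cloud rental equal the one-off
# price of a dedicated server. All figures are rough assumptions.

def break_even_months(server_price_usd: float, cloud_monthly_usd: float) -> float:
    """Return the number of months of cloud cost that equal the server price."""
    if cloud_monthly_usd <= 0:
        raise ValueError("cloud_monthly_usd must be positive")
    return server_price_usd / cloud_monthly_usd


cloud_monthly_usd = 2000.0  # estimate from the text, assumed to be per month
server_price_usd = 4000.0   # assumed: roughly two months of cloud cost

months = break_even_months(server_price_usd, cloud_monthly_usd)
print(f"Dedicated server pays for itself after {months:.1f} months")
```

Plug in the quote you actually get and the IOPS you actually need; if
the real IOPS requirement is much lower, the monthly cloud figure drops
and the break-even point moves out accordingly.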

Also, if you use a cloud infrastructure but need your data on a local
system, keep network latency in mind.

We have several huge PostgreSQL databases running and have used
OpenIndiana with ZFS and SSDs for data storage for quite a while now, and
it works perfectly.

There are some slides from Sun/Oracle about ZFS, ZIL, SSD and PostgreSQL
performance (I can look for them if needed).

> Alternatively I looked at a Dell server with 32 GB of RAM and some
> really good hard drives. But such a box does not come cheap and I don't
> want to keep the pieces if it doesn't cut it

Just a hint: Do not simply look at Dell's prices - phone them and get a
quote. I was surprised (but do not buy SSDs there).

Think about how your data is structured and how it will be queried after
it has been imported into the database to see where your bottlenecks are.

Cheers,
Jan
