Re: large database

From: Mihai Popa <mihai(at)lattica(dot)com>
To: Bill Moran <wmoran(at)potentialtech(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: large database
Date: 2012-12-11 15:28:12
Message-ID: 50C7510C.3030605@lattica.com
Lists: pgsql-general

On 12/11/2012 07:27 AM, Bill Moran wrote:
> On Mon, 10 Dec 2012 15:26:02 -0500 (EST) "Mihai Popa" <mihai(at)lattica(dot)com> wrote:
>
>> Hi,
>>
>> I've recently inherited a project that involves importing a large set of
>> Access mdb files into a Postgres or MySQL database.
>> The process is to export the mdb's to comma-separated files, then import
>> those into the final database.
>> We are now at the point where the csv files are all created and amount
>> to some 300 GB of data.
>>
>> I would like to get some advice on the best deployment option.
>>
>> First, the project has been started using MySQL. Is it worth switching
>> to Postgres and if so, which version should I use?
> I've been managing a few large databases this year, on both PostgreSQL and
> MySQL.
>
> Don't put your data in MySQL. Ever. If you feel like you need to use
> something like MySQL, just go straight to a system that was designed with
> no constraints right off the bat, like Mongo or something.

I've never worked with MySQL before. I have worked with Postgres a lot
over the last few years, but never with databases this large, so I cannot
really choose one over the other; hence my posting :)

> and the fact that if you use anything other than INT AUTO_INCREMENT for
> your primary key you're liable to hit on awful inefficiencies.

Unfortunately, I don't know much yet about the usage pattern. All I know
is that the data is mostly read-only; there will be a few updates every
year, but they will probably run as overnight batch jobs.
Meanwhile, it appears there is a lot more data than initially thought:
800 GB rather than 300 GB.
There aren't many tables, so each will have a large number of rows.

I guess Chris was right: I have to understand the usage pattern better
and do some testing of my own.
I was hoping my hunch about Amazon being the better alternative would be
confirmed, but that does not seem to be the case; most of you recommend
purchasing a box.

Thank you all for the input; I really appreciate it!

regards,
mihai
