From: Mihai Popa <mihai(at)lattica(dot)com>
To: Bill Moran <wmoran(at)potentialtech(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: large database
Date: 2012-12-11 15:28:12
Message-ID: 50C7510C.3030605@lattica.com
Lists: pgsql-general
On 12/11/2012 07:27 AM, Bill Moran wrote:
> On Mon, 10 Dec 2012 15:26:02 -0500 (EST) "Mihai Popa" <mihai(at)lattica(dot)com> wrote:
>
>> Hi,
>>
>> I've recently inherited a project that involves importing a large set of
>> Access mdb files into a Postgres or MySQL database.
>> The process is to export the mdb's to comma-separated files, then import
>> those into the final database.
>> We are now at the point where the csv files are all created and amount
>> to some 300 GB of data.
>>
>> I would like to get some advice on the best deployment option.
>>
>> First, the project has been started using MySQL. Is it worth switching
>> to Postgres and if so, which version should I use?
> I've been managing a few large databases this year, on both PostgreSQL and
> MySQL.
>
> Don't put your data in MySQL. Ever. If you feel like you need to use
> something like MySQL, just go straight to a system that was designed with
> no constraints right off the bat, like Mongo or something.
I've never worked with MySQL before; I did work with Postgres a lot over the last few years, but never with such large databases, so I cannot really choose one over the other; hence my posting :)
> and the fact that if you use anything other than INT AUTO_INCREMENT for
> your primary key you're liable to hit on awful inefficiencies.
Unfortunately, I don't know much yet about the usage pattern; all I know is that the data is mostly read-only. There will be a few updates every year, but they will probably happen as batch jobs overnight.
Meanwhile, it appears there is a lot more of it: 800 GB rather than the 300 initially thought.
There aren't a lot of tables, so each will have a large number of rows.
I guess Chris was right: I have to better understand the usage pattern and do some testing of my own.
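By testing I mean something along these lines for the initial import and the nightly batch jobs (just a minimal sketch; pages_staging, its columns and the file path are made-up placeholders, not the real schema):

    -- staging table matching one exported CSV's layout (columns invented for the example)
    CREATE TABLE pages_staging (
        id    bigint,
        title text,
        body  text
    );

    -- server-side bulk load of one exported file
    COPY pages_staging FROM '/data/export/pages.csv' WITH (FORMAT csv, HEADER true);

    -- or client-side through psql, if the CSVs live on another machine:
    -- \copy pages_staging from 'pages.csv' csv header

I'd time a few loads like this on both engines before committing to either.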
I was just hoping my hunch about Amazon being the better alternative would be confirmed, but this does not seem to be the case; most of you recommend purchasing a box.
I want to thank everyone for the input, really appreciate it!
regards,
mihai