Re: Postgres for a "data warehouse", 5-10 TB

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: sthomas(at)peak6(dot)com
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, Andy Colson <andy(at)squeakycode(dot)net>, Igor Chudov <ichudov(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Postgres for a "data warehouse", 5-10 TB
Date: 2011-09-12 19:48:57
Message-ID: CAOR=d=2LG26r0NH-+wn7FhTnsR1G1r8gveN9YWDfZW9Z+3dvsQ@mail.gmail.com
Lists: pgsql-performance

On Mon, Sep 12, 2011 at 10:22 AM, Shaun Thomas <sthomas(at)peak6(dot)com> wrote:
> On 09/11/2011 12:02 PM, Marti Raudsepp wrote:
>
>> Which brings me to another important point: don't do lots of small
>> write transactions, SAVEPOINTs or PL/pgSQL subtransactions. Besides
>> being inefficient, they introduce a big maintenance burden.
>
> I'd like to second this. Before a notable application overhaul, we were
> handling about 300-million transactions per day (250M of that was over a
> 6-hour period). To avoid the risk of mid-day vacuum-freeze, we disabled
> autovacuum and run a nightly vacuum over the entire database. And that was
> *after* bumping autovacuum_freeze_max_age to 600-million.
>
> You do *not* want to screw with that if you don't have to, and a setting of
> 600M is about 1/3 of the reasonable boundary there. If not for the forced
> autovacuums, a database with this much traffic would be corrupt in less than
> a week. We've managed to cut that transaction traffic by 60%, and it greatly
> improved the database's overall health.

I put it to you that your hardware has problems if you have a pg db
that's corrupting from having too much vacuum activity. I've had
exactly one pg corruption problem in the past, and it was a bad SATA
hard drive on a stats server. I have four 48-core Opterons running
quite hard during the day, have autovacuum on and VERY aggressively
tuned, and have had zero corruption issues in over 3 years of hard
running.
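
For anyone wanting to sanity-check the figures quoted above, here is a
rough back-of-the-envelope sketch. It assumes every one of the 300M daily
transactions consumes a transaction ID (read-only transactions typically
do not) and takes 2**31 as the wraparound horizon, which is a little
higher than the margin PostgreSQL actually allows. The constant and
function names are made up for illustration.

```python
# Back-of-the-envelope XID arithmetic using the figures quoted in the thread.
# Assumptions (not from the original message):
#   - every transaction consumes one transaction ID
#   - roughly 2**31 IDs are usable before wraparound becomes a problem

TXNS_PER_DAY = 300_000_000      # quoted daily transaction volume
FREEZE_MAX_AGE = 600_000_000    # quoted autovacuum_freeze_max_age
XID_HORIZON = 2**31             # assumed wraparound horizon


def days_until(xid_budget: int, txns_per_day: int) -> float:
    """Days needed to consume xid_budget transaction IDs at a steady rate."""
    return xid_budget / txns_per_day


# Time until a table left unvacuumed triggers a forced anti-wraparound vacuum.
days_to_forced_vacuum = days_until(FREEZE_MAX_AGE, TXNS_PER_DAY)

# Time until the assumed horizon is reached if nothing is ever frozen.
days_to_wraparound = days_until(XID_HORIZON, TXNS_PER_DAY)

# Fraction of the assumed horizon taken up by the freeze setting.
freeze_fraction = FREEZE_MAX_AGE / XID_HORIZON

print(f"forced autovacuum after ~{days_to_forced_vacuum:.1f} days")
print(f"wraparound horizon after ~{days_to_wraparound:.1f} days")
print(f"freeze_max_age is ~{freeze_fraction:.0%} of the horizon")
```

Under those assumptions the forced vacuum kicks in every couple of days
and the horizon is on the order of a week away, which is broadly in line
with the numbers in the quoted message.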
