Re: Postgres for a "data warehouse", 5-10 TB

From: Shaun Thomas <sthomas(at)peak6(dot)com>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: Marti Raudsepp <marti(at)juffo(dot)org>, Andy Colson <andy(at)squeakycode(dot)net>, Igor Chudov <ichudov(at)gmail(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Postgres for a "data warehouse", 5-10 TB
Date: 2011-09-12 20:04:40
Message-ID: 4E6E65D8.4050403@peak6.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 09/12/2011 02:48 PM, Scott Marlowe wrote:

> I put it to you that your hardware has problems if you have a pg db
> that's corrupting from having too much vacuum activity.

What? No. We optimized by basically forcing autovacuum to never run
during our active periods. We never actually encountered wrap-around
corruption. I was just saying that 600M is a relatively high setting for
autovacuum_freeze_max_age. :)

I was alluding to the fact that if a DBA had his system running for a
week at our transaction level, and PG didn't have forced auto vacuum,
and their maintenance lapsed even slightly, they could end up with a
corrupt database. Not too far-fetched for someone coming from MySQL, really.

Our problem is we run a financial site, and the front-end very
aggressively monitors network and database timeouts. The limit is
sufficiently low that a vacuum would cause enough IO to trigger
application timeouts, even with vacuum_cost_delay. And of course,
setting vacuum_cost_delay too high quickly triples or quadruples vacuum
times. Now that we're using FusionIO cards, I've been thinking about
turning autovacuum back on, but I want to run some tests first.

My point stands, though. Don't go crazy with transactions until you know
your config can stand up to it, and reduce if possible. We found some
tweak points that drastically reduced transaction count with no
detrimental effect on the app itself, so we jumped on them.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas(at)peak6(dot)com

______________________________________________

See http://www.peak6.com/email-disclaimer/ for terms and conditions related to this email

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Marti Raudsepp 2011-09-12 20:19:29 Re: Postgres for a "data warehouse", 5-10 TB
Previous Message Scott Marlowe 2011-09-12 19:48:57 Re: Postgres for a "data warehouse", 5-10 TB