Re: Tuning massive UPDATES and GROUP BY's?

From: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Tuning massive UPDATES and GROUP BY's?
Date: 2011-03-14 13:34:28
Message-ID: AANLkTikqxB3EmnvgxYEGPwOg50r0K1JD=u651puWOzqW@mail.gmail.com
Lists: pgsql-performance

On Mon, Mar 14, 2011 at 4:17 AM, Marti Raudsepp <marti(at)juffo(dot)org> wrote:

> On Sun, Mar 13, 2011 at 18:36, runner <runner(at)winning(dot)com> wrote:
> > Other than being very inefficient, and consuming
> > more time than necessary, is there any other down side to importing
> > into an indexed table?
>
> Doing so will result in somewhat larger (more bloated) indexes, but
> generally the performance impact of this is minimal.
>
>
I've done bulk data imports of this size with minimal pain by simply breaking
the raw data into chunks (10M records becomes 10 files of 1M records each),
keeping the files on a separate spindle from the database, and running
multiple COPY commands concurrently, but no more than one COPY per server
core. I tested this a while back on a 4-core server: when I attempted 5
COPYs at a time, the time to complete went up almost 30%. I don't recall any
benefit from running fewer than 4 in this case, but the server was only
processing my data at the time. The indexes stayed on the target table,
though I dropped all constraints. The UNIX split command is handy for
breaking the data up into individual files.
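
Something along these lines, roughly (the paths, database, and table names
below are just placeholders for illustration, and xargs -P is one way to cap
the concurrency at the core count):

    # Split the raw file into ~1M-line chunks on a separate spindle
    # (all names here are made up for the example).
    split -l 1000000 /data/raw/import.csv /otherdisk/chunks/part_

    # Run the COPYs concurrently, no more than one per core (-P 4 here).
    ls /otherdisk/chunks/part_* | xargs -P 4 -I{} \
        psql -d mydb -c "\copy target_table FROM '{}' WITH CSV"

Adjust -P to match your core count, and note the ls | xargs pipe assumes the
chunk file names contain no whitespace.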

Greg
