From: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Tuning massive UPDATES and GROUP BY's?
Date: 2011-03-14 13:34:28
Message-ID: AANLkTikqxB3EmnvgxYEGPwOg50r0K1JD=u651puWOzqW@mail.gmail.com
Lists: pgsql-performance
On Mon, Mar 14, 2011 at 4:17 AM, Marti Raudsepp <marti(at)juffo(dot)org> wrote:
> On Sun, Mar 13, 2011 at 18:36, runner <runner(at)winning(dot)com> wrote:
> > Other than being very inefficient, and consuming
> > more time than necessary, is there any other down side to importing
> > into an indexed table?
>
> Doing so will result in somewhat larger (more bloated) indexes, but
> generally the performance impact of this is minimal.
>
>
I've done bulk data imports of this size with minimal pain by simply
breaking the raw data into chunks (10M records becomes 10 files of 1M
records), kept on a separate spindle from the database, and running
multiple COPY commands concurrently, but no more than one COPY per server
core. I tested this a while back on a 4-core server: when I attempted 5
COPYs at a time, the time to complete went up almost 30%. I don't recall
any benefit from running fewer than 4 in this case, but the server was
only processing my data at the time. Indexes were left on the target
table; however, I dropped all constraints. The UNIX split command is
handy for breaking the data up into individual files. A rough sketch of
the approach is below.
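
Something along these lines (a minimal sketch only; the file paths,
database name, and target table "import" are made up, and it assumes
CSV input on a 4-core server):

  # break the raw data into 1M-record files on a separate spindle
  split -l 1000000 /fastdisk/bigdump.csv /fastdisk/chunk_

  # run the COPYs in parallel, but no more than one per core (-P 4)
  ls /fastdisk/chunk_* | \
    xargs -P 4 -I{} psql -d mydb -c "\copy import from '{}' with csv"
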
Greg