From: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Tuning massive UPDATES and GROUP BY's?
Date: 2011-03-14 13:34:28
Message-ID: AANLkTikqxB3EmnvgxYEGPwOg50r0K1JD=u651puWOzqW@mail.gmail.com
Lists: pgsql-performance
On Mon, Mar 14, 2011 at 4:17 AM, Marti Raudsepp <marti(at)juffo(dot)org> wrote:
> On Sun, Mar 13, 2011 at 18:36, runner <runner(at)winning(dot)com> wrote:
> > Other than being very inefficient, and consuming
> > more time than necessary, is there any other down side to importing
> > into an indexed table?
>
> Doing so will result in somewhat larger (more bloated) indexes, but
> generally the performance impact of this is minimal.
>
>
I've done bulk data imports of this size with minimal pain by simply
breaking the raw data into chunks (10M records becomes 10 files of 1M
records), kept on a separate spindle from the database, and running
multiple COPY commands concurrently, but no more than one COPY per server
core. I tested this a while back on a 4-core server: when I attempted 5
COPYs at a time, the time to complete went up almost 30%. I don't recall
any benefit from running fewer than 4 in this case, but the server was
only processing my data at the time. Indexes were left on the target
table; however, I dropped all constraints. The UNIX split command is
handy for breaking the data up into individual files. A rough sketch of
the approach is below.
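
Something along these lines (a minimal sketch only; the file paths,
database name, and target table "import" are made up, and it assumes
CSV input on a 4-core server):

  # break the raw data into 1M-record files on a separate spindle
  split -l 1000000 /fastdisk/bigdump.csv /fastdisk/chunk_

  # run the COPYs in parallel, but no more than one per core (-P 4)
  ls /fastdisk/chunk_* | \
    xargs -P 4 -I{} psql -d mydb -c "\copy import from '{}' with csv"
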
Greg