Re: Tuning massive UPDATES and GROUP BY's?

From: runner <runner(at)winning(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Tuning massive UPDATES and GROUP BY's?
Date: 2011-03-13 16:36:26
Message-ID: 8CDAFB3EE5DFE5F-1AE0-21557@web-mmc-d02.sysops.aol.com
Lists: pgsql-performance

>Don't insert data into an indexed table. A very important point with
>bulk-loading is that you should load all the data first, then create
>the indexes. Running multiple (different) CREATE INDEX queries in
>parallel can additionally save a lot of time. Also don't move data
>back and forth between the tables, just drop the original when you're
>done.

I just saw your post and it looks similar to what I'm doing.
We're going to be loading 12G of data from a MySQL dump into our
pg 9.0.3 database next weekend. I've been testing this for the last
two weeks. I tried removing the indexes and other constraints just
for the import, but for a noob like me that was too much to manage.
Maybe when I get more experience. So I *WILL* be importing all of my
data into indexed tables. I timed it and it will take eight hours.
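
From what I've read, the drop-and-recreate approach would start by
saving the existing index definitions, something like this ('mytable'
and the index name are placeholders, and I never got this far myself):

  -- Capture the current index definitions so they can be replayed
  -- after the load ('mytable' is a placeholder).  Primary key and
  -- unique constraints would need ALTER TABLE ... DROP CONSTRAINT
  -- rather than DROP INDEX.
  SELECT indexdef || ';' AS create_stmt
  FROM   pg_indexes
  WHERE  schemaname = 'public'
    AND  tablename  = 'mytable';

  -- Drop the plain indexes, run the import, then replay the saved
  -- CREATE INDEX statements (for example from a file written with
  -- psql's \o).
  DROP INDEX mytable_col1_idx;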

I'm sure I could get it down to two or three hours for the import
if I really knew more about Postgres, but that's the price you pay
when you "slam dunk" a project and your staff isn't familiar with
the database back-end. Other than being inefficient and consuming
more time than necessary, is there any other downside to importing
into an indexed table? In the four test imports I've done,
everything seems to work fine; it just takes a long time.
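
For what it's worth, my reading of the advice quoted at the top, with
made-up table and column names, is roughly the following. The COPY/CSV
line is just an illustration, since my real source is a MySQL dump:

  -- Load into a table with no indexes, then index, then swap.
  CREATE TABLE items_new (LIKE items INCLUDING DEFAULTS);

  COPY items_new FROM '/path/to/dump.csv' WITH CSV;

  -- These CREATE INDEX statements can be run in separate sessions
  -- in parallel to save time.
  CREATE INDEX items_new_a_idx ON items_new (col_a);
  CREATE INDEX items_new_b_idx ON items_new (col_b);

  -- Drop the original instead of copying rows back into it.
  DROP TABLE items;
  ALTER TABLE items_new RENAME TO items;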

Sorry for hijacking your thread here!

