Re: [SQL] best strategy doing a large copy and using indexes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dirk Lutzebaeck <lutzeb(at)aeccom(dot)com>
Cc: pgsql-sql(at)hub(dot)org
Subject: Re: [SQL] best strategy doing a large copy and using indexes
Date: 2000-01-11 16:14:35
Message-ID: 4618.947607275@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Dirk Lutzebaeck <lutzeb(at)aeccom(dot)com> writes:
> I have a table with 7 individual indexes (it models an object store
> with different types) and need to copy large data lists into this
> table regularly. I'm using COPY for it but it is unacceptable slow
> (4min for 1000 rows on a Linux/PII-400). I'm anticipating more than
> 20000 rows to copy. What combination of the following options is the
> best strategy to make this more efficient?

> a) drop indexes before COPY and recreate them after COPY

That should certainly help a great deal (unless the 20000 rows are
a small number compared to what's already in the table --- in that
case, the cost of reindexing the existing rows might outweigh the
efficiency of indexing the new rows "wholesale" instead of "retail").

Also, if you have any triggers or anything like that, you might consider
turning them off for the duration of the copy.

> b) put COPY in a transaction

Won't do anything --- COPY is only one transaction command anyway.

> c) do VACUUM before (or after?)
> d) do VACUUM ANALYZE before (or after?)

These wouldn't directly affect the speed of COPY, afaik. You should
consider doing a VACUUM ANALYZE after adding any large amount of data
to a table, so that the optimizer has reasonably up-to-date info about
the size of the table. But that's only going to affect subsequent
queries, not the COPY itself.

> e) COPY to temp table and then make an INSERT SELECT FROM temp table

That would be more total elapsed time, but if your main concern is to
minimize the downtime of your primary table, I suppose there could be
reason to do it that way.

regards, tom lane

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Web Manager 2000-01-11 20:29:11 Problem with array syntax
Previous Message Thomas Lockhart 2000-01-11 15:57:09 Re: [HACKERS] Re: [SQL] createdb -D xxxx not working