From: | Doug Fields <dfields-pg-general(at)pexicom(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org, pgsql-jdbc(at)postgresql(dot)org |
Subject: | Large import via temporary table: all at once, or split into chunks? |
Date: | 2002-12-10 16:38:52 |
Message-ID: | 5.1.0.14.2.20021210112440.032f3ad8@pop.pexicom.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hello all,
(We're using JDBC to do this, and server 7.2.1-2woody2 on Debian 3.0)
We regularly handle large imports of data. (Questions at the end.)
The data goes into a table with these salient features:
* About 100 million rows at this time
* About 25 columns: mostly VARCHAR, two TIMESTAMP (one DEFAULT now()), some
INTEGER, some boolean
* One index on a SERIAL column (the primary key)
* One index on two int columns (a secondary key and the primary key,
combined index)
* One index on a VARCHAR column
* WITHOUT OIDs
The way we currently import the new records is:
1) Create a temporary table - WITHOUT OIDS and without any indices
2) INSERT INTO temptable VALUES (...); for each piece of data, and usually
500 of these in one JDBC statement, for efficiency
3) INSERT INTO finaltable (secondaryKey, columns...) SELECT ###, * FROM
temptable; - where ### is the salient secondary key from above, and the
serial column is populated by the DEFAULT
4) Drop the temporary table
With 7.2 JDBC, we can't use the COPY functionality to speed it up.
My questions are these: When doing the above process for, say, a million to
ten million rows (our typical size), how expensive is step #3? It takes
four hours to complete on a 1 million row table, and I have no idea how
badly it is harming concurrency, since it's moving so much data in one
transaction. Would I be better off importing some lesser number of rows to
the temporary table in step 2 (how many?) and then doing step 3, then
truncating the temporary table and starting over with step 2 again? How do
these choices affect the sizing of the transaction log parameters?
Thanks,
Doug
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Copeland | 2002-12-10 16:56:32 | Re: 7.4 Wishlist |
Previous Message | Hegyvari Krisztian | 2002-12-10 16:12:37 | Re: ExecEvalExpr: unknown expression type 108 |