| From: | Thomas Kellerer <spam_eater(at)gmx(dot)net> |
|---|---|
| To: | pgsql-general(at)postgresql(dot)org |
| Subject: | Re: COPY v. java performance comparison |
| Date: | 2014-04-03 15:01:36 |
| Message-ID: | lhjt4g$rr6$1@ger.gmane.org |
| Lists: | pgsql-general |
Rob Sargent, 02.04.2014 21:37:
> I loaded 37M+ records using jOOQ (batching every 1000 lines) in 12+
> hours (800+ records/sec). Then I tried COPY and killed that after
> 11.25 hours when I realised that I had added one non-unique index on
> the name fields after the first load. By that point it was on line
> 28301887, so ~0.75 done, which implies it would have taken ~15 hours to
> complete.
>
> Would the overhead of the index likely explain this decrease in
> throughput?
>
> Impatience got the better of me and I killed the second COPY. This
> time it had done 54% of the file in 6.75 hours, extrapolating to
> roughly 12 hours to do the whole thing.
>
> That matches up with the java speed. Not sure if I should be elated
> with jOOQ or disappointed with COPY.
>
This is not what I see with COPY FROM STDIN.
When I load 2 million rows with plain JDBC using a batch size of 1000, that takes about 4 minutes.
Loading the same file through Java and COPY FROM STDIN takes about 4 seconds.
The table looks like this:
          Table "public.products"
      Column       |          Type          | Modifiers
-------------------+------------------------+-----------
 product_id        | integer                | not null
 ean_code          | bigint                 | not null
 product_name      | character varying(100) | not null
 manufacturer_name | character varying      | not null
 price             | numeric(10,2)          | not null
 publish_date      | date                   | not null
Indexes:
    "products_pkey" PRIMARY KEY, btree (product_id)
    "idx_publish_date" btree (publish_date, product_id)
During the load both indexes are present.
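For reference, here is a rough sketch of the two approaches compared above, written against the products table shown. It is not the code used for the timings. The class name ProductLoadSketch, its method names, and the String[] row layout are made up for illustration. The first method is plain JDBC batching. The remaining helpers build the tab-separated text that COPY ... FROM STDIN expects by default. Handing that text to the server needs the PostgreSQL JDBC driver, so that step appears only as a comment, and it assumes the driver's CopyManager.copyIn(String, Reader) is available.

```java
import java.io.Reader;
import java.io.StringReader;
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.Date;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class ProductLoadSketch {

    static final String INSERT_SQL =
        "INSERT INTO products (product_id, ean_code, product_name, "
        + "manufacturer_name, price, publish_date) VALUES (?, ?, ?, ?, ?, ?)";

    static final String COPY_SQL =
        "COPY products (product_id, ean_code, product_name, "
        + "manufacturer_name, price, publish_date) FROM STDIN";

    // Approach 1: plain JDBC, flushing a batch every batchSize rows.
    // Each row is assumed to hold the six column values as strings.
    // Transaction handling (autocommit, commit) is left to the caller.
    static void batchInsert(Connection conn, List<String[]> rows, int batchSize)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (String[] r : rows) {
                ps.setInt(1, Integer.parseInt(r[0]));
                ps.setLong(2, Long.parseLong(r[1]));
                ps.setString(3, r[2]);
                ps.setString(4, r[3]);
                ps.setBigDecimal(5, new BigDecimal(r[4]));
                ps.setDate(6, Date.valueOf(r[5])); // expects yyyy-MM-dd
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // flush the final partial batch
            }
        }
    }

    // Approach 2, step 1: escape one field for COPY's default text format.
    // Backslash, tab, newline and carriage return must be backslash-escaped;
    // a null value is written as \N.
    static String escape(String value) {
        if (value == null) {
            return "\\N";
        }
        StringBuilder sb = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\t': sb.append("\\t"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Approach 2, step 2: one row becomes one tab-separated line.
    static String toCopyLine(String... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append('\t');
            }
            sb.append(escape(fields[i]));
        }
        return sb.append('\n').toString();
    }

    // Approach 2, step 3: wrap all rows in a Reader. With the PostgreSQL
    // JDBC driver on the classpath, the Reader could then be passed on as:
    //   conn.unwrap(org.postgresql.PGConnection.class)
    //       .getCopyAPI().copyIn(COPY_SQL, reader);
    static Reader toCopyStream(List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        for (String[] r : rows) {
            sb.append(toCopyLine(r));
        }
        return new StringReader(sb.toString());
    }

    public static void main(String[] args) {
        System.out.print(toCopyLine("1", "4006381333931", "Ball pen",
                "ACME", "1.99", "2014-04-03"));
    }
}
```

If the input file is already in COPY text or CSV format, it can be opened as a Reader and passed to copyIn directly, without buffering anything in memory as toCopyStream does.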
Regards
Thomas