From: | Christopher Deckers <chrriis(at)gmail(dot)com> |
---|---|
To: | pgsql-jdbc(at)postgresql(dot)org |
Subject: | Batches of single-insert statements vs batches of multi-insert statements |
Date: | 2016-06-05 18:17:54 |
Message-ID: | CADFnS4TP2tqF5qXDpvAJykdREoZtAQ9nJZNrN3Xh4MQxTfAv7g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-jdbc |
Hi all,
I tried all sorts of query rewrite to boost a huge sequence of insert
statements and here are my findings. All is run on a local Postgres 9.5.3,
on a 64-bit Windows box with 32GB RAM, i7(at)3(dot)6GHz(dot)
1. Batches of single insert statements (original case):
Batch of: INSERT INTO SomeTable (Col1, Col2) VALUES (?, ?)
-> 52 seconds
2. Batches of single insert statements to an unlogged table, then copy:
CREATE UNLOGGED TABLE SomeUnloggedTable AS SELECT * FROM SomeTable WHERE 1
= 0
Batch of: INSERT INTO SomeUnloggedTable (Col1, Col2) VALUES (?, ?)
INSERT INTO SomeTable SELECT * FROM SomeUnloggedTable
DROP TABLE SomeUnloggedTable
-> 43 seconds (15 seconds in the copy from unlogged to target table)
3. Batches of 100 multi-insert (followed by batches of single inserts for
remainder):
Batch of: INSERT INTO SomeTable (Col1, Col2) VALUES (?, ?), (?, ?), (?,
?)...
-> 30 seconds
So, rewriting the batch of single insert statements to batches of 100
multi-insert statements almost doubled the speed.
Could someone educate me as to what happens internally to the JDBC
single-insert batch that makes it so different from a multi-insert batch?
Rewriting our code to use multi-value inserts is more cumbersome and not
applicable everywhere. It also makes the code database-specific as all the
engines we support do not all handle this syntax.
These considerations actually lead me to this old thread:
https://www.postgresql.org/message-id/55130DC8.2070508%40redhat.com
Anyway, these tests definitely show that the performance of the standard
JDBC batch API could potentially be improved in Postgres internally with a
bit of magic :)
Isn't it possible (either in the driver or in the engine) to aggregate the
single-insert statements of a batch to turn them into a multi-insert
equivalent? It would not have to handle all cases: simply handling basic
commands like "INSERT INTO X (a, b, c, ...) VALUES (?, ?, ?, ...)" would
help a lot.
If this should not be done in the JDBC driver, then in which area could
such handling be done?
Or have I missed something?
If needed, I am ready to offer my assistance with testing (I can tweak
parameters, install dev drivers, modify my code to run alternate
statements, experiment, etc.).
Cheers,
-Christopher
From | Date | Subject | |
---|---|---|---|
Next Message | Vladimir Sitnikov | 2016-06-05 18:33:46 | Re: Batches of single-insert statements vs batches of multi-insert statements |
Previous Message | Vladimir Sitnikov | 2016-06-01 11:23:29 | Re: Re: [BUGS] BUG #14166: JDBC driver won't parse error message from DB server in German |