speeding up inserts by pipelining

From: Todd Owen <daxiang(dot)singer(at)neverbox(dot)com>
To: pgsql-odbc(at)postgresql(dot)org
Subject: speeding up inserts by pipelining
Date: 2013-03-04 10:01:10
Message-ID: CACAL9v0uV=nV4WsmntsvBMxxm+3Avpf-taA0TZUxchNgf1V3KQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-odbc

I'm trying to populate a large temporary table (a few million rows), and
running into performance woes. In my initial tests, inserting 100,000 rows
took 2.5 minutes, which I managed to get down to 1 minute by (a) using an
insert buffer, i.e. using SQL_ATTR_PARAMSET_SIZE with SQLExecute, and (b)
turning on the "Server Side Prepare" option. However, this is still 10
times slower than writing the data to a file and then running "\copy" in
psql.

I know this complaint is not new :) Earlier posts on this mailing list have
suggested concatenating values into a single string which can be
transformed back into multiple rows using a custom server-side function.
I've also seen some talk about SQLBulkOperations. But I'm unable to
implement either of these because the ODBC client is not our code, rather
it's an off-the-shelf data integration package running on Windows. The size
of the insert buffer is one of the few configurable options.

At the moment, increasing the insert buffer has only limited effect because
(as far as I can tell) each row still requires a round trip to the server.
Network latency then becomes the major bottleneck, and indeed I've run some
tests that show insert performance being much better when the client and
server are both running on the same machine.

There was a recent post on pgsql-general where somebody asked about
pipelining inserts, i.e. sending multiple commands to the server without
waiting for it to acknowledge the completion of each one. Tom Lane said (
http://www.postgresql.org/message-id/26269.1356282651@sss.pgh.pa.us) that
the wire protocol does support this, although libpq is "too simple-minded
to deal with more than one query in flight". Of course libpq support should
not matter to psqlODBC, which I understand implements most of the wire
protocol natively.

Is there any chance of psqlODBC one day supporting query pipelining inside
the scope of calls to SQLExecute? This doesn't seem like too much effort to
implement, although the tricky part is probably the semantics in the case
where some rows succeed and some fail (constaint violations, etc). I'm
tempted to give it a try myself, if I manage to get the build working under
VS2010...

Regards,

Todd

Responses

Browse pgsql-odbc by date

  From Date Subject
Next Message Heikki Linnakangas 2013-03-04 11:13:58 Re: speeding up inserts by pipelining
Previous Message Heikki Linnakangas 2013-02-22 09:27:40 Re: Source code cleanup