Re: JDBC slow performance on resultSet iteration?

From: Rui Pedro Leal <rui(dot)pedro(dot)leal(at)gmail(dot)com>
To: Kevin Wooten <kdubb(at)me(dot)com>
Cc: Dave Cramer <pg(at)fastcrypt(dot)com>, List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC slow performance on resultSet iteration?
Date: 2013-11-05 18:27:04
Message-ID: 18C0AA13-5B70-461B-9C2A-908EEBC86EBB@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

Hello Kevin,

That's exactly my immediate (and hopefully wrong) conclusion: the getXXX calls are really costly!

So, you are proposing using the binary protocol. I've never known the existence of that. Is this it: http://wiki.postgresql.org/wiki/JDBC-BinaryTransfer ?

Thanks.

[] Rui Pedro Leal
[] rui(dot)pedro(dot)leal(at)gmail(dot)com

On 2013/11/05, at 18:14, Kevin Wooten wrote:

> Well I can speak from experience…
>
> When I was comparing the performance of my implementation with pgjdbc I was very disheartened at the beginning as pgjdbc was seemingly much faster. It was only after I switched the test to take into account the getXXX calls that the true cost of using the driver was discovered. If you just execute a query and ignore the results pgjdbc is essentially just receiving a stream of text. It ignores all of the work of turning that text into Java objects/primitives.
>
> My implementation uses binary protocol for this reason and that removes a lot of the overhead that pgjdbc has. If pgAdmin is using the binary protocol as well this could very well be the difference that is being seen, because it is a significant amount of time.
>
> On Nov 5, 2013, at 11:03 AM, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>
>> I do not know, the best way to figure this out is to use explain analyze in pg_admin.
>>
>> What I am debating is the java code is using a cursor. pgAdmin is not.
>>
>> Try it without setFetchSize()
>>
>> Dave Cramer
>>
>> dave.cramer(at)credativ(dot)ca
>> http://www.credativ.ca
>>
>>
>> On Tue, Nov 5, 2013 at 1:00 PM, Rui Pedro Leal <rui(dot)pedro(dot)leal(at)gmail(dot)com> wrote:
>> Dave,
>>
>> I understand what you are saying, but i must get the results, right? :)
>>
>> I think in pgAdmin the total query time, presented by the query execution windows, is also considering the rendering / presenting of the query results, so this is comparable ... right?
>>
>> [] Rui Pedro Leal
>> [] rui(dot)pedro(dot)leal(at)gmail(dot)com
>>
>> On 2013/11/05, at 17:34, Dave Cramer wrote:
>>
>>> Rui Leal,
>>>
>>> You shouldn't see that much of a penatly, but I question how you are measuring it. Since you are timing the loop which is using cursors.
>>>
>>> Just run a simple query and time it without iterating through the result set. That would be equivalent
>>>
>>> Dave Cramer
>>>
>>> dave.cramer(at)credativ(dot)ca
>>> http://www.credativ.ca
>>>
>>>
>>> On Tue, Nov 5, 2013 at 12:16 PM, Rui Pedro Leal <rui(dot)pedro(dot)leal(at)gmail(dot)com> wrote:
>>> Hello Dave, thanks for the quick reply.
>>>
>>> I'm just executing the same query on pgAdmin query interface. Just a way to measure performance of the query + getting results vs the JDBC way.
>>>
>>> I'm using fetchsize in order to get some faster results when iterating the resultSet. I've tried not to use fetchSize/cursor but the performance is naturally slower (and worser than the simple pgAdmin query).
>>>
>>> What i'm asking is .. is this performance penalty i'm getting using postgres JDBC normal and within what's expected? I'm i doing something wrong? Can i do it in any other way?
>>>
>>> Regards,
>>>
>>> Rui Leal
>>>
>>> [] Rui Pedro Leal
>>> [] rui(dot)pedro(dot)leal(at)gmail(dot)com
>>>
>>>
>>> On 2013/11/05, at 16:43, Dave Cramer wrote:
>>>
>>>> How can you measure the same query in pgAdmin ? JDBC will use a cursor when you set the fetchsize.
>>>>
>>>> Dave Cramer
>>>>
>>>> dave.cramer(at)credativ(dot)ca
>>>> http://www.credativ.ca
>>>>
>>>>
>>>> On Tue, Nov 5, 2013 at 11:12 AM, Rui Pedro Leal <rui(dot)pedro(dot)leal(at)gmail(dot)com> wrote:
>>>> Just an quick info update: i'm using the postgresql-9.2-1003-jdbc4 driver
>>>>
>>>> On 2013/11/05, at 16:09, Rui Pedro Leal wrote:
>>>>
>>>> > Hello everyone,
>>>> >
>>>> > I'm having slow JDBC performance on iterating a resultSet obtained from a somewhat simple query.
>>>> >
>>>> > The query, although using PostGIS functions, is pretty straight forward:
>>>> >
>>>> > SELECT _id, _spatial_type, ST_AsBinary(_geometry), _attribute, _count, _references, countpersons, countfatals, countdrunks, density
>>>> > FROM accidents_5
>>>> > WHERE (_geometry && ST_MakeEnvelope(-126.60644531250001,26.43122806450644,-63.369140625,52.96187505907603, 4326) )
>>>> > ORDER BY _pk_id
>>>> >
>>>> > The results are the following:
>>>> > - Executing the query directly from pgAdmin: ~2807ms
>>>> > - Executing from JVM + JDBC: 4184ms
>>>> >
>>>> > The code i'm executing is pretty much standard:
>>>> >
>>>> > -------------------- code --------------------
>>>> >
>>>> > public static void main(final String[] args) throws Exception {
>>>> >
>>>> > String sql = "SELECT _id, _spatial_type, ST_AsBinary(_geometry), _attribute, _count, _references, countpersons, countfatals, countdrunks, density";
>>>> > sql += " FROM accidents_5";
>>>> > sql += " WHERE (_geometry && ST_MakeEnvelope(-126.60644531250001,26.43122806450644,-63.369140625,52.96187505907603, 4326) )";
>>>> > sql += " ORDER BY _pk_id";
>>>> >
>>>> > System.out.println(sql);
>>>> >
>>>> > Long time = System.currentTimeMillis();
>>>> > try {
>>>> > Connection connection = DataStoreInfo.getDataStores().get(0);
>>>> > connection.setAutoCommit(false);
>>>> >
>>>> > System.out.println("[QUERY " + Thread.currentThread().getId()
>>>> > + "] - connection in "
>>>> > + (System.currentTimeMillis() - time) + "ms.");
>>>> >
>>>> > Statement st = connection.createStatement(
>>>> > ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY,
>>>> > ResultSet.CLOSE_CURSORS_AT_COMMIT);
>>>> > st.setFetchSize(250);
>>>> >
>>>> > System.out.println("[QUERY " + Thread.currentThread().getId()
>>>> > + "] - statement in " + (System.currentTimeMillis() - time)
>>>> > + "ms.");
>>>> >
>>>> > ResultSet resultSet = st.executeQuery(sql);
>>>> >
>>>> > System.out.println("[QUERY " + Thread.currentThread().getId()
>>>> > + "] - done in " + (System.currentTimeMillis() - time)
>>>> > + "ms.");
>>>> >
>>>> > GeometryFactory geofact = new GeometryFactory(new PrecisionModel(),4326);
>>>> > WKBReader wkbReader = new WKBReader(geofact);
>>>> >
>>>> > String id;
>>>> > String spatialType;
>>>> > Geometry geometry;
>>>> > String attribute;
>>>> > Long count;
>>>> > String reference;
>>>> > Map<String, Double> properties;
>>>> > String granularSynthString = "GranularSynthesis";
>>>> >
>>>> > while (resultSet.next()) {
>>>> > id = resultSet.getString(1);
>>>> > spatialType = resultSet.getString(2);
>>>> >
>>>> > // geometry = wkbReader.read(resultSet.getBytes(3)); // ignored the WKBReader and the results are about the same.
>>>> > attribute = resultSet.getString(4);
>>>> > count = resultSet.getLong(5);
>>>> > reference = resultSet.getString(6);
>>>> >
>>>> > properties = new HashMap<String, Double>();
>>>> > Double aux = resultSet.getDouble(7);
>>>> > properties.put("countpersons", aux);
>>>> > aux = resultSet.getDouble(8);
>>>> > properties.put("countfatals", aux);
>>>> > aux = resultSet.getDouble(9);
>>>> > properties.put("countdrunks", aux);
>>>> > aux = resultSet.getDouble(10);
>>>> > properties.put("density", aux);
>>>> > }
>>>> >
>>>> > System.out.println("[QUERY " + Thread.currentThread().getId()
>>>> > + "] - done & iterated in "
>>>> > + (System.currentTimeMillis() - time) + "ms.");
>>>> >
>>>> > resultSet.close();
>>>> > st.close();
>>>> > connection.commit();
>>>> > connection.close();
>>>> > } catch (SQLException exception) {
>>>> > exception.printStackTrace();
>>>> > // } catch (ParseException exception) { // ignored from WKBreader
>>>> > // exception.printStackTrace();
>>>> > }
>>>> >
>>>> > System.out.println("[End " + Thread.currentThread().getId()
>>>> > + "] - done in " + (System.currentTimeMillis() - time) + "ms.");
>>>> > }
>>>> >
>>>> > -------------------- end code --------------------
>>>> >
>>>> > Although i'm executing this on slow MacbookPro (2.6 core duo, 2Gbs RAM but SSD) and have a 9.1.2 postgres, i've also tested this on a recent retina MBP and the ratio between pgAdmin and JDBC execution is similar.
>>>> >
>>>> > Is this expected? Can someone point if i'm doing something terrible wrong?
>>>> >
>>>> > I'm not concerned about the query performance per-se (i know it CAN be optimized), but the differences just using JDBC and iterating the resultSet are really annoying.
>>>> >
>>>> > Thanks in advance for any help.
>>>> >
>>>> > Kind regards,
>>>> >
>>>> > Rui Leal
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Sent via pgsql-jdbc mailing list (pgsql-jdbc(at)postgresql(dot)org)
>>>> To make changes to your subscription:
>>>> http://www.postgresql.org/mailpref/pgsql-jdbc
>>>>
>>>
>>>
>>
>>
>

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Rui Pedro Leal 2013-11-05 18:34:17 Re: JDBC slow performance on resultSet iteration?
Previous Message Dave Cramer 2013-11-05 18:25:53 Re: JDBC slow performance on resultSet iteration?