Re: JDBC slow performance on resultSet iteration?

From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Kevin Wooten <kdubb(at)me(dot)com>
Cc: List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC slow performance on resultSet iteration?
Date: 2013-11-05 18:25:53
Message-ID: CADK3HH+j_XSrfnjnwa-FiHT8MstcCnndoM9OYbCneLjwOJxQFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

essentially everything is a prepared statement, and AFAIK, it is using
binary protocol

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On Tue, Nov 5, 2013 at 1:23 PM, Kevin Wooten <kdubb(at)me(dot)com> wrote:

> Am I remembering incorrectly? I thought the driver only used binary for
> a few types and only if it was a prepared statement?
>
> On Nov 5, 2013, at 11:18 AM, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>
> I'd still like to know just how bad the getXXX calls are. Also the driver
> is using binary protocol, so it shouldn't be that bad
>
> Dave Cramer
>
> dave.cramer(at)credativ(dot)ca
> http://www.credativ.ca
>
>
> On Tue, Nov 5, 2013 at 1:16 PM, Kevin Wooten <kdubb(at)me(dot)com> wrote:
>
>> Dave,
>>
>> I don’t believe he is not asking about the query time. He is
>> specifically asking about the cost of using JDBC vs whatever pgAdmin is
>> using (which seems to be libpq).
>>
>>
>> On Nov 5, 2013, at 11:08 AM, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>>
>> Rui,
>>
>> The best way to settle this is to turn on log_duration in the server and
>> get the server timings. That will give you the query time without rendering
>>
>> Dave Cramer
>>
>> dave.cramer(at)credativ(dot)ca
>> http://www.credativ.ca
>>
>>
>> On Tue, Nov 5, 2013 at 1:05 PM, Rui Pedro Leal <rui(dot)pedro(dot)leal(at)gmail(dot)com>wrote:
>>
>>> Hello Edson.
>>>
>>> I'm running everything locally on a laptop. I simply connect to PG via
>>> JDBC localhost. Just a simple development setup.
>>>
>>> [] Rui Pedro Leal
>>> [] rui(dot)pedro(dot)leal(at)gmail(dot)com
>>>
>>> On 2013/11/05, at 17:47, Edson Richter wrote:
>>>
>>> Are you using a network between your station and the server?
>>> You're connection using IP address or server name?
>>> If the client machine has to access server several times (like in a
>>> cursor scenario), would be necessary to solve machine names, which could
>>> slow down the whole process.
>>>
>>> Regards,
>>>
>>> Edson
>>>
>>>
>>> Em 05/11/2013 15:42, Dave Cramer escreveu:
>>>
>>> Kevin,
>>>
>>> Ya, however I am still questioning the basic time measurement.
>>>
>>> Dave Cramer
>>>
>>> dave.cramer(at)credativ(dot)ca
>>> http://www.credativ.ca
>>>
>>>
>>> On Tue, Nov 5, 2013 at 12:39 PM, Kevin Wooten <kdubb(at)me(dot)com> wrote:
>>>
>>>> Dave,
>>>> That is not necessarily true. Most of the drivers data conversion is
>>>> done in the getXXX calls of the ResultSet. All of which would count as
>>>> “penalty”. I’m not saying it’s a large penalty, just that this is time
>>>> that must be considered since pgAdmin is converting and displaying the data
>>>> as well.
>>>>
>>>> On Nov 5, 2013, at 10:34 AM, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>>>>
>>>> Rui Leal,
>>>>
>>>> You shouldn't see that much of a penatly, but I question how you are
>>>> measuring it. Since you are timing the loop which is using cursors.
>>>>
>>>> Just run a simple query and time it without iterating through the
>>>> result set. That would be equivalent
>>>>
>>>> Dave Cramer
>>>>
>>>> dave.cramer(at)credativ(dot)ca
>>>> http://www.credativ.ca
>>>>
>>>>
>>>> On Tue, Nov 5, 2013 at 12:16 PM, Rui Pedro Leal <
>>>> rui(dot)pedro(dot)leal(at)gmail(dot)com> wrote:
>>>>
>>>>> Hello Dave, thanks for the quick reply.
>>>>>
>>>>> I'm just executing the same query on pgAdmin query interface. Just a
>>>>> way to measure performance of the query + getting results vs the JDBC way.
>>>>>
>>>>> I'm using fetchsize in order to get some faster results when
>>>>> iterating the resultSet. I've tried not to use fetchSize/cursor but the
>>>>> performance is naturally slower (and worser than the simple pgAdmin query).
>>>>>
>>>>> What i'm asking is .. is this performance penalty i'm getting using
>>>>> postgres JDBC normal and within what's expected? I'm i doing something
>>>>> wrong? Can i do it in any other way?
>>>>>
>>>>> Regards,
>>>>>
>>>>> Rui Leal
>>>>>
>>>>> [] Rui Pedro Leal
>>>>> [] rui(dot)pedro(dot)leal(at)gmail(dot)com
>>>>>
>>>>>
>>>>> On 2013/11/05, at 16:43, Dave Cramer wrote:
>>>>>
>>>>> How can you measure the same query in pgAdmin ? JDBC will use a
>>>>> cursor when you set the fetchsize.
>>>>>
>>>>> Dave Cramer
>>>>>
>>>>> dave.cramer(at)credativ(dot)ca
>>>>> http://www.credativ.ca
>>>>>
>>>>>
>>>>> On Tue, Nov 5, 2013 at 11:12 AM, Rui Pedro Leal <
>>>>> rui(dot)pedro(dot)leal(at)gmail(dot)com> wrote:
>>>>>
>>>>>> Just an quick info update: i'm using the postgresql-9.2-1003-jdbc4
>>>>>> driver
>>>>>>
>>>>>> On 2013/11/05, at 16:09, Rui Pedro Leal wrote:
>>>>>>
>>>>>> > Hello everyone,
>>>>>> >
>>>>>> > I'm having slow JDBC performance on iterating a resultSet obtained
>>>>>> from a somewhat simple query.
>>>>>> >
>>>>>> > The query, although using PostGIS functions, is pretty straight
>>>>>> forward:
>>>>>> >
>>>>>> > SELECT _id, _spatial_type, ST_AsBinary(_geometry), _attribute,
>>>>>> _count, _references, countpersons, countfatals, countdrunks, density
>>>>>> > FROM accidents_5
>>>>>> > WHERE (_geometry &&
>>>>>> ST_MakeEnvelope(-126.60644531250001,26.43122806450644,-63.369140625,52.96187505907603,
>>>>>> 4326) )
>>>>>> > ORDER BY _pk_id
>>>>>> >
>>>>>> > The results are the following:
>>>>>> > - Executing the query directly from pgAdmin: ~2807ms
>>>>>> > - Executing from JVM + JDBC: 4184ms
>>>>>> >
>>>>>> > The code i'm executing is pretty much standard:
>>>>>> >
>>>>>> > -------------------- code --------------------
>>>>>> >
>>>>>> > public static void main(final String[] args) throws Exception
>>>>>> {
>>>>>> >
>>>>>> > String sql = "SELECT _id, _spatial_type,
>>>>>> ST_AsBinary(_geometry), _attribute, _count, _references, countpersons,
>>>>>> countfatals, countdrunks, density";
>>>>>> > sql += " FROM accidents_5";
>>>>>> > sql += " WHERE (_geometry &&
>>>>>> ST_MakeEnvelope(-126.60644531250001,26.43122806450644,-63.369140625,52.96187505907603,
>>>>>> 4326) )";
>>>>>> > sql += " ORDER BY _pk_id";
>>>>>> >
>>>>>> > System.out.println(sql);
>>>>>> >
>>>>>> > Long time = System.currentTimeMillis();
>>>>>> > try {
>>>>>> > Connection connection =
>>>>>> DataStoreInfo.getDataStores().get(0);
>>>>>> > connection.setAutoCommit(false);
>>>>>> >
>>>>>> > System.out.println("[QUERY " +
>>>>>> Thread.currentThread().getId()
>>>>>> > + "] - connection in "
>>>>>> > + (System.currentTimeMillis()
>>>>>> - time) + "ms.");
>>>>>> >
>>>>>> > Statement st = connection.createStatement(
>>>>>> > ResultSet.TYPE_FORWARD_ONLY,
>>>>>> ResultSet.CONCUR_READ_ONLY,
>>>>>> >
>>>>>> ResultSet.CLOSE_CURSORS_AT_COMMIT);
>>>>>> > st.setFetchSize(250);
>>>>>> >
>>>>>> > System.out.println("[QUERY " +
>>>>>> Thread.currentThread().getId()
>>>>>> > + "] - statement in " +
>>>>>> (System.currentTimeMillis() - time)
>>>>>> > + "ms.");
>>>>>> >
>>>>>> > ResultSet resultSet = st.executeQuery(sql);
>>>>>> >
>>>>>> > System.out.println("[QUERY " +
>>>>>> Thread.currentThread().getId()
>>>>>> > + "] - done in " +
>>>>>> (System.currentTimeMillis() - time)
>>>>>> > + "ms.");
>>>>>> >
>>>>>> > GeometryFactory geofact = new
>>>>>> GeometryFactory(new PrecisionModel(),4326);
>>>>>> > WKBReader wkbReader = new WKBReader(geofact);
>>>>>> >
>>>>>> > String id;
>>>>>> > String spatialType;
>>>>>> > Geometry geometry;
>>>>>> > String attribute;
>>>>>> > Long count;
>>>>>> > String reference;
>>>>>> > Map<String, Double> properties;
>>>>>> > String granularSynthString =
>>>>>> "GranularSynthesis";
>>>>>> >
>>>>>> > while (resultSet.next()) {
>>>>>> > id = resultSet.getString(1);
>>>>>> > spatialType = resultSet.getString(2);
>>>>>> >
>>>>>> > // geometry =
>>>>>> wkbReader.read(resultSet.getBytes(3)); // ignored the WKBReader and the
>>>>>> results are about the same.
>>>>>> > attribute = resultSet.getString(4);
>>>>>> > count = resultSet.getLong(5);
>>>>>> > reference = resultSet.getString(6);
>>>>>> >
>>>>>> > properties = new HashMap<String,
>>>>>> Double>();
>>>>>> > Double aux = resultSet.getDouble(7);
>>>>>> > properties.put("countpersons", aux);
>>>>>> > aux = resultSet.getDouble(8);
>>>>>> > properties.put("countfatals", aux);
>>>>>> > aux = resultSet.getDouble(9);
>>>>>> > properties.put("countdrunks", aux);
>>>>>> > aux = resultSet.getDouble(10);
>>>>>> > properties.put("density", aux);
>>>>>> > }
>>>>>> >
>>>>>> > System.out.println("[QUERY " +
>>>>>> Thread.currentThread().getId()
>>>>>> > + "] - done & iterated in "
>>>>>> > + (System.currentTimeMillis()
>>>>>> - time) + "ms.");
>>>>>> >
>>>>>> > resultSet.close();
>>>>>> > st.close();
>>>>>> > connection.commit();
>>>>>> > connection.close();
>>>>>> > } catch (SQLException exception) {
>>>>>> > exception.printStackTrace();
>>>>>> > // } catch (ParseException exception) { // ignored
>>>>>> from WKBreader
>>>>>> > // exception.printStackTrace();
>>>>>> > }
>>>>>> >
>>>>>> > System.out.println("[End " +
>>>>>> Thread.currentThread().getId()
>>>>>> > + "] - done in " +
>>>>>> (System.currentTimeMillis() - time) + "ms.");
>>>>>> > }
>>>>>> >
>>>>>> > -------------------- end code --------------------
>>>>>> >
>>>>>> > Although i'm executing this on slow MacbookPro (2.6 core duo, 2Gbs
>>>>>> RAM but SSD) and have a 9.1.2 postgres, i've also tested this on a recent
>>>>>> retina MBP and the ratio between pgAdmin and JDBC execution is similar.
>>>>>> >
>>>>>> > Is this expected? Can someone point if i'm doing something terrible
>>>>>> wrong?
>>>>>> >
>>>>>> > I'm not concerned about the query performance per-se (i know it CAN
>>>>>> be optimized), but the differences just using JDBC and iterating the
>>>>>> resultSet are really annoying.
>>>>>> >
>>>>>> > Thanks in advance for any help.
>>>>>> >
>>>>>> > Kind regards,
>>>>>> >
>>>>>> > Rui Leal
>>>>>> >
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sent via pgsql-jdbc mailing list (pgsql-jdbc(at)postgresql(dot)org)
>>>>>> To make changes to your subscription:
>>>>>> http://www.postgresql.org/mailpref/pgsql-jdbc
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
>

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Rui Pedro Leal 2013-11-05 18:27:04 Re: JDBC slow performance on resultSet iteration?
Previous Message Kevin Wooten 2013-11-05 18:23:45 Re: JDBC slow performance on resultSet iteration?